Problems with Traditional Object Oriented Ideas - Jonathan Pryor's web log

« Things I Didn't Know - What Is Obesity? | Main | Comparing Java and C# Generics »

Problems with Traditional Object Oriented Ideas

I've been training with Michael Meeks, and he gave Hubert and I an overview of the history of OpenOffice.org.

One of the more notable comments was the binfilter module, which is a stripped-down copy of StarOffice 5.2 (so if you build it you wind up with an ancient version of StarOffice embedded within your current OpenOffice.org build).

Why is a embedded StarOffice required? Because of mis-informed "traditional" Object Oriented practice. :-)

Frequently in program design, you'll need to save state to disk and read it back again. Sometimes this needs to be done manually, and sometimes you have a framework to help you (such as .NET Serialization). Normally, you design the individual classes to read/write themselves to external storage. This has lots of nice benefits, such as better encapsulation (the class doesn't need to expose it's internals), the serialization logic is in the class itself "where it belongs," etc. It's all good.

Except it isn't. By tying the serialization logic to your internal data structures, you severely reduce your ability to change your internal data structures for optimization, maintenance, etc.

Which is why OpenOffice.org needs to embed StarOffice 5.2: the StarOffice 5.2 format serialized internal data structures, but as time went on they wanted to change the internal structure for a variety of reasons, The result: they couldn't easily read or write their older storage format without having a copy of the version of StarOffice that generated that format.

The take away from this is that if you expect your software to change in any significant way (and why shouldn't you?), then you should aim to keep your internal data structures as far away from your serialization format as possible. This may complicate things, or it may require "duplicating" code (e.g. your real data structure, and then a [Serializable] version of the "same" class -- with the data members but not the non-serialization logic -- to be used when actually saving your state), but failure to do so may complicate future maintenance.

(Which is why Advanced .NET Remoting suggests thinking about serialization formats before you publish your first version...)

Posted on 28 Aug 2007 | Path: /development/openoffice.org/ | Permalink