
Real world .NET serialization

Do you know what Jeff Richter didn't tell you about .NET serialization? Read on and learn what I've learned the hard way applying serialization to a real world .NET project.
This article can be regarded as a "free continuation" of the article "My first .NET project", because the issues explained here are drawn from that same project. If you haven't read that article, I'd recommend reading at least its first few paragraphs to get familiar with the project's requirements and architecture...

In a hurry? OK, here are the relevant paragraphs:

About the time .NET Framework 1.0 was released, I managed to win a project, a large part of which consisted of parsing and processing large amounts of text. The text was in the form of huge, semi-structured files (representing a kind of legislative document). The project's job was mainly to:

  • Parse the files and display the parsed tree-like structure to the user.
  • Let the user edit the hierarchical structure and the contents of the individual nodes.
  • Save the (potentially modified) hierarchical structure to an SQL database.

I designed the application to consist of three major parts:
  • The parsing engine, which parses an input file and produces an in-memory tree of Block objects (I used the System.Text.RegularExpressions classes for this). The Block tree was wrapped in a BlockSet object (more on that later).
  • The user interface controls and windows for displaying and modifying the Block tree and the individual Block objects.
  • The DB class responsible for saving BlockSets to and restoring them from the database.

Now let's go back to the project.

Shortly after I released my first BETA, the customer requested a new feature that would allow the editors to work offline, i.e. disconnected from the central database. Suddenly I had to implement file-based persistence.

I researched some of the design options and quickly discovered the System.Runtime.Serialization namespace. I flipped through the documentation and read Jeff Richter's excellent series of articles about serialization (#1, #2 and #3).

I was a bit lazy, so I took the easier path of "automatic" serialization. I rushed through the source code, adding the Serializable attribute to the Block class, the BlockSet class and some other helper classes in the hierarchy. After that, I added the code to actually serialize and deserialize the Block graph, which was rather easy (look at this sample from the official documentation).
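
For readers who have not seen the "automatic" approach, here is a minimal sketch of what it amounts to. It is not the project's actual code: SimpleBlock and the AutomaticSerializationSketch module are hypothetical names, and I am assuming the BinaryFormatter class of the classic .NET Framework (it is obsolete in current .NET versions).

```vbnet
Imports System
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Binary

' Hypothetical, simplified stand-in for the article's Block class.
' The attribute alone makes every field part of the serialized data.
<Serializable()> _
Public Class SimpleBlock
    Public Caption As String
    Public DateFrom As DateTime
    Public DateTo As DateTime
End Class

' Hypothetical helper module, assuming BinaryFormatter is available.
Public Module AutomaticSerializationSketch
    ' Serializes the whole object graph rooted at block into the stream.
    Public Sub Save(ByVal block As SimpleBlock, ByVal target As Stream)
        Dim formatter As New BinaryFormatter()
        formatter.Serialize(target, block)
    End Sub

    ' Reads the graph back; the cast fails if the stream holds another type.
    Public Function Load(ByVal source As Stream) As SimpleBlock
        Dim formatter As New BinaryFormatter()
        Return DirectCast(formatter.Deserialize(source), SimpleBlock)
    End Function
End Module
```

To persist to disk, pass a FileStream opened with FileMode.Create to Save and one opened with FileMode.Open to Load.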

I launched the application, created a simple Block tree and then invoked the command to save the tree to disk. Guess what? I got a SerializationException saying that "Serialization will not deserialize delegates to non-public methods."

Of course, the BlockSet class had several private event handling methods and they were hooked to the underlying Block events. But why they couldn't be deserialized is beyond me. After all, they were already serialized, right?

I can only speculate that this has something to do with security, and I'd really appreciate it if someone could explain the reasoning behind this "non-public delegates" exception.

I even searched through the Mono and Rotor projects' source code and found that the deserialization code just checks whether the method the delegate points to is public and, if not, throws an exception. Without any comments.

(Rotor: See the DelegateSerializationHolder.GetDelegate(DelegateEntry) method in the delegateserializationholder.cs file. Mono: See the DeserializeDelegate(SerializationInfo) method.)

At this point I stopped struggling with the "automatic" serialization approach and opted to implement ISerializable instead. (I could have just made the private event handlers public, but I didn't want to. You know, I'm an OO purist :-). I implemented the ISerializable.GetObjectData method so that it serializes just the data fields, and I added the required deserialization constructor to read them back.
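
The shape of that change can be sketched roughly as follows. SketchBlock and its CaptionChanged event are hypothetical stand-ins for the real Block class; only the slot names are taken from the project code. The point is that the formatter sees just what GetObjectData writes, so the delegates behind the event never reach the stream.

```vbnet
Imports System
Imports System.Runtime.Serialization

' Hypothetical stand-in for the article's Block class.
<Serializable()> _
Public Class SketchBlock
    Implements ISerializable

    ' Subscribers to this event are NOT serialized, because
    ' GetObjectData below never writes the underlying delegate.
    Public Event CaptionChanged As EventHandler

    Private _caption As String
    Private _dateFrom As DateTime
    Private _dateTo As DateTime

    Public Sub New( _
     ByVal caption As String, _
     ByVal dateFrom As DateTime, _
     ByVal dateTo As DateTime)
        _caption = caption
        _dateFrom = dateFrom
        _dateTo = dateTo
    End Sub

    ' Deserialization constructor: the formatter calls it via reflection.
    Protected Sub New( _
     ByVal info As SerializationInfo, _
     ByVal context As StreamingContext)
        _dateFrom = info.GetDateTime("_dateFrom")
        _dateTo = info.GetDateTime("_dateTo")
        _caption = info.GetString("_caption")
    End Sub

    ' Writes only the data fields, one named slot per field.
    Public Sub GetObjectData( _
     ByVal info As SerializationInfo, _
     ByVal context As StreamingContext) _
     Implements ISerializable.GetObjectData
        info.AddValue("_dateFrom", _dateFrom)
        info.AddValue("_dateTo", _dateTo)
        info.AddValue("_caption", _caption)
    End Sub

    Public ReadOnly Property Caption() As String
        Get
            Return _caption
        End Get
    End Property
End Class
```

The Serializable attribute is still required; implementing ISerializable only changes which data the formatter reads and writes.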

Everything worked great!

The application eventually went to production and the users were happily saving and loading their Block trees to and from disk files. After a few weeks, the customer asked me to add a Notes property to the Block class.

There you have it - VERSIONING!

Here is how I did it with the Block class.

This code shows the deserialization constructor before the new Notes property was added:

Protected Sub New( _
 ByVal info As SerializationInfo, _
 ByVal context As StreamingContext)
	...
	_dateFrom = info.GetDateTime("_dateFrom")
	_dateTo = info.GetDateTime("_dateTo")
	_caption = info.GetString("_caption")
	...
End Sub

This is the new code:

Protected Sub New( _
 ByVal info As SerializationInfo, _
 ByVal context As StreamingContext)
	...
	_dateFrom = info.GetDateTime("_dateFrom")
	_dateTo = info.GetDateTime("_dateTo")
	_caption = info.GetString("_caption")
	
	Try
		_notes = info.GetString("_notes")
	Catch ex As SerializationException
		' Streams written before Notes existed have no "_notes" slot.
		_notes = String.Empty
	End Try
	
	...
End Sub

Simple, isn't it? Elegant? Far from it, IMHO.

I'd expect the SerializationInfo class to have some kind of "lookup" method to query if a slot with a given name is present in the deserialized data, but there is no such method. One approach would be to call SerializationInfo.GetEnumerator and find out if a slot with a given name exists by "manually" iterating through the returned SerializationInfoEnumerator object. Another approach would be trying to read a named slot and ignoring the exception when no slot with a given name exists. This is what I did in the example above.
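
For completeness, the first approach (iterating the enumerator) might be sketched like this. The SerializationInfoHelper module and both of its functions are hypothetical helpers of mine, not part of the framework; they use only SerializationInfo.GetEnumerator and the Name property of SerializationInfoEnumerator.

```vbnet
Imports System
Imports System.Runtime.Serialization

' Hypothetical helper module; not part of the .NET Framework.
Public Module SerializationInfoHelper
    ' Returns True when info contains a slot with exactly this name
    ' (ordinal, case-sensitive comparison).
    Public Function HasSlot( _
     ByVal info As SerializationInfo, _
     ByVal name As String) As Boolean
        Dim slots As SerializationInfoEnumerator = info.GetEnumerator()
        While slots.MoveNext()
            If String.Equals(slots.Name, name) Then
                Return True
            End If
        End While
        Return False
    End Function

    ' Reads a string slot, or returns defaultValue when the slot is
    ' missing (for example in a stream written by an older version).
    Public Function GetStringOrDefault( _
     ByVal info As SerializationInfo, _
     ByVal name As String, _
     ByVal defaultValue As String) As String
        If HasSlot(info, name) Then
            Return info.GetString(name)
        End If
        Return defaultValue
    End Function
End Module
```

With such a helper, the deserialization constructor could read the new field as _notes = SerializationInfoHelper.GetStringOrDefault(info, "_notes", String.Empty) with no exception involved. The price is a linear scan over the slots for every lookup, which should be negligible for classes with a handful of fields.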

(I know that the latter approach violates the rule that one should never use exceptions to handle the "normal" code flow, but I can live with that:-).

To wrap things up, here is the message I'd like you to remember:

Always implement ISerializable and never rely on the "automatic" serialization.

When you start with a simple class, it might be tempting to use the "automatic" serialization mechanism just by adding the Serializable attribute to your class. Please, don't do it. (Remember, we've been talking about real applications, not about code samples or quick starts or whatever...) Sooner or later you will have to extend the class (by adding a property, changing a property's type, removing a property...) and your code will have to read serialization streams created with the old code.

This kind of versioning simply cannot be reasonably implemented without using ISerializable.

To better illustrate the differences between the "automatic" and ISerializable approaches, I've built two VB projects, AutomaticSerialization.vbproj and ISerializableSerialization.vbproj. Both are contained in the Serialization.sln solution file. You can download the compressed solution here. After downloading, expand the files, preserving the folder names, and take a look at the source code. I hope you'll find it inspiring reading.

© Palo Mraz - Sunday, July 06, 2003
