Thursday, December 4, 2008

Study Notes: Serialization.

Reference: http://msdn.microsoft.com/en-us/magazine/cc301761.aspx

Serialization is the process of converting an object or a connected graph of objects into a contiguous stream of bytes. Deserialization is the process of converting a contiguous stream of bytes back into a graph of connected objects.

How serialization works:

1.1) If any fields in the graph refer to other objects, the formatter serializes those objects, too. In other words, serializing a graph and then deserializing it is a convenient way to implement a deep clone.

1.2) If two objects in the graph refer to each other, then the formatter detects this, serializes each object just once, and avoids entering into an infinite loop.
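
Points 1.1 and 1.2 can be tried out with a small round trip. This is only a sketch: the Node type and the DeepClone helper are made up for illustration, and it assumes the .NET Framework (or another runtime where BinaryFormatter is still enabled; it is obsolete and disabled by default in recent .NET versions).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type for illustration; it is not part of the referenced article.
[Serializable]
class Node
{
    public string Name;
    public Node Next;   // may point back to an earlier node, forming a cycle
}

static class CloneDemo
{
    // Deep clone: serialize the whole graph into memory, then read it back.
    public static T DeepClone<T>(T graph)
    {
        using (var stream = new MemoryStream())
        {
            var formatter = new BinaryFormatter();
            formatter.Serialize(stream, graph);
            stream.Position = 0;                      // rewind before reading
            return (T)formatter.Deserialize(stream);
        }
    }

    static void Main()
    {
        var a = new Node { Name = "a" };
        var b = new Node { Name = "b", Next = a };
        a.Next = b;                                   // a -> b -> a (cycle)

        Node copy = DeepClone(a);

        // The copy is a different object from the original...
        Console.WriteLine(ReferenceEquals(copy, a));
        // ...and the cycle is preserved inside the copied graph.
        Console.WriteLine(ReferenceEquals(copy.Next.Next, copy));
    }
}
```

If the formatter did not track objects it had already written, the Serialize call would never finish on this graph.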

1.3) It is possible and also quite useful to serialize multiple object graphs out to a single byte stream.

1.4) When serializing an object, the full name of the type and the name of the type's defining assembly are written to the byte stream. By default, the BinaryFormatter and SoapFormatter types output the assembly's full identity, which includes the assembly's file name (without extension), version number, culture, and public key information.

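1.5) The whole process. A formatter relies on the static methods of System.Runtime.Serialization.FormatterServices.

a) Serialization:
a.1) MemberInfo[] GetSerializableMembers(Type type); returns the type's serializable fields.
a.2) Object[] GetObjectData(Object obj, MemberInfo[] members); returns the value of each of those fields, in the same order.
a.3) The formatter writes the assembly identity and the type's full name to the stream, followed by each member's name and value from the two arrays.

b) Deserialization:
b.1) Type GetTypeFromAssembly(Assembly assem, String name); finds the type named in the stream.
b.2) Object GetUninitializedObject(Type type); allocates the object without calling any constructor.
b.3) MemberInfo[] GetSerializableMembers(Type type); returns the fields to fill in.
b.4) Object PopulateObjectMembers(Object obj, MemberInfo[] members, Object[] data); assigns the values read from the stream to those fields.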
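
The steps above can be walked through by hand with FormatterServices. This is a rough sketch rather than what a real formatter does internally: the Point type and the RoundTrip helper are invented for illustration, nothing is actually written to a stream (the values stay in memory), and the already loaded assembly is reused instead of being looked up from a stored identity.

```csharp
using System;
using System.Reflection;
using System.Runtime.Serialization;

// Hypothetical type for illustration.
[Serializable]
class Point
{
    public static int ConstructorCalls;   // static, so it is never serialized
    public int X;
    public int Y;

    public Point() { ConstructorCalls++; }
}

static class FormatterStepsDemo
{
    public static Point RoundTrip(Point original)
    {
        // a.1, a.2: find the serializable fields and read their values.
        MemberInfo[] members = FormatterServices.GetSerializableMembers(typeof(Point));
        object[] data = FormatterServices.GetObjectData(original, members);

        // a.3: a real formatter would now write the assembly identity, the type
        // name and the member names and values to the stream. Here they are
        // simply kept in local variables.
        Assembly assembly = typeof(Point).Assembly;
        string typeName = typeof(Point).FullName;

        // b.1: resolve the type from the assembly and the stored name.
        Type type = FormatterServices.GetTypeFromAssembly(assembly, typeName);

        // b.2: allocate the object; no constructor runs.
        object blank = FormatterServices.GetUninitializedObject(type);

        // b.3, b.4: look up the fields again and fill them with the saved values.
        MemberInfo[] targetMembers = FormatterServices.GetSerializableMembers(type);
        return (Point)FormatterServices.PopulateObjectMembers(blank, targetMembers, data);
    }

    static void Main()
    {
        var original = new Point { X = 3, Y = 4 };
        Point copy = RoundTrip(original);

        Console.WriteLine(copy.X + "," + copy.Y);
        // The constructor ran only for 'original', not for the copy.
        Console.WriteLine(Point.ConstructorCalls);
    }
}
```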


Dev notes:

2.1) By default, types are not serializable. The developer must apply the System.SerializableAttribute custom attribute to a type to make it serializable.

2.2) For performance reasons, formatters do not verify that all of the objects in the graph are serializable before serializing the graph. So, when serializing an object graph, it is entirely possible that some objects may be serialized to the byte stream before the SerializationException is thrown. If this happens, the byte stream is corrupt.

I recommend that you serialize the objects into a MemoryStream first. Then, if all objects are successfully serialized, you can copy the bytes in the MemoryStream to whatever stream (file or network, for example) you really want the bytes written to.
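
A minimal sketch of that recommendation follows. The SafeSerializer helper and the Holder and Unmarked types are hypothetical names used only for this example, and it again assumes a runtime where BinaryFormatter is available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

class Unmarked { public int Value = 1; }      // deliberately NOT [Serializable]

[Serializable]
class Holder
{
    public string Label = "ok";
    public Unmarked Bad = new Unmarked();     // makes the graph unserializable
}

static class SafeSerializer
{
    // Writes to 'destination' only if the whole graph serialized successfully.
    public static bool TrySerialize(object graph, Stream destination)
    {
        using (var buffer = new MemoryStream())
        {
            try
            {
                new BinaryFormatter().Serialize(buffer, graph);
            }
            catch (SerializationException)
            {
                return false;                 // destination is left untouched
            }
            buffer.WriteTo(destination);      // copy the complete bytes in one go
            return true;
        }
    }

    static void Main()
    {
        using (var destination = new MemoryStream())   // stands in for a file or socket
        {
            bool ok = TrySerialize(new Holder(), destination);
            Console.WriteLine(ok);                     // the Unmarked field makes this fail
            Console.WriteLine(destination.Length);     // nothing partial was written
        }
    }
}
```

Any partially written bytes end up in the temporary buffer, which is discarded, so the real destination never sees a corrupt stream.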

2.3) The SerializableAttribute custom attribute may be applied to reference types (class), value types (struct), enumerated types (enum), and delegate types (delegate) only.

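2.4) You can easily make a derived class serializable if the base class is serializable, but the attribute is not inherited, so the derived class must apply it itself. If the base class is not serializable, applying the Serializable attribute to the derived class does not help: the formatter cannot serialize the base class's fields and throws a SerializationException.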

2.5) You may want to make all your classes serializable to give users of your types more flexibility. However, be aware that serialization reads all of an object's fields regardless of whether they are declared public, protected, internal, or private, so any sensitive data held in those fields ends up in the byte stream.

2.6) You can apply NonSerializedAttribute to a type's fields so that those fields are ignored during serialization. The attribute continues to apply to these fields when the type is inherited by another type.

You may need to implement the System.Runtime.Serialization.IDeserializationCallback interface if you need to restore those NonSerialized fields after deserialization.
During deserialization, the formatter checks whether the object it is deserializing comes from a type that implements the IDeserializationCallback interface. If the type does implement this interface, the formatter adds the object's reference to an internal list. After all of the objects have been deserialized, the formatter walks this list and calls each object's OnDeserialization method.
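
As a sketch of how the two pieces fit together, the Circle type below skips a cached field during serialization and recomputes it in OnDeserialization. The RoundTrip helper is a hypothetical name, and the example assumes a runtime where BinaryFormatter is available.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Circle : IDeserializationCallback
{
    private double radius;

    [NonSerialized]
    private double area;          // derived from radius, so not worth storing

    public Circle(double radius)
    {
        this.radius = radius;
        area = Math.PI * radius * radius;
    }

    public double Area { get { return area; } }

    // Called by the formatter once the whole graph has been deserialized.
    void IDeserializationCallback.OnDeserialization(object sender)
    {
        area = Math.PI * radius * radius;   // rebuild the skipped field
    }
}

static class CallbackDemo
{
    public static Circle RoundTrip(Circle circle)
    {
        using (var stream = new MemoryStream())
        {
            var formatter = new BinaryFormatter();
            formatter.Serialize(stream, circle);
            stream.Position = 0;
            return (Circle)formatter.Deserialize(stream);
        }
    }

    static void Main()
    {
        var original = new Circle(2.0);
        Circle copy = RoundTrip(original);

        // Without the callback, the copy's area would be 0 because the field
        // was never written and no constructor ran during deserialization.
        Console.WriteLine(copy.Area == original.Area);
    }
}
```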
