Just an interesting observation I made while serializing and deserializing a large List<> of structs: with the default serialization, the output is 2.5 times larger than when I marshal the data to a byte array myself, and the process is more than 30 times slower.
Let's say I have a struct like this:
public struct Sample
{
    public ushort Operation;
    public TimeSpan StartTime;
    public TimeSpan EndTime;
    public int Data1;
    public int Data2;
}
Then if I have a List<Sample> of 500,000 items and use the default BinaryFormatter functionality, the resulting file is 27 MB, and it takes 30.5 seconds to serialize and 20.8 seconds to deserialize.
Now when I have a custom serializer like this (in the container class, which now implements ISerializable)...
public unsafe void GetObjectData(SerializationInfo info, StreamingContext context)
{
    // One contiguous buffer holding every sample back to back.
    var sampleSize = Marshal.SizeOf(typeof(Sample));
    var size = sampleSize * m_samples.Count;
    var bytes = new byte[size];
    fixed (byte* ptr = bytes)
    {
        var intPtr = (IntPtr)ptr;
        for (var idx = 0; idx < m_samples.Count; idx++)
        {
            // Copy the struct into the pinned buffer, then advance to the next slot.
            Marshal.StructureToPtr(m_samples[idx], intPtr, false);
            intPtr += sampleSize;
        }
    }
    info.AddValue("Samples", bytes);
}

public unsafe ContainerClass(SerializationInfo info, StreamingContext context)
{
    var sampleSize = Marshal.SizeOf(typeof(Sample));
    // A direct cast fails immediately if the stored value is not a byte[].
    var bytes = (byte[])info.GetValue("Samples", typeof(byte[]));
    var sampleCount = bytes.Length / sampleSize;
    m_samples = new List<Sample>(sampleCount);
    fixed (byte* ptr = bytes)
    {
        var intPtr = (IntPtr)ptr;
        var sampleType = typeof(Sample);
        for (var idx = 0; idx < sampleCount; idx++)
        {
            // Rebuild each struct from its slot in the buffer.
            m_samples.Add((Sample)Marshal.PtrToStructure(intPtr, sampleType));
            intPtr += sampleSize;
        }
    }
}
... the resulting data is 2.5 times smaller and it takes less than 1 second to serialize or deserialize. Why this huge time difference?
Now I know that with the default serialization we get the ability to add, remove, or rearrange members in the structures without breaking deserialization (which makes versioning easier), but you'd think that if the serializer detects that every member is a value type, it would at least create some sort of "template" in memory to map all entries (which all have to have the same layout).
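For comparison, the same fixed-layout idea can be sketched without unsafe code or Marshal, by writing each field in a fixed order with BinaryWriter. This is only a rough illustration, not what BinaryFormatter does internally. SampleCodec, Write, and Read are hypothetical names made up for this sketch, and the count prefix is my own assumption about the format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public struct Sample
{
    public ushort Operation;
    public TimeSpan StartTime;
    public TimeSpan EndTime;
    public int Data1;
    public int Data2;
}

// Hypothetical helper class, not part of the original container class.
public static class SampleCodec
{
    // Writes an item count, then every field of every sample in a fixed order.
    public static byte[] Write(List<Sample> samples)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(samples.Count);
            foreach (var s in samples)
            {
                writer.Write(s.Operation);       // 2 bytes
                writer.Write(s.StartTime.Ticks); // 8 bytes
                writer.Write(s.EndTime.Ticks);   // 8 bytes
                writer.Write(s.Data1);           // 4 bytes
                writer.Write(s.Data2);           // 4 bytes
            }
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the fields back in exactly the order Write emitted them.
    public static List<Sample> Read(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new BinaryReader(stream))
        {
            var count = reader.ReadInt32();
            var samples = new List<Sample>(count);
            for (var idx = 0; idx < count; idx++)
            {
                Sample s;
                s.Operation = reader.ReadUInt16();
                s.StartTime = new TimeSpan(reader.ReadInt64());
                s.EndTime = new TimeSpan(reader.ReadInt64());
                s.Data1 = reader.ReadInt32();
                s.Data2 = reader.ReadInt32();
                samples.Add(s);
            }
            return samples;
        }
    }
}
```

The trade-off is the one mentioned above: this layout has no member names or type information in it, so any change to Sample means the reader and writer have to be updated together.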
Something else I noticed: If you do:
var samples = new List<Sample>(1000);
...it will serialize/deserialize samples.Capacity items, instead of samples.Count items. So even if you add just one item to that list, it will still serialize/deserialize 1000 items.
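That would fit with how List<T> works: it keeps its items in a backing array whose length is Capacity, and presumably the formatter writes out that whole array. A minimal sketch of two ways to avoid carrying the unused slots, assuming you can touch the list before serializing it (CapacityExample and its methods are hypothetical names, and int stands in for Sample to keep it short):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of the original code.
public static class CapacityExample
{
    // Reserves room for 1000 items but adds only one.
    public static List<int> CreateOversizedList()
    {
        var samples = new List<int>(1000);
        samples.Add(42);
        return samples;
    }

    public static void Demo()
    {
        var samples = CreateOversizedList();
        Console.WriteLine("Count={0}, Capacity={1}", samples.Count, samples.Capacity); // Count=1, Capacity=1000

        // Option 1: shrink the backing array so Capacity matches Count.
        samples.TrimExcess();
        Console.WriteLine("Count={0}, Capacity={1}", samples.Count, samples.Capacity); // Count=1, Capacity=1

        // Option 2: copy exactly Count items into an array and serialize that instead.
        int[] exact = samples.ToArray();
        Console.WriteLine("Array length={0}", exact.Length); // Array length=1
    }
}
```

Note that TrimExcess does nothing when the list is already above 90% of its capacity, so it only helps for lists that are clearly oversized.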
Just seems a bit slow and bloaty to me.