How To: Improve Serialization Performance
http://msdn.microsoft.com/library/en-us/dnpag/html/ScaleNetHowTo01.asp
J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman
Related Links
* Improving .NET Application Performance and Scalability home page
* Chapter 4, "Architecture and Design Review of a .NET Application for Performance and Scalability"
* Chapter 9, "Improving XML Performance"
* Chapter 10, "Improving Web Services Performance"
* Chapter 12, "Improving ADO.NET Performance"
* "Checklist: ADO.NET Performance" in the "Checklists" section of this guide
* "Checklist: Web Services Performance" in the "Checklists" section of this guide
* "Checklist: XML Performance" in the "Checklists" section of this guide
Summary
This How To shows you how to improve serialization performance. The How To covers the XmlSerializer class that Web services use and the SoapFormatter and BinaryFormatter classes that Microsoft® .NET remoting uses to marshal objects. In addition to providing general performance tips, this How To gives specific consideration to improving the performance of DataSet serialization.
Applies To
* .NET Framework version 1.1
Contents
Overview
What You Must Know
Improving Serialization Performance
Improving DataSet Serialization
Web Services Serialization Considerations
Remoting Serialization Considerations
Additional Resources
Overview
Serialization is used to persist the state of an object so that the object can be saved and then regenerated later. ASP.NET uses serialization to save objects in session state. Serialization is also used when an object is passed across a remoting boundary, such as an application domain, process, or computer. Finally, serialization is used if parameters are passed to and from Web services.
The .NET Framework provides two serialization mechanisms:
* ASP.NET Web services use the XmlSerializer class to perform serialization.
* .NET remoting uses the two classes that implement IFormatter: the BinaryFormatter and the SoapFormatter. To support serialization by a formatter object, a type must be marked with the Serializable attribute.
Serialization performance is an important consideration for .NET applications because serialization is used frequently. There are a number of techniques that you can use to improve performance. These are described in this How To.
What You Must Know
If you plan to use serialization, you should know the following:
* Consider the data contract between client and server, and ensure that your interface is designed with efficiency of remote access in mind. For example, avoid chatty interfaces, and, where necessary, implement a data façade to wrap existing chatty interfaces and reduce round trips.
* The XmlSerializer used by Web services serializes both the public fields and properties of a class.
* The BinaryFormatter and SoapFormatter classes used by .NET remoting require that you serialize all of the fields of a class, including those marked as private, whenever you pass an object by value to a remote method call.
* The XmlSerializer provides faster serialization of DataSet objects than the BinaryFormatter and SoapFormatter because it does not serialize private data. DataSet objects maintain collections of internal properties to supply functionality, such as DataViews and XML Diffgrams, which can be expensive to serialize.
* Any type can be serialized by the XmlSerializer class, provided that it has a public constructor and at least one public member that can be serialized, and it does not have declarative security. Types that include member variables that cannot be handled by XmlSerializer, such as Hashtable, are not serialized.
* The BinaryFormatter produces a more compact byte stream than SoapFormatter. SoapFormatter is generally used for cross-platform interoperability.
* When you use the Serializable attribute, .NET run-time serialization uses reflection to identify the data that should be serialized. All nontransient fields are serialized, including public, private, protected, and internal fields. XML serialization uses reflection to generate special classes to perform the serialization.
* The ISerializable interface allows you to explicitly control how data is serialized.
* Binary serialization usually outperforms XML serialization because its output is more compact.
* XML serialization cannot serialize classes such as Hashtable and ListDictionary that implement IDictionary. If you need to serialize objects that implement IDictionary, you must implement your own custom serialization functionality.
* You should avoid serializing security-sensitive data by annotating sensitive fields with the NonSerialized or XmlIgnore attributes as described in "Using the NonSerialized or XmlIgnore Attributes," later in this How To.
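One possible form of the custom functionality mentioned above for IDictionary types is sketched here. It hides the Hashtable from XmlSerializer with XmlIgnore and exposes its contents through an array property that XmlSerializer can handle. The Settings and Setting classes and their member names are hypothetical and exist only for this illustration; treat it as a sketch rather than a recommended design.

```csharp
using System;
using System.Collections;
using System.IO;
using System.Xml.Serialization;

// Hypothetical key/value pair type that XmlSerializer can handle.
public class Setting
{
    public string Key;
    public string Value;
}

// Hypothetical class that keeps its data in a Hashtable.
public class Settings
{
    // XmlSerializer cannot handle IDictionary types, so hide the Hashtable.
    [XmlIgnore]
    public Hashtable Values = new Hashtable();

    // Array view of the Hashtable that is serialized in its place.
    public Setting[] Entries
    {
        get
        {
            Setting[] result = new Setting[Values.Count];
            int i = 0;
            foreach (DictionaryEntry entry in Values)
            {
                Setting s = new Setting();
                s.Key = (string)entry.Key;
                s.Value = (string)entry.Value;
                result[i++] = s;
            }
            return result;
        }
        set
        {
            Values.Clear();
            if (value == null) return;
            foreach (Setting s in value) Values[s.Key] = s.Value;
        }
    }
}

public class DictionaryXmlDemo
{
    // Serializes to XML text and reads it back into a new instance.
    public static Settings RoundTrip(Settings source)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Settings));
        StringWriter writer = new StringWriter();
        serializer.Serialize(writer, source);
        return (Settings)serializer.Deserialize(new StringReader(writer.ToString()));
    }

    public static void Main()
    {
        Settings original = new Settings();
        original.Values["color"] = "blue";
        Settings copy = RoundTrip(original);
        Console.WriteLine(copy.Values["color"]); // prints "blue"
    }
}
```

The sketch assumes string keys and values. Other key or value types would need their own entry class.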
Improving Serialization Performance
There are multiple ways that you can improve run-time serialization performance. For example, you can reduce the size of the serialized data stream by instructing the run-time serializers to ignore specific fields within your class. Another way to improve performance is to implement the
ISerializable interface to gain explicit control over the serialization (and deserialization) process.
Using the NonSerialized or XmlIgnore Attributes
You can use attributes to prevent specific fields in your class from being serialized. This reduces the size of the output stream and reduces serialization processing overhead. This technique is also useful to prevent security-sensitive data from being serialized.
There are two attributes: NonSerialized and XmlIgnore. The one you should use depends on the serializer that you are using.
* The SoapFormatter and BinaryFormatter classes used by .NET remoting recognize the NonSerialized attribute.
* The XmlSerializer class used by Web services recognizes the XmlIgnore attribute.
The following code fragment shows the XmlIgnore attribute.
[Serializable]
public class Employee
{
    public string FirstName;
    [XmlIgnore]
    public string MiddleName;
    public string LastName;
}
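For the remoting formatters, the equivalent is the NonSerialized attribute. The following sketch assumes the .NET Framework BinaryFormatter and uses a hypothetical Account class. It round-trips an object through a MemoryStream to show that the marked field is left out of the stream.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type used only for this illustration.
[Serializable]
public class Account
{
    public string Owner;

    // Skipped by BinaryFormatter and SoapFormatter; null after deserialization.
    [NonSerialized]
    public string Password;
}

public class NonSerializedDemo
{
    // Serializes the object graph to memory and reads it back.
    public static object RoundTrip(object graph)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        MemoryStream stream = new MemoryStream();
        formatter.Serialize(stream, graph);
        stream.Position = 0;
        return formatter.Deserialize(stream);
    }

    public static void Main()
    {
        Account original = new Account();
        original.Owner = "alice";
        original.Password = "secret";

        Account copy = (Account)RoundTrip(original);
        Console.WriteLine(copy.Owner);            // prints "alice"
        Console.WriteLine(copy.Password == null); // prints "True"
    }
}
```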
Using ISerializable for Explicit Control
The ISerializable interface gives you explicit control over how your class is serialized. However, you should only implement this interface as a last resort. If you take this approach, your class cannot benefit from new formatters provided by future versions of the .NET Framework or from improvements to the serialization that the framework provides.
Note: In general, you should avoid implementing ISerializable for the following reasons:
* It requires derived classes to implement ISerializable to participate in serialization.
* It requires that you override the constructor and GetObjectData.
* It limits the type from taking advantage of future features and performance improvements.
Implementing ISerializable
The ISerializable interface contains a single method, GetObjectData, which you use to specify precisely which data should be serialized.
public interface ISerializable
{
    void GetObjectData(SerializationInfo info, StreamingContext context);
}
The following code shows a simple implementation of the GetObjectData method. Data is retrieved from the current object instance and stored in the SerializationInfo object.
public void GetObjectData(SerializationInfo info, StreamingContext context)
{
    info.AddValue("id", ID);
    info.AddValue("firstName", firstName);
    ...
    info.AddValue("zip", zip);
}
When you implement ISerializable, you must also create a new constructor that accepts SerializationInfo and StreamingContext parameters. This constructor is called by the .NET runtime to de-serialize your object. In the constructor, you read data out of the supplied SerializationInfo object and store the data in the current object instance, as shown in this example.
[Serializable]
public class CustomerInterface : ISerializable
{
    protected CustomerInterface(SerializationInfo info, StreamingContext context)
    {
        ID = info.GetInt32("id");
        firstName = info.GetString("firstName");
        ...
        zip = info.GetString("zip");
    }
    ...
}
Serializing Base Class Members
When you implement ISerializable, be sure to serialize base class members. If the base class also implements ISerializable, you can call the base class's GetObjectData. If the base class does not implement ISerializable, you need to store each required value.
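The first case, where the base class already implements ISerializable, might look like the following sketch. The Contact and Supplier classes are hypothetical, and the code assumes the .NET Framework BinaryFormatter. The derived class chains both GetObjectData and the deserialization constructor to the base class, so each class handles only its own fields.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical base class that implements ISerializable.
[Serializable]
public class Contact : ISerializable
{
    private string name;

    public Contact(string name) { this.name = name; }

    // Deserialization constructor: restores the base class state.
    protected Contact(SerializationInfo info, StreamingContext context)
    {
        name = info.GetString("name");
    }

    public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("name", name);
    }

    public string Name { get { return name; } }
}

// Hypothetical derived class that adds one field of its own.
[Serializable]
public class Supplier : Contact
{
    private int rating;

    public Supplier(string name, int rating) : base(name) { this.rating = rating; }

    // Chain to the base constructor so the base class members are restored too.
    protected Supplier(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        rating = info.GetInt32("rating");
    }

    public override void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        base.GetObjectData(info, context); // store the base class members first
        info.AddValue("rating", rating);
    }

    public int Rating { get { return rating; } }
}

public class BaseMembersDemo
{
    public static Supplier RoundTrip(Supplier source)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        MemoryStream stream = new MemoryStream();
        formatter.Serialize(stream, source);
        stream.Position = 0;
        return (Supplier)formatter.Deserialize(stream);
    }

    public static void Main()
    {
        Supplier copy = RoundTrip(new Supplier("Acme", 5));
        Console.WriteLine(copy.Name + " " + copy.Rating); // prints "Acme 5"
    }
}
```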
Versioning Considerations
If you add, remove, or rename the member variables of a class that you have previously serialized, existing persisted objects cannot be successfully de-serialized. This is especially true for classes that implement ISerializable and just call GetValue. In this case, an exception is generated if the value you request is not present in the serialized stream.
One way to address this issue is to use a SerializationInfoEnumerator to walk through the items in the SerializationInfo object, and then use a switch to set values. With this approach, you only restore those fields that are present in the serialized stream and you can manually initialize any missing fields.
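The enumerator approach can be sketched as follows. The CustomerRecord class is hypothetical, and the code assumes the .NET Framework BinaryFormatter. To imitate a stream written by an older version of the class, GetObjectData in this sketch omits the zip entry when the value is null; the deserialization constructor then falls back to a default instead of throwing.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type used only for this illustration.
[Serializable]
public class CustomerRecord : ISerializable
{
    private int id;
    private string firstName;
    private string zip;

    public CustomerRecord(int id, string firstName, string zip)
    {
        this.id = id;
        this.firstName = firstName;
        this.zip = zip;
    }

    protected CustomerRecord(SerializationInfo info, StreamingContext context)
    {
        // Default used when the stream has no "zip" entry.
        zip = "unknown";

        // Restore only the entries that are actually present in the stream.
        SerializationInfoEnumerator entries = info.GetEnumerator();
        while (entries.MoveNext())
        {
            switch (entries.Name)
            {
                case "id":
                    id = info.GetInt32(entries.Name);
                    break;
                case "firstName":
                    firstName = info.GetString(entries.Name);
                    break;
                case "zip":
                    zip = info.GetString(entries.Name);
                    break;
            }
        }
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("id", id);
        info.AddValue("firstName", firstName);
        // Omitting the entry imitates a stream from a version without zip.
        if (zip != null) info.AddValue("zip", zip);
    }

    public int Id { get { return id; } }
    public string FirstName { get { return firstName; } }
    public string Zip { get { return zip; } }
}

public class VersioningDemo
{
    public static CustomerRecord RoundTrip(CustomerRecord source)
    {
        BinaryFormatter formatter = new BinaryFormatter();
        MemoryStream stream = new MemoryStream();
        formatter.Serialize(stream, source);
        stream.Position = 0;
        return (CustomerRecord)formatter.Deserialize(stream);
    }

    public static void Main()
    {
        CustomerRecord copy = RoundTrip(new CustomerRecord(7, "Ann", null));
        Console.WriteLine(copy.Id + " " + copy.FirstName + " " + copy.Zip); // prints "7 Ann unknown"
    }
}
```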
Improving DataSet Serialization
Many applications pass DataSet objects between remote tiers, although doing so incurs a significant serialization overhead and can cause your application to not meet its performance goals. DataSets are complex objects with a hierarchy of child objects, and as a result, serializing a DataSet is a processor-intensive operation. Also, DataSet objects are serialized as XML even if you use the binary formatter. This means that the output stream is not compact.
There are a number of techniques that you can use to improve DataSet serialization performance.
Using Column Name Aliasing
You can try aliasing long column names with shorter names to reduce the size of the serialized data. The following example shows how you can use aliases for column names by using the as keyword in your SQL.
DataSet objDataset = new DataSet("Customers");
// Alias the long column names with short ones to shrink the serialized output.
SqlDataAdapter myAdapter = new SqlDataAdapter(
    "Select CustomerId as C, CompanyName as D, ContactName as E, ContactTitle as F from Customers",
    myConnection);
myAdapter.Fill(objDataset);
// myConnection, byteData (a byte[] buffer), and iBinForm (the formatter) are assumed to be defined elsewhere.
Stream serializationStream = new MemoryStream(byteData, 0, byteData.Length, true, true);
serializationStream.Position = 0;
iBinForm.Serialize(serializationStream, objDataset);
Avoiding Serializing Multiple Versions of the Same Data
As soon as you make changes to the data in a DataSet, you begin to maintain multiple copies of the data. The DataSet maintains the original data along with the changed values. If you do not need to serialize new and old values, call AcceptChanges before you serialize a DataSet to reset the internal buffers. Depending upon the amount of data held in the DataSet and the number of changes you make, this can significantly reduce the amount of data serialized. This approach is shown in the following code example.
// load some data into the dataset
customers.Fill(northwind, "Customers");
orders.Fill(northwind, "Orders");
// ... modify the data
// accept the changes made and flush the internal buffers
northwind.AcceptChanges();
// ... serialize the dataset
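The effect of AcceptChanges on output size can be checked with a small experiment such as the following sketch. It builds a hypothetical in-memory table instead of filling one from a database, and it measures the DiffGram XML written by WriteXml as a stand-in for the XML that the formatters emit for a DataSet. The table contents and the DiffGramLength helper are assumptions made for the illustration.

```csharp
using System;
using System.Data;
using System.IO;

public class AcceptChangesDemo
{
    // Returns the length of the DataSet written as a DiffGram, which carries
    // both current and original row versions for modified rows.
    public static int DiffGramLength(DataSet ds)
    {
        StringWriter writer = new StringWriter();
        ds.WriteXml(writer, XmlWriteMode.DiffGram);
        return writer.ToString().Length;
    }

    // Builds a DataSet whose rows have all been modified after loading.
    public static DataSet BuildModifiedDataSet()
    {
        DataSet ds = new DataSet("Northwind");
        DataTable customers = ds.Tables.Add("Customers");
        customers.Columns.Add("CustomerId", typeof(int));
        customers.Columns.Add("CompanyName", typeof(string));
        for (int i = 0; i < 100; i++)
        {
            customers.Rows.Add(new object[] { i, "Company " + i });
        }
        ds.AcceptChanges(); // rows are now Unchanged, as if freshly loaded

        // Modify every row so the DataSet keeps original and current versions.
        foreach (DataRow row in customers.Rows)
        {
            row["CompanyName"] = row["CompanyName"] + " Ltd";
        }
        return ds;
    }

    public static void Main()
    {
        DataSet ds = BuildModifiedDataSet();
        int before = DiffGramLength(ds);
        ds.AcceptChanges(); // discard the original row versions
        int after = DiffGramLength(ds);
        Console.WriteLine("Before AcceptChanges: " + before + " characters");
        Console.WriteLine("After AcceptChanges:  " + after + " characters");
    }
}
```

The exact sizes depend on the data, so the sketch prints both values rather than assuming a particular ratio.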
Reducing the Number of DataTables Serialized
If you don't need to send all of the DataTables contained in a DataSet, consider copying the DataTables you need to send into a separate DataSet. This will reduce the amount of data serialized by reducing the DataTables processed and by initializing the change buffers that are used by the DataView.
customers.Fill(northwind, "Customers");
orders.Fill(northwind, "Orders");
// ... use or modify some data
DataSet subset = new DataSet();
// copy just the customer DataTable
subset.Tables.Add(northwind.Tables["Customers"].Copy());
// ... serialize the subset DataSet
Overriding DataSet for Binary Serialization
By default, DataSets are serialized as XML even if you use the BinaryFormatter. This leads to large serialization data streams. To produce a more compact output format, you can consider overriding the DataSet class and implementing your own serialization.
Web Services Serialization Considerations
To reduce the size of serialized data sent to and from Web services, you can consider a number of compression techniques to compress the data streams. You can achieve other optimizations by efficiently initializing the XmlSerializer class and by using XmlIgnore. Consider the following approaches:
* Compressing the serialized data
* Initializing XmlSerializer by calling FromTypes on startup
* Using the XmlIgnore attribute
Compressing the Serialized Data
There are a number of ways that you can compress the serialized data passed to and from Web services:
* Implement SoapExtensions on both server and client side to compress and decompress the data.
* Implement an HttpModule to compress the response, for example by using gzip compression, and then unzip the data on the client in the proxy. To do so, you need to override the GetWebRequest and the GetWebResponse methods for the Web service client proxy as shown here.
// Overriding the GetWebRequest method in the Web service proxy
protected override WebRequest GetWebRequest(Uri uri)
{
    WebRequest request = base.GetWebRequest(uri);
    request.Headers.Add("Accept-Encoding", "gzip, deflate");
    return request;
}
// Overriding the GetWebResponse method in the Web service proxy
protected override WebResponse GetWebResponse(WebRequest request)
{
    WebResponse response = base.GetWebResponse(request);
    // decompress the response from the Web service
    return response;
}
* Use the HTTP compression features in Internet Information Services (IIS) 5.0, and then decompress the response within the client-side proxy by using a utility that understands IIS 5.0 compression. Once again, you need to override the GetWebRequest and the GetWebResponse methods for the Web service client proxy.
Initializing XmlSerializer by Calling FromTypes on Startup
The first time XmlSerializer encounters a type, it generates code to perform serialization and then it caches that code for later use. However, if you call the FromTypes static method on the XmlSerializer, it forces XmlSerializer to immediately generate and cache the required code for the types you plan to serialize. This reduces the time taken to serialize a specific type for the first time. The following example shows this approach.
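static void OnApplicationStart()
{
    // Type.GetType returns null if a name cannot be resolved, so the names must
    // resolve to real types; typeof(...) is safer when the types are known at compile time.
    Type[] myTypes = new Type[] { Type.GetType("customer"), Type.GetType("order") };
    XmlSerializer.FromTypes(myTypes);
}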
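A self-contained variant of this startup call is sketched below, using hypothetical Customer and Order classes and typeof in place of name lookup. FromTypes returns an array of XmlSerializer instances in the same order as the input types; keeping that array is a design choice of this sketch, made so that later code can reuse the instances directly.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical types used only for this illustration.
public class Customer { public string Name; }
public class Order { public int Number; }

public class SerializerCache
{
    // Serializers created once at startup and reused afterwards.
    public static XmlSerializer[] Serializers;

    public static void OnApplicationStart()
    {
        Type[] types = new Type[] { typeof(Customer), typeof(Order) };
        Serializers = XmlSerializer.FromTypes(types);
    }

    public static void Main()
    {
        OnApplicationStart();

        Customer customer = new Customer();
        customer.Name = "Ann";

        // Serializers[0] corresponds to typeof(Customer).
        StringWriter writer = new StringWriter();
        Serializers[0].Serialize(writer, customer);
        Console.WriteLine(writer.ToString());
    }
}
```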
Using the XmlIgnore Attribute
You can consider using the XmlIgnore attribute, as described earlier, to prevent any field that you do not need to serialize from being included in the output stream.
Remoting Serialization Considerations
The .NET remoting infrastructure uses formatters that implement the IFormatter interface to perform serialization. The two formatters provided by the .NET Framework are SoapFormatter and BinaryFormatter, although you can implement your own. When you use .NET remoting, all nontransient fields are serialized. This includes private, protected, and internal fields.
Using the NonSerialized Attribute
To optimize performance and security, consider using the NonSerialized attribute as described previously to prevent unnecessary or security-sensitive fields from being serialized.
DataSets and Remoting
If your application uses DataSets and you experience serialization performance issues, consider implementing a serialization wrapper class. By implementing a serialization wrapper class, you can reduce the transient memory allocations that remoting typically performs. For an explanation of the issue and a sample, see Microsoft Knowledge Base article 829740, "Improving DataSet Serialization and Remoting Performance," at http://support.microsoft.com/default.aspx?scid=kb;en-us;829740.
Additional Resources
For more information, see the following resources:
* "Binary Serialization of ADO.NET Objects," at http://msdn.microsoft.com/msdnmag/issues/02/12/CuttingEdge/default.aspx
* Chapter 9, "Improving XML Performance"
* Chapter 10, "Improving Web Services Performance"
* Chapter 11, "Improving Remoting Performance"
* Chapter 12, "Improving ADO.NET Performance"