Return to HomePage



Code Review: ADO.NET 1.1 Performance

Source: http://msdn.microsoft.com/library/en-us/dnpag/html/ScaleNetChapt13.asp
J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman

Overview

Code reviews should be a regular part of your development process. Performance and scalability code reviews focus on identifying coding techniques and design choices that could lead to performance and scalability issues. The review goal is to identify potential performance and scalability issues before the code is deployed. The cost and effort of fixing performance and scalability flaws at development time is far less than fixing them later in the product deployment cycle.
Avoid performance code reviews too early in the coding phase because this can restrict your design options. Also, bear in mind that that performance decisions often involve tradeoffs. For example, it is easy to reduce maintainability and flexibility while striving to optimize code.
This chapter begins by highlighting the most significant issues that time and again result in inefficient code and suboptimal performance. The chapter then presents the review questions you need to ask while reviewing managed code. These questions apply regardless of the type of managed application you are building. Subsequent sections focus on questions specific to ASP.NET, interoperability with unmanaged code, Enterprise Services, Web services, .NET remoting, and data access. The chapter concludes by identifying a set of tools that you can use to help perform your code reviews.
How to Use This Chapter
This chapter presents the questions that you need to ask to expose potential performance and scalability issues in your managed code. To get the most out of this chapter, do the following:
* Jump to topics or read from beginning to end. The main headings in this chapter help you locate the topics that interest you. Alternatively, you can read the chapter from beginning to end to gain a thorough appreciation of the areas to focus on while performing performance-related code reviews.
* Read Chapter 3, "Design Guidelines for Application Performance." Read Chapter 3 to help ensure that you do not introduce bottlenecks at design time.
* Know your application architecture. Before you start to review code, make sure you fully understand your application's architecture and design goals. If your application does not adhere to best practices architecture and design principles for performance, it is unlikely to perform or scale satisfactorily, even with detailed code optimization. For more information, see Chapter 3, "Design Guidelines for Application Performance," and Chapter 4, "Architecture and Design Review of a .NET Application for Performance and Scalability."
* Scope your review. Identify the priority areas in your application where the review should focus. For example, if you have an online transaction processing (OLTP) database, data access is typically the key area where the most number of optimizations are probable. Similarly, if your application contains complex business logic, focus initially on the business layer. While you should focus on high impact areas, keep in mind the end-to-end flow at the application level.
* Read "Application Performance" chapters. Read the "Application Performance and Scalability" chapters found in Part III of this guide to discover technical solutions to problems raised during your code review.
* Update your coding standards. During successive code reviews, identify key characteristics that appear repeatedly and add those to your development department's coding standards. Over time, this helps raise developer awareness of the important issues and helps reduce common performance-related coding mistakes and encourage best practices during development.
* Use and evolve the accompanying checklist in the "Checklists" section of this guide. Use the "Checklist: Code Review" checklist to quickly view and evaluate the guidelines presented in this chapter. Extend and evolve the checklists to reflect your specific application.

ADO.NET
Use the following questions in this section to review the efficiency of your application's data access:
* Do you use connections efficiently?
* Do you use commands efficiently?
* Do you use stored procedures efficiently?
* Do you use Transact-SQL?
* Do you use Parameters?
* Do you use DataReaders?
* Do you use DataSets?
* Do you use Transactions?
* Do you use Binary Large Objects (BLOBS)?
* Do you page through data?

Do You Use Connections Efficiently?
Use the following review questions to review your code's use of database connections:
* Do you close your connections properly?
Keeping too many open connections is a common pitfall. Ensure you close your connections properly to reduce resource pressure. Identify areas in your code where you are using connections, and ensure the following guidelines are followed:
* Open and close the connection within the method.
* Explicitly close connections using a finally or using block.
* When using DataReaders, specify CommandBehavior.CloseConnection.
* If using Fill or Update with a DataSet, do not explicitly open the connection. The Fill and Update methods automatically open and close the connection.

* Do you pool your database connections?
Creating database connections is expensive. You can reduce the creation overhead by pooling your database connections.
You can pool connections by connecting to a database as a single identity rather than flowing the identity of original caller to the database. Flowing the caller's identity results in a separate connection pool for each user. Changing the connection string even by adding an empty space creates a separate pool for that connection string. If you are pooling your database connections, make certain that you call Close or Dispose on the connection as soon as you are done with the connection. This ensures that it is promptly returned to the pool.
* Is the pool size set correctly?
It is important to optimize the maximum and minimum levels of the pool size to maximize the throughput for your application. If you set the maximum levels to values that are too high, you may end up creating deadlocks and heavy resource utilization on the database. If you use values that are too low, you run the risk of under utilizing the database and queuing up the requests.
Determine appropriate maximum and minimum values for the pool during performance testing and performance tuning.
* What data provider do you use?
Make sure that your code uses the correct data provider. Each database-specific provider is optimized for a particular database:
* Use System.Data.SqlClient for SQL Server 7.0 and later.
* Use System.Data.OleDb for SQL Server 6.5 or OLE DB providers.
* Use System.Data.ODBC for ODBC data sources.
* Use System.Data.Oracle.Client for Oracle.
* Use SQLXML managed classes for XML data and SQL Server 2000.

* Do you check the State property of OleDbConnection?
Using the State property causes an additional round trip to the database. If you need to check the status of the connection, consider handling the StateChange event.

More Information
For more information about the questions and issues raised in this section, see "Connections" in Chapter 12, "Improving ADO.NET Performance."
Do You Use Commands Efficiently?
Use the following review questions to help review how efficiently your code uses database commands:
* Do you execute queries that do not return data?
If you do not return values from your stored procedure, use ExecuteNonQuery for optimum performance.
* Do you execute queries that only return a single value?
Identify queries that return only a single value. Consider changing the query to use return values and use Command.ExecuteNonQuery, or if you do not have control over the query, use Command.ExecuteScaler, which returns the value of the first column of the first row.
* Do you access very wide rows or rows with BLOBs?
If you are accessing very wide rows or rows with BLOB data, use CommandBehavior.SequentialAccess in conjunction with GetBytes to access BLOB in chunks.
* Do you use CommandBuilder at runtime?
CommandBuilder objects are useful for design time, prototyping, and code generation. However, you should avoid using them in production applications because the processing required to generate commands can affect performance. Ensure you are not using the CommandBuilder objects at run time.

More Information
For more information about the questions and issues raised in this section, see "Commands" in Chapter 12, "Improving ADO.NET Performance."
Do You Use Stored Procedures?
Use the following review questions to review your code's use of stored procedures:
* Have you analyzed the stored procedure query plan?
During your application's development stage, you should analyze your stored procedure query plan. Recompilation is not necessarily a bad thing; the optimizer recompiles when initial plan is not optimal for other calls. By monitoring and reducing frequent recompilation, you could avoid performance hits. You can monitor recompiling stored procedures by creating a trace in SQL Profiler and track for the SP:Recompile event. Identify the cause of recompilation and take corrective actions. For more information, see “Execution Plan Recompiles” in Chapter 14, “Improving SQL Server Performance.”
* Do you have multiple statements within the stored procedure?
Use SET NOCOUNT ON when you have multiple statements within your stored procedures. This prevents SQL Server from sending the DONEINPROC message for each statement in the stored procedure and reduces the processing SQL Server performs, as well as the size of the response sent across the network.
* Do you return a resultset for small amounts of data?
You should use output parameters and ExecuteNonQuery to return small amounts of data instead of returning a result set that contains a single row. This avoids the performance overhead associated with creating the result set on the server. If you need to return several output parameters, you can select them into variables and then emit a single row by selecting with all the variables so there’s one resultset for all.
* Do you use CommandType.Text with OleDbCommand?
If you use the OleDbCommand, use CommandType.Text. If you use CommandType.StoredProcedure, ODBC call syntax is generated by the provider anyway. By using explicit call syntax, you reduce the work of the provider.

More Information
For more information about the questions and issues raised in this section, see "Stored Procedures" in Chapter 12, "Improving ADO.NET Performance."
Do You Use Transact SQL?
If you use T-SQL, review the following questions:
* Do you restrict the amount of data selected?
Returning large amounts of data increases query time and the time it takes to transfer the data across the network. Similarly updating large amounts of data increases the load on the database server. Avoid using SELECT * in your queries and check that you restrict the amount of data that you select in your queries, for example, by using an appropriate WHERE clause.
* Do you use Select Top in rows?
Using Top in your SELECT statements enables you to limit the number of rows that can be returned by the select command. If you implement client-side paging, it makes sense to make use this feature. The query processing is aborted when the specified number of rows have been retrieved.
For more information about paging data, see "How To: Page Records in .NET Applications" in the “How To” section of this guide.
* Do you select only the columns you need?
Select only columns you need instead of using SELECT * queries. This reduces the network traffic in addition to reducing the processing on the database server.
Reducing your columns to the minimum also makes it easier for SQL Server to use an index to cover your query. If all the columns you need are in a usable index that is smaller than the main table, less I/O is required because the index contains the full result. Indexes are often created exactly for this reason, or columns are added to existing indexes not because of the sorting needs but to make the index better at “covering” the necessary queries. Creation of “covering” indexes is vital because if the index does not cover the query, the main table needs to be access (a so - called bookmark lookup from the index). From a performance perspective, these are equivalent to using joins.
* Do you batch multiple queries to avoid round trips?
Batching is the process of sending several SQL statements in one trip to the server. Batching can increase performance by reducing round trips to the database. Where possible, batch multiple SQL statements together and use the DataReader.NextResult method to improve performance. Another alternative is to batch multiple SQL statements within a stored procedure.

Do You Use Parameters?
Use the following review questions to review your code's use of parameters:
* Do you use parameters for all your stored procedures and SQL statements?
Using parameters when calling SQL statements as well as stored procedures can increase performance. Identify areas in your code where you call SQL statements or stored procedures, and ensure that you are explicitly creating parameters and supplying the parameter type, size, precision, and scale.
* Do you explicitly specify the parameter types?
Specifying the parameter types prevents unnecessary type conversions that are otherwise performed by the data provider. Use the enumeration type that is relevant for the connection used by you; for example, SqlDbType or OledbType.
* Do you cache the parameters for a frequently called stored procedure?
Consider caching the stored procedure parameters if you invoke stored procedures frequently to improve performance. If ASP.NET pages calls stored procedures, you can use cache APIs. If your data access code is factored into a separate component, caching helps only if your components are stateful. A good approach is to cache parameter arrays in a Hashtable. Each parameter array contains the parameters that are required by a particular stored procedure used by a particular connection.

More Information
For more information about the questions and issues raised in this section, see "Parameters" in Chapter 12, "Improving ADO.NET Performance."
Do you use DataReaders?
If you use DataReaders, review the following questions:
* Do you close your DataReaders?
Scan your code to ensure you are closing your DataReaders as soon as you are finished with them. You should call Close or Dispose in a finally block. If you pass a DateReader back from a method, use CommandBahavior.CloseConnection to ensure the connection gets closed when the reader is closed.
* Do you use index to read from a DataReader?
All output from a DataReader should be read using an index (for example, rdr.GetString(0)) which is faster, but for readability and maintainability, you might prefer to use the string names of the columns. If you are accessing the same columns multiple times (for example, when you retrieve a number of rows), you should use local variables that store the index number of the columns. You can use rdr.GetOrdinal() to retrieve the ordinal position of a column.
For more information, see "Use GetOrdinal when Using an Index-Based Lookup" in Chapter 12, "Improving ADO.NET Performance."

Do You Use DataSets?
Use the following review questions to review your code's use of DataSets:
* Do you serialize DataSets?
Inefficient serializing of DataSets is a major performance issue for remote calls. You should avoid sending DataSets (especially when using .NET remoting) and consider alternative means of sending data over the wire, such as arrays or simple collections, where possible.
If you serialize DataSets, make sure you adhere to the following guidelines:
* Only return relevant data in the DataSet.
* Consider using alias column names to shorter actual column names. This helps reduce the size of the DataSet.
* Avoid multiple versions of the data. Call AcceptChanges before serializing a DataSet.
* When serializing a DataSet over a Remoting channel, use the DataSetSurrogate class.

For more information, see "How To: Improve Serialization Performance" in the “How To” section of this guide, and Knowledge Base article 829740, "Improving DataSet Serialization and Remoting Performance," at http://support.microsoft.com/default.aspx?scid=kb;en-us;829740.
* Do you search data which has a primary key column?
If you need to search a DataSet using a primary key, create the primary key on the DataTable. This creates an index that the Rows.Find method can use to quickly find the required records. Avoid using DataTable.Select, which does not use indices.
* Do you search data which does not have a primary key?
If you need to repetitively search by nonprimary key data, create a DataView with a sort order. This creates an index that can be used to improve search efficiency. This is best suited to repetitive searches as there is some cost to creating the index.
* Do you use DataSets for XML data?
If you do not pass the schema for the XML data, the DataSet tries to infer the schema at run time. Pass XmlReadMode.IgnoreSchema to the ReadXml method to ensure that schema is not inferred.

More Information
For more information about the questions and issues raised in this section, see "Connections" in Chapter 12, "Improving ADO.NET Performance."
Do You Use Transactions?
Use the following review questions to review your code's use of transactions:
* What isolation level do you use?
Different isolation levels have different costs. Applications may have to operate at different transaction isolation levels, depending on their business needs. You need to choose the isolation level that is appropriate for the scenario. For example, scenarios that require a high degree of data integrity need a higher isolation level.
* Do you have long-running transactions?
Having a long-running transaction with high isolation levels prevents other users from reading the data. Instead of locking resources for the duration of the transaction, consider accommodating various states within your schema (for example, ticket status PENDING, instead of locking the row). Another option is to use compensating transactions.
* Did you turn off automatic transaction enlistment if it’s not needed?
If you use the.NET Framework Data Provider for SQL Server, you can turn off automatic transaction enlistment by setting Enlist to false in the connection string, as shown in the following code, when you are not dealing with an existing distribution transaction:

SqlConnection LondonSqlConnection = new SqlConnection( "Server=London;Integrated Security=true;Enlist=false;");


More Information
For more information about the questions and issues raised in this section, see "Transactions" in Chapter 12, "Improving ADO.NET Performance."
Do You Use Binary Large Objects (BLOBS)?
Use the following review questions to review your code's use of BLOB data:
* Do you store BLOBs in the database?
Reading and writing BLOBs to and from a database is an expensive operation, not only from a database perspective, but also from a code perspective. This is because there is also a memory impact associated with accessing BLOB data. If you store files such as images or documents that are frequently accessed by a Web server, consider storing the files on the Web server’s file system and maintaining a list of all the objects in the database. This can increase performance by avoiding frequent moving of BLOBs from the database to the Web server.

Note: This approach adds a maintenance overhead of having to update the links if the file path changes.

If you have a large store of images that is too large for a Web server, storing it in the SQL database as BLOBs is the right choice.
* Do you use a DataReader to read BLOBs?
If you access BLOB data, check that you use CommandBehavior.SequentialAccess in conjunction with the GetBytes, GetChars, or GetString methods to read BLOB in chunks.
* Do you read or write BLOBs to SQL Server database?
Ensure that you use READTEXT and UPDATETEXT to read and write large BLOBs to a SQL Server database. Use READTEXT to read text, ntext, varchar, varbinary, or image values. This enables you to read the data in chunks to improve performance. Use UPDATETEXT to write data in chunks.
However, if you "BLOB" an item that is relatively small, you can consider reading it in a statement or operation rather than in chunks. This depends on your network bandwidth and workload.
* Do you read or write BLOBs to an Oracle database?
Ensure that you use the System.Data.OracleClient.OracleLob class to read and write BLOBs to an Oracle database. The Read and Write methods provide the flexibility of reading and writing the data in chunks.

More Information
For more information about the questions and issues raised in this section, see "Binary Large Objects (BLOBS)" in Chapter 12, "Improving ADO.NET Performance."
Do You Page Through Data?
Use the following review questions to review your code's use of paging records:
* Do you page data based on user query (such as results of a search query)?
If you need to page through a large amount of data based on user queries, consider using SELECT TOP along with the table data type in your stored procedures. For more information, see “How To: Page Records in .NET Applications” in the "How To" section of this guide.
* Do you page through data which is mostly static over a period of time?
If you need to page through large amounts of data that is same for all users and is mostly static, consider using SELECT TOP along with the global temptable in your stored procedures. If you take this approach, ensure you have a policy in place to manage factors, such as refreshing the temp table with current data. For more information refer to “How To: Page Records in .NET Applications”.

More Information
For more information about the issues raised in this section, see Chapter 12, "Improving ADO.NET Performance."
Summary
Performance and scalability code reviews are similar to regular code reviews or inspections, except that the focus is on the identification of coding flaws that can lead to reduced performance and scalability.
This chapter has shown how to review managed code for top performance and scalability issues. It has also shown you how to identify other more subtle flaws that can lead to performance and scalability issues.
Performance and scalability code reviews are not a panacea. However, they can be very effective and should be a regular milestone in the development life cycle.
Additional Resource
For more information, see the following resources:
* Chapter 4 "Architecture and Design Review of a .NET Application for Performance and Scalability."
* Chapter 6, "Improving ASP.NET Performance."
* Chapter 7, "Improving Interop Performance."
* Chapter 8, "Improving Enterprise Services Performance."
* Chapter 9, "Improving XML Performance."
* Chapter 10, "Improving Web Services Performance."
* Chapter 11, "Improving Remoting Performance."
* Chapter 12, "Improving ADO.NET Performance."
For printable checklists, see the following checklists in the "Checklists” section of this guide:
* "Checklist: ASP.NET Performance."
* "Checklist: Managed Code Performance."
* "Checklist: Enterprise Services Performance."
* "Checklist: Interop Performance."
* "Checklist: Remoting Performance."
* "Checklist: Web Services Performance."
* "Checklist: XML Performance."
For further reading, see the following resource:
* For more information about designing for performance, see "Performance" on MSDN at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vsent7/html/vxconPerformance.asp?frame=true.



Return to HomePage
Microsoft Communities