Saturday, September 10, 2011

Optimization



Indexes A good indexing strategy can make the difference between queries that run for hours and queries that runs for seconds or minutes. An index that allows an UPDATE or DELETE operation to locate the affected row more quickly generally yields more of a performance gain than overhead to maintain the index.                                               
                                                In the case of reporting, it may be beneficial to create an index on the fly for the report, and the drop the index when you are done. Be careful not to over use this plan.

Statistics sets enable the query optimizer to develop query execution plan with greater accuracy based on the data distribution. These statistics more useful when they are created for the additional columns in an aggregate index. Because Sql server keeps statistics for the first column in an index by default. Don’t forget to run sp_createstats with INDEX ONLY parameter to create statistics for all columns contained in any index in the database.
                   These statistic sets require less storage and maintenance than indexes. The statistic sets are more beneficial to ad hoc queries, but do not appear to be considered by stored procedures.

Covering Indexes can be used to speed up the performance of a lookup query. A covering index includes every column referenced for a particular table in a query. If all columns in the query are contained in a single index, the index becomes a covering index for the query. This type of index allows Sql server to retrieve all required information from the index without reading the data pages. The savings in IO can be significant, especially for wide tables.
                             This technique is best used on the data that is seldom changed or does not change at all.  

Inequality try to avoid the use of NOT in the search conditions wherever possible. The NOT operator will prevent the query optimizer from being able to correctly utilize indexes for the fields referenced after this operator. The use of NOT NULL does not seem to suffer from this problem.

Optimizer hints are used to override the default behavior of the query optimizer. These hints can be used to specify the quality of locking in a batch to provide transaction isolation level required by the application. These hints also play an important role in eliminating unnecessary concurrency problems by eliminating locks when they are not needed.

Nulls allowing nulls in an index lower the chances of that index being selected as useful by the query optimizer. In most cases, it is better to remove nulls from aggregates. It may be more efficient to specify WHERE FIELD IS NOT NULL as part of the where clause. It is easy to eliminate the null values by using the COALESCE function.

Pseudo normalization and temporary normalization this is generally effective when the data that you are modifying is accessed multiple times in the batch or process. You can pull few fields of a wide table which are being accessed by a query into a temp table, and can create covering index. So the row size is smaller than the original table and rows-per-page will be high.
                             It might be beneficial to create a denormalized dataset for reporting or certain batch operations if the data is referenced several times. These approaches would eliminate the need to perform a complex join repeatedly. This would have only few and needed columns.

Joins wherever possible, you should always attempt to join tables on unique keys. As a general rule, the binary data type will yield a minimum of 15-20 percent better performance on join operations. When performing joins using char or varchar columns, keep the column definitions identical between the tables.
                                                                             One technique that can helpful is to add an empty string to the column that allows null value. Turning off the CONCAT_NULL_YIELDS_NULL option for the connection is required.

Union Vs Union All if you know result set will not have duplicates; there is a huge speed benefit of using the UNION ALL operator. It will not scan all the rows for duplicates.

Group by and Order By operations can require the creation of temp tables during query processing. Ordering a query by an indexed column generally not required a temp table, but a GROUP BY operation always needs to create a temp table to perform aggregation before returning a result set. You should also consider using the DISTINCT operator instead of group by when you are eliminating duplicates but do not need to create any aggregate results.

Insert, Update, and Delete it may be possible to drop the indexes before a large insert operation and recreate them afterward. This option would normally only be reasonable for very large batch inserting done outside of normal business hours.
                                                                   Update and delete operations that use a where clause often benefit more from indexing despite the overhead.

Coalesce function can be used in joins and in the where clause to alter query plan, which often improves performance if your query needs to check for NULL values. Eliminating the additional null check by using coalesce can be especially important when you are working with a system that contains rates or a range of effective dates.

No comments:

Post a Comment