Tuesday, March 24, 2015

More about statistics

  Cardinality estimates are a prediction of the number of rows in the query result. The query optimizer uses these estimates to choose a plan for executing the query. The quality of the query plan has a direct impact on improving query performance. To improve cardinality estimates for expressions, follow these guidelines:
  •  Whenever possible, simplify expressions with constants in them. The query optimizer does not evaluate all functions and expressions containing constants prior to determining cardinality estimates. For example, simplify the expression ABS(-100) to 100.
  •  If the query predicate uses a local variable, consider rewriting the query to use a parameter instead of a local variable. The value of a local variable is not known when the query optimizer creates the query execution plan. When a query uses a parameter, the query optimizer uses the cardinality estimate for the first actual parameter value that is passed to the stored procedure.
  • Consider using a standard table or temporary table to hold the results of multi-statement table-valued functions. The query optimizer does not create statistics for multi-statement table-valued functions. With this approach the query optimizer can create statistics on the table columns and use them to create a better query plan.
  • Consider using a standard table or temporary table as a replacement for table variables. The query optimizer does not create statistics for table variables. With this approach the query optimizer can create statistics on the table columns and use them to create a better query plan.      
  • If a stored procedure contains a query that uses a passed-in parameter, avoid changing the parameter value within the stored procedure before using it in the query. The cardinality estimates for the query are based on the passed-in parameter value and not the updated value. To avoid changing the parameter value, you can rewrite the query to use two stored procedures.
As data changes in your tables, the statistics - all the statistics - will be updated based on the following formula:
·         When a table with no rows gets a row
·         When 500 rows are changed to a table that is less than 500 rows
·         When 20% + 500 are changed in a table greater than 500 rows
By ‘change’ we mean if a row is inserted, updated or deleted.

The statistics themselves are stored within your database in a series of internal tables that include sysindexes. You can view some of the information about them using the system views sys.stats and sys.indexes, but the most detail is gleaned using a function, DBCC SHOW_STATISTICS.

Just remember that, when you create an index, part of the task of creating that index is to create the statistics using a full scan of the data. Since a rebuild of an index is effectively a ‘drop and recreate’ of the index, the statistics are also recreated. If you are rebuilding indexes, I would not recommend updating the statistics since this is extra work for a less effective statistic.

No comments:

Post a Comment