Sunday, August 18, 2013

SQL Server 2008 New features




1) Data Compression
SQL Server 2008's data compression enables you to compress data stored in the database. This reduces storage requirements and can actually improve the performance of workloads that have high I/O requirements. SQL Server 2008 also supports compressing backups.
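A minimal sketch of both features, assuming a hypothetical Sales.OrderDetails table and an AdventureWorks database:

```sql
-- Row or page compression is set per table (or per index/partition).
ALTER TABLE Sales.OrderDetails
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Backup compression is requested per backup (or via a server-level default).
BACKUP DATABASE AdventureWorks
TO DISK = 'C:\Backups\AdventureWorks.bak'
WITH COMPRESSION;
```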

2) User-Defined Table Types (Table-Valued Parameters)
The Database Engine introduces a new user-defined table type that supports representing table structures for use as parameters in stored procedures and functions, or in a batch or the body of a stored procedure or function. You can create unique constraints and primary keys on user-defined table types.
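A short sketch with hypothetical type, procedure, and table names; note that table-valued parameters must be declared READONLY:

```sql
-- A reusable table type with a primary key constraint.
CREATE TYPE dbo.OrderItemType AS TABLE
(
    ProductID int NOT NULL PRIMARY KEY,
    Quantity  int NOT NULL
);
GO

-- A procedure that accepts the table type as a parameter.
CREATE PROCEDURE dbo.InsertOrderItems
    @Items dbo.OrderItemType READONLY  -- TVPs are always read-only
AS
BEGIN
    INSERT INTO dbo.OrderItems (ProductID, Quantity)  -- hypothetical target table
    SELECT ProductID, Quantity FROM @Items;
END
```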

3) Merge
This new Transact-SQL statement performs INSERT, UPDATE, or DELETE operations on a target table based on the results of a join with a source table. The syntax allows you to join a data source with a target table or view, and then perform multiple actions based on the results of that join.
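A sketch of the statement with hypothetical source and target tables, showing all three actions driven by one join:

```sql
MERGE dbo.Customers AS target
USING dbo.CustomerUpdates AS source
    ON target.CustomerID = source.CustomerID
WHEN MATCHED THEN                      -- row exists in both: update it
    UPDATE SET target.Phone = source.Phone
WHEN NOT MATCHED BY TARGET THEN        -- row only in source: insert it
    INSERT (CustomerID, Phone)
    VALUES (source.CustomerID, source.Phone)
WHEN NOT MATCHED BY SOURCE THEN        -- row only in target: delete it
    DELETE;                            -- MERGE must end with a semicolon
```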

4) Spatial Data Types
SQL Server 2008 introduces two spatial data types: geometry and geography. The geometry data type supports planar, or Euclidean (flat-earth), data. The geography data type stores ellipsoidal (round-earth) data, such as GPS latitude and longitude coordinates. These new data types support the storage and manipulation of spatial data objects such as linestrings, points, and polygons.
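A small example with the geography type; the coordinates below are illustrative (WKT order is longitude then latitude, and 4326 is the WGS 84 SRID):

```sql
DECLARE @seattle  geography = geography::STPointFromText('POINT(-122.335 47.608)', 4326);
DECLARE @portland geography = geography::STPointFromText('POINT(-122.676 45.523)', 4326);

-- For the geography type, STDistance returns meters.
SELECT @seattle.STDistance(@portland) AS DistanceInMeters;
```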

5) hierarchyid
SQL Server 2008 introduces a new system-provided data type to encapsulate hierarchical relationships. Use hierarchyid as a data type to create tables with a hierarchical structure or to reference the hierarchical structure of data in another location. Use hierarchical methods to query and perform work with hierarchical data by using Transact-SQL.
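A sketch of a hypothetical org chart stored with hierarchyid, using its built-in methods:

```sql
CREATE TABLE dbo.Employees
(
    OrgNode hierarchyid PRIMARY KEY,  -- encodes each node's position in the tree
    Name    nvarchar(100) NOT NULL
);

INSERT INTO dbo.Employees VALUES
    (hierarchyid::GetRoot(),      N'CEO'),
    (hierarchyid::Parse('/1/'),   N'VP Sales'),
    (hierarchyid::Parse('/1/1/'), N'Account Manager');

-- ToString() shows the path; GetLevel() returns the depth of each node.
SELECT OrgNode.ToString() AS Path, Name, OrgNode.GetLevel() AS Level
FROM dbo.Employees;
```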

SQL Server 2012 New features



1) Sequence Objects
CREATE SEQUENCE [schema_name . ] sequence_name
[ AS [ built_in_integer_type | user-defined_integer_type ] ]
[ START WITH <constant> ]
[ INCREMENT BY <constant> ]
[ {MINVALUE [ <constant> ] } | { NO MINVALUE } ]
[ {MAXVALUE [ <constant> ] } | { NO MAXVALUE } ]
[ CYCLE | { NO CYCLE } ]
[ { CACHE [ <constant> ] } | { NO CACHE } ]
    [ ; ]
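The syntax above can be put to use with a concrete sequence (hypothetical name), consumed via NEXT VALUE FOR:

```sql
CREATE SEQUENCE dbo.OrderNumbers
    AS int
    START WITH 1000
    INCREMENT BY 1;

SELECT NEXT VALUE FOR dbo.OrderNumbers AS OrderNumber;  -- 1000
SELECT NEXT VALUE FOR dbo.OrderNumbers AS OrderNumber;  -- 1001
```

Unlike an IDENTITY column, a sequence is not tied to any single table, so the same number series can feed several tables.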

2) OFFSET / FETCH (paging)
SELECT *
FROM Customers
ORDER BY CustomerID
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY;
-- skips the first 10 rows and fetches the next 10

3) THROW
BEGIN TRY
    BEGIN TRANSACTION; -- Start the transaction

    -- Delete the customer
    DELETE FROM Customers
    WHERE CustomerID = 'CACTU';

    -- Commit the change
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- There was an error
    ROLLBACK TRANSACTION;

    -- Re-throw the exception
    THROW;
END CATCH

Also, you can use THROW in place of the RAISERROR function:

THROW 51000, 'The record does not exist.', 1;


4) EXECUTE WITH RESULT SETS
EXEC CustOrderDetail '2'
WITH RESULT SETS
(
    (
        ProductName1   varchar(100),
        Unitprice1     varchar(100),
        Quantity1      varchar(100),
        Discount1      varchar(100),
        ExtendedPrice1 varchar(100)
    )
);


5) Contained Databases
In SQL Server 2012, we have a new feature called Contained Databases, which aims to reduce or eliminate the dependencies that a database has on the SQL Server instance, making it easier to migrate a database to a new instance with less of the work involved in reproducing and validating those dependencies. In SQL Server 2012, tempdb will automatically create objects using the collation of the contained database, rather than the server default, making it easier to rely on your code regardless of the server collation (of course, this still does not resolve issues where you try to join #temp tables generated from multiple databases with different collations).
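A minimal sketch of creating one (the database name is hypothetical; PARTIAL is the only containment level available in 2012):

```sql
-- Containment must first be enabled at the instance level.
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;

CREATE DATABASE SalesApp
CONTAINMENT = PARTIAL;
```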


6) ColumnStore Indexes
In SQL Server 2012, the relational engine gets to take advantage of the column-based index first seen through the VertiPaq acquisition a few years ago. What a ColumnStore index does is essentially turn a traditional index on its side. Think of a covering index that includes multiple columns, but instead of storing all the different columns of a row together on a single page, it splits them up and stores each column on its own set of pages. In this version, ColumnStore indexes have a major limitation: they are read-only. Once a ColumnStore index has been created on a table, you can no longer perform any DML operations against that table.
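A sketch on a hypothetical fact table, including the disable/rebuild dance the read-only limitation forces on data loads:

```sql
-- A columnstore index typically covers the columns used in aggregation queries.
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_CS
ON dbo.FactSales (OrderDateKey, ProductKey, SalesAmount);

-- The table is read-only while the index is enabled; to load data,
-- disable the index, modify the table, then rebuild:
ALTER INDEX IX_FactSales_CS ON dbo.FactSales DISABLE;
-- ... perform INSERT/UPDATE/DELETE ...
ALTER INDEX IX_FactSales_CS ON dbo.FactSales REBUILD;
```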


7) TRY_CONVERT
SELECT
    CASE WHEN TRY_CONVERT(float, 'test') IS NULL
         THEN 'Cast failed'
         ELSE 'Cast succeeded'
    END AS Result;
GO

Statistics



SQL Server tries to use statistics to "react intelligently" in its query optimization. Knowing the number of records, the density of pages, the histogram, and the available indexes helps the SQL Server optimizer "guess" more accurately how it can best retrieve data. The query optimizer uses statistics to estimate how many rows will be returned by a query plan.

With no statistics to show how the data is distributed, the optimizer has no way to compare the efficiency of different plans and so will frequently be forced to simply scan the table or index. Data is measured two different ways within a single set of statistics: by density and by distribution. Density is a ratio that shows how many unique values there are within a given column, or set of columns. The formula is quite simple:

Density = 1 / Number of distinct values for column(s)

A high density (low selectivity, few unique values) will be of less use to the optimizer because it might not be the most efficient way of getting at your data. For example, suppose you have a bit column that answers a true-or-false question such as "has the customer signed up for your mailing list?" For a million rows, you are only ever going to see one of two values. Using an index or statistics to find data within the table based on those two values is going to result in scans, whereas more selective data, such as an email address, will result in more efficient data access.
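The density formula can be checked directly against the data; a sketch with a hypothetical Customers table, contrasting a near-unique column with a bit flag:

```sql
-- Density for a single column = 1 / (number of distinct values).
SELECT
    1.0 / COUNT(DISTINCT Email)        AS EmailDensity,      -- low density: highly selective
    1.0 / COUNT(DISTINCT IsSubscribed) AS SubscribedDensity  -- at best 0.5: poor selectivity
FROM dbo.Customers;
```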

There are 3 types of statistics in a SQL Server database.

1)      Statistics created due to index creation. These have the same name as the index.
2)      Statistics created by the optimizer (column statistics). Their names start with _WA_.
3)      User-defined statistics created by the DBA with the CREATE STATISTICS command.
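All three kinds can be inspected in the sys.stats catalog view; a sketch with a hypothetical table name:

```sql
-- List all statistics on a table; _WA_ names are auto-created column statistics.
SELECT
    s.name         AS StatisticName,
    s.auto_created AS AutoCreated,   -- 1 for optimizer-created (_WA_) statistics
    s.user_created AS UserCreated    -- 1 for CREATE STATISTICS statistics
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.Customers');

-- Creating user-defined statistics on a column:
CREATE STATISTICS st_Customers_City
ON dbo.Customers (City);
```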

 Facts about statistics:

·         Index statistics are always updated with a full scan when we rebuild the index. (The only exception is in SQL Server 2012: for a partitioned index with more than 1,000 partitions, default sampling is used.)
·         Column statistics are not updated by index rebuilds; they are only updated either by the database's auto-update-statistics option or when we manually run the UPDATE STATISTICS command.
·         If we do not specify a sampling rate when running the UPDATE STATISTICS command, it uses the default sampling rate. See below for how the default sampling-rate algorithm works.
·         If you run UPDATE STATISTICS with just the table name, it updates all statistics on the table, including the index statistics. You may therefore lose the full-scan advantage you get for index statistics if you update statistics with a sampling rate after rebuilding indexes. (Be very careful with this step during your maintenance windows.)
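The points above can be sketched as a maintenance sequence (hypothetical index and table names):

```sql
-- Rebuilding an index updates its statistics with a full scan:
ALTER INDEX IX_Customers_City ON dbo.Customers REBUILD;

-- Updating only the column statistics afterwards, with a full scan,
-- leaves the index statistics from the rebuild untouched:
UPDATE STATISTICS dbo.Customers WITH FULLSCAN, COLUMNS;

-- Caution: with just the table name, ALL statistics (including the
-- index statistics) are resampled, which can undo the rebuild's full scan:
UPDATE STATISTICS dbo.Customers;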

An index is a physically implemented structure in the database (you can read more in BOL on clustered and non-clustered indexes), whereas statistics are a set of values that help the optimizer decide, during execution-plan formation, whether to use an index or not. It is not a one-to-one relationship between indexes and statistics: all indexes have statistics, but one can have statistics without an index, and statistics that are not associated with an index can also help in forming the right execution plan. An index helps the optimizer find the data during the execution of statements, and statistics help the optimizer determine which indexes to use.

Statistics essentially contain two pieces of information:

·         A histogram which contains a sampling of the data values from the index and the distribution of the values in the ranges of data.
·         Density groups (collections) over the column (or number of columns) of a table or an indexed view.
Density essentially reflects the uniqueness of the values in a particular column.

The density is just a fraction which indicates how many records will be returned when a query is made for a given value; for a primary-key column, each value returns exactly one record. It does not, however, give us any information about the ranges, i.e. the values that may appear more or less often than the average N times. That information is provided by the histogram.

Another important thing to note here is that, in the case of multiple columns in an index, the histogram is available only on the leading column of the index, i.e. the first column. A density value, however, is available for each leading-column-based subset. For example, if I create a non-clustered index on (LastName, FirstName) on a table whose clustered primary key is EmployeeID, then I would have a density value for each of these:

a) LastName => for every LastName value, how many records qualify?
b) LastName, FirstName => for every combination of LastName and FirstName, how many records qualify?
c) LastName, FirstName, EmployeeID => this includes the clustered-index key column (the primary key), since what we created was a non-clustered index.
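Both the density vector and the histogram can be examined with DBCC SHOW_STATISTICS; a sketch assuming a hypothetical index named IX_Employees_Name on dbo.Employees (LastName, FirstName):

```sql
DBCC SHOW_STATISTICS ('dbo.Employees', 'IX_Employees_Name');

-- Returns three result sets:
--   1. the statistics header (rows, rows sampled, last updated, ...)
--   2. the density vector: one row per leading-column subset, as in a)-c) above
--   3. the histogram, built on the leading column only (LastName)
```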

When the automatic update statistics option, AUTO_UPDATE_STATISTICS, is on, the query optimizer determines when statistics might be out-of-date and then updates them when they are used by a query. Statistics become
out-of-date after insert, update, delete, or merge operations change the data distribution in the table or indexed view. The query optimizer determines when statistics might be out-of-date by counting the number of data modifications since the last statistics update and comparing the number of modifications to a threshold.

The query optimizer checks for out-of-date statistics before compiling a query and before executing a cached query plan. Before compiling a query, the query optimizer uses the columns, tables, and indexed views in the query predicate to determine which statistics might be out-of-date. Before executing a cached query plan, the Database Engine verifies that the query plan references up-to-date statistics.

Consider updating statistics after performing maintenance procedures that change the distribution of data, such as truncating a table or performing a bulk insert of a large percentage of the rows. This can avoid future delays in query processing while queries wait for automatic statistics updates. The query optimizer also updates statistics when you rebuild an index on a table or view with ALTER INDEX REBUILD or DBCC DBREINDEX. To judge whether statistics are current, check the "Actual Number of Rows" and "Estimated Number of Rows" in the query plan.

If these numbers are (consistently) fairly close, then most likely your statistics are up-to-date and used by the optimizer for the query. If not, time for you to re-check your statistics create/update frequency.
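A quick way to check for stale statistics (hypothetical table name) is the STATS_DATE function:

```sql
-- When were the statistics on a table last updated?
SELECT
    s.name                              AS StatisticName,
    STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.Customers');

-- Force a refresh if they are stale:
UPDATE STATISTICS dbo.Customers WITH FULLSCAN;
```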