Sunday, March 23, 2014

Collation

A collation encodes the rules governing the proper use of characters for either a language, such as Greek or Polish, or an alphabet, such as Latin1_General (the Latin alphabet used by western European languages).
Each SQL Server collation specifies three properties:
  • The sort order to use for Unicode data types (nchar, nvarchar, and ntext). A sort order defines the sequence in which characters are sorted, and the way characters are evaluated in comparison operations.
  • The sort order to use for non-Unicode character data types (char, varchar, and text).
  • The code page used to store non-Unicode character data.
If the users of your instance of SQL Server speak multiple languages, you should pick a collation that best supports the requirements of the various languages. For example, if the users generally speak western European languages, choose the Latin1_General collation.
Collation refers to a set of rules that determine how data is sorted and compared. Character data is sorted using rules that define the correct character sequence, with options for specifying case-sensitivity, accent marks, kana character types and character width.

Case sensitivity
If A and a, B and b, etc. are treated in the same way then it is case-insensitive. A computer treats A and a differently because it uses ASCII code to differentiate the input. The ASCII value of A is 65, while a is 97. The ASCII value of B is 66 and b is 98.

Accent sensitivity
If a and á, o and ó are treated in the same way, then it is accent-insensitive. A computer treats a and á differently because it uses ASCII code for differentiating the input. The ASCII value of a is 97 and áis 225. The ASCII value of o is 111 and ó is 243.

Kana Sensitivity
When Japanese kana characters Hiragana and Katakana are treated differently, it is called Kana sensitive. 

Width sensitivity
When a single-byte character (half-width) and the same character when represented as a double-byte character (full-width) are treated differently than it is width sensitive.

You can use the COLLATE keyword in various ways and at several levels:
COLLATE on database creation You can use the COLLATE clause of the CREATE DATABASE or ALTER DATABASE statement to specify the default collation of the database. You can also specify a collation when you create a database using SQL Server Management Studio. If you do not specify a collation, the database is assigned the default collation of the instance of SQL Server.
COLLATE on table creation You can specify collations for each varchar or char column using the COLLATE clause in the CREATE TABLE or ALTER TABLE statement. You can also specify a collation when you create a table using SQL Server Management Studio. If you do not specify a collation, the column is assigned the default collation of the database.
COLLATE by casting or expression You can use the COLLATE clause to cast an expression to a certain collation. You can assign the COLLATE clause to any ORDER BY or comparison statement.
SQL Server 2005 supports more than 1,000 collation types.

No comments:

Post a Comment