Friday, September 25, 2009

Collation in SQL Server

“SQL Server is not case sensitive” ….. I am sure DBA’s have heard this phrase uttered on many an occasion. The truth is ….that depends!

The case sensitivity among other things is determined in SQL Server by the collation setting. The collation assigned in SQL Server determines the sorting rules, case, and accent sensitivity of the data residing inside a SQL Server database. This property is defined at the server, database, column, and expression level. Each level defaults to the setting of the parent if not explicitly defined.

Specifying the collation setting in SQL Server determines which codepage is utilized to represent non-unicode character data. A codepage is a set of characters that represent different languages and locales. When installing SQL Server, the machine locale setting determines the default collation setting used by the SQL Server being installed. If the locale setting for the machine is set to “English (United States)” the default collation setting is SQL_Latin1_General_CP1_CI_AS which is case insensitive. Not changing this during install results in the server collation setting being case insensitive.

Collation names are comprised using the following naming standard.
SQL_SortRule_pref_CodePage_ {CaseSensitivity_AccentSensitivity | BIN}

SortRule - identifies the language to be used for sorting

CodePage - identifies the code page

CaseSensitivity - either CI or CS for Case Insensitive or Case Sensitive

AccesntSensitivity - either AI or AS for Accent Insensitive or Accent Sensitive

BIN – designates that binary sort order is to be used.

A list of available collations for SQL Server 2008 can be found here.

Collation is not just defined for the SQL Server instance however. One can control collations to a finer degree. The T-SQL command COLLATE will apply the selected collation to a database during the execution of CREATE DATABASE command. The following command will result in a new database with using the “English (United States)” code page with case sensitivity.

CREATE DATABASE TestDB_2 COLLATE SQL_Latin1_General_Cp1_CS_AS

COLLATE can also be applied in the column definition of CREATE and ALTER Table. The following example will create a table “MOVIES” with the titles being case insensitive and accent sensitive.

CREATE TABLE MOVIES
(Mov_ID INT,
MOV_TITLE VARCHAR(1024) COLLATE SQL_Latin1_General_Cp1_CI_AS NOT NULL,
MOV_RELEASE_DATE DATE
)

Since the TestDB_2 database was created using a case sensitive collation the object names in the database then also become case sensitive. So after creating the table “MOVIES” the following statement will fail.

DROP TABLE dbo.movies;

And Lastly, COLLATE can also be applied to a string expression. This will result in a cast of the result into the specified collation.

So, “SQL Server is not case sensitive”…. Just depends.

No comments:

Post a Comment