In SQL Server 2005 SP2, Microsoft introduced data
type-level compression, which allowed data of the decimal data type to be
stored in a variable length field, also referred to as vardecimal. In SQL
Server 2008, the concept of compression is extended even further to all of the
fixed length data types, including int, char, and float data types. Basically,
you can allow SQL Server to save space by storing your data like it was a
variable-sized type, yet in usage, the data will appear and behave like a fixed
length type.
For example, if you stored the value of 100 in an
int column, SQL Server needn’t use all 32 bits; It can simply use 8 bits (1
byte) as tinyint. Another case is when you use a char (30) column but store
only two characters; 28 characters could be saved. There is an overhead of 2
bytes per variable-length column (or 4 bits if the size of the column is less
than 8 bytes). Note that compression is only available in the Enterprise
Edition.
This data type-level compression is referred to as row
compression. Row compression is a very interesting thing for many databases
that use lots of fixed length data (for example, integers, especially for
surrogate keys).
SQL Server 2008 also includes an additional
compression capability called page compression. With page compression
(which includes row compression), the storage engine does a couple of interesting
things to compress the data on a page:
·
Prefix compression: Looks for repeated values in a value (like '0000001' and compresses the
prefix to something like 6-0 (six zeros)
·
Dictionary compression: For all values on the page, the storage engine looks for duplication and
stores the duplicated value once and then stores pointers on the data pages
where the duplicated values originally resided.
You can apply data compression to your tables and indexes with the CREATE
TABLE, ALTER TABLE, CREATE INDEX, and ALTER INDEX syntaxes.
CREATE TABLE
test
(
testId int,
value int
)
WITH
(DATA_COMPRESSION = ROW) -- PAGE or NONE
ALTER TABLE
test REBUILD WITH (DATA_COMPRESSION = PAGE) ;
Giving advice on whether to use compression is not
really possible without knowing the factors that surround your actual
situation. One tool you can use is the system procedure— sp_estimate_data_compression_savings—to
check existing data to see just how compressed the data in the table or index
would be after applying compression, but it won’t tell you how the compression will
positively or negatively affect your performance. There are trade-offs to any
sorts of compression. CPU utilization will certainly go up, because instead of
directly using the data right from the page, the query processor will have to
translate the values from the compressed format into the uncompressed format
that SQL Server will use. On the other hand, if you have a lot of data that
would benefit from compression, you could possibly lower your I/O enough to
make doing so worth the cost.
No comments:
Post a Comment