At the physical level, databases contain files,
where the data is physically stored. These files are basically just typical
Microsoft Windows files. These different files can be logically grouped into filegroups
that control where they are stored on a disk. Each file contains a number
of extents, which are a 64K allocation in a database file that’s made up
of eight individual contiguous 8K pages. Finally, the page is the basic
unit of data storage in SQL Server databases. Everything that’s stored in SQL
Server is stored on pages of several types: data, index, and overflow.
Files and
Filegroups
At the top level of a SQL Server instance we have the database. The
database is comprised of one or more filegroups, which are a logical grouping
of one or more files. We can place different filegroups on different disk
drives (hopefully on a different disk drive controller) to distribute the I/O
load evenly across the available hardware. It’s possible to have multiple files
in the filegroup, in which case SQL Server allocates space across each file in
the filegroup. For best performance, it’s best to have no more files in a
filegroup than you have physical CPUs. There’s usually very little, if any,
performance benefit to using multiple filegroups if they’re all on the same
physical drive.
A filegroup contains one or more files, which are
actual operating system files. Each database has at least one primary
filegroup, whose files are called primary files (commonly suffixed as
.mdf). Each database can possibly have other secondary filegroups containing
the secondary files (commonly suffixed as .ndf), which are in any other
filegroups. Files may only be a part of a single filegroup. SQL Server proportionally
fills files by allocating extents in each filegroup equally. To place an object
in a filegroup other than the default, you need to specify the name of the
filegroup using the ON clause of the table- or index-creation statement, for
example:
CREATE TABLE <tableName>
(...) ON
<fileGroupName>
For example, if you wanted the file to start at 1GB and grow in chunks of
100MB up to 2GB, you could specify the following:
CREATE
DATABASE demonstrateFileGrowth ON
PRIMARY (
NAME = Primary1,FILENAME = 'c:\demonstrateFileGrowth_primary.mdf',
SIZE = 1GB, FILEGROWTH=100MB,
MAXSIZE=2GB)
LOG ON ( NAME
= Log1,FILENAME = 'c:\demonstrateFileGrowth_log.ldf', SIZE = 10MB)
The growth settings are fine for smaller systems,
but it’s usually better to make the files large enough so there’s no need for
them to grow. File growth can be slow and cause ugly bottlenecks when OLTP
traffic is trying to use a file that’s growing. Even with the existence of
instant file allocation, it’s still better to have the space allocated
beforehand, as you then have cordoned off the space ahead of time. You can
query the sys.filegroups catalog view to view the files in the newly created
database.
SELECT
fg.name as file_group,
df.name
as file_logical_name,
df.physical_name
as physical_file_name
FROM
sys.filegroups fg
join
sys.database_files df
on
fg.data_space_id = df.data_space_id
Extents and Pages
Files are further broken down into a number of extents,
each consisting of eight separate 8K pages where tables, indexes, and so on are
physically stored. SQL Server only allocates space in a database to extents.
When files grow, you will notice that the size of files will be incremented
only in 64K increments. Each extent in turn has eight pages that hold one
specific type of data each:
·
Data: Table data.
·
Index: Index data.
·
Overflow data: Used when a row is greater than 8,060 bytes, or for varchar(max),
varbinary(max), text, or image values.
·
Allocation map: Information about the allocation of extents.
·
Page free space: Information about what different pages are allocated for.
·
Index allocation: Information about extents used for table or index data.
·
Bulk changed map: Extents modified by a bulk INSERT operation.
·
Differential changed map: Extents that have changed since the last database backup command. This
is used to support differential backups.
In larger databases, most every extent will contain
just one type of page, but in smaller databases, SQL Server can place any kind
of page in the same extent. When all data is of the same type, it’s known as a uniform
extent. When pages are of various types, it’s referred to as a mixed extent.
SQL Server places all table data in pages, with a header that contains metadata
about the page (object ID of the owner, type of page, and so on), as well as
the rows of data. At the end of the page are the offset values that tell the
relational engine where the rows start. The header of the page contains
identification values such as the page number, the object ID of the object the
data is for, compression information, and so on. The data rows hold the actual
data, Finally, there’s an allocation block that has the offsets/pointers to the
row data.
Data on Pages
When you get down to the row level, the data is laid
out with metadata, fixed length fields, and variable length fields. The
metadata describes the row, gives information about the variable length fields,
and so on. Generally speaking, since data is dealt with by the query processor
at the page level, even if only a single page is needed, data can be accessed
very rapidly no matter the exact physical representation.
The maximum amount of data that can be placed on a
single page (including overhead from variable fields) is 8,060 bytes. When a
data row grows larger than 8,060 bytes, the data in variable length columns can
spill out onto an overflow page. A 16-byte pointer is left on the original page
and points to the page where the overflow data is placed.
No comments:
Post a Comment