The
leaf level of a clustered index is implemented as a doubly-linked list. This
means that every leaf level of the index includes a pointer to both the
previous and the next page in the index. The data will be stored in the logical
order dictated by the clustering key and new data will be added in its correct
place on the relevant page. So, assuming there is room on that page, the
logical order will match the physical order.
The pointers in the doubly-linked list make it easy to scan ranges of data and, when the data is ordered as described above, this becomes a very efficient operation. In turn, non-clustered indexes on a table with a clustered index, are implemented as a doubly-linked list, and have the same benefits (by contrast, indexes on heaps are not implemented this way, and have no knowledge of which page to scan next).
As data is inserted, deleted, and updated, index fragmentation can occur. When data is deleted, gaps appear in data pages that create wasted empty space. When data is updated or added on a page that has become full, a page split will occur and one of the split pages will be stored "at the end," and so out of logical sequence. Now we have a logical ordering of the data that no longer matches the physical ordering of the data (referred to as logical fragmentation).
As a result, disk I/O is affected because the disk head must now jump all over the disk following a fragmented chain of records and pointers as opposed to a sequential listing. Wasted space reduces the number of rows that can be stored in SQL Server's data cache, which can also lead to increased disk I/O. Do keep in mind that, as a result, this only pertains to data/index pages that are not residing within memory, in the cache.
The only way to remove wasted space and restore the correct page ordering is to rebuild or reorganize the indexes on a regular basis. Rebuilding an index is the process by which the existing index is dropped and recreated.
No comments:
Post a Comment