Monday, September 28, 2015

Checkpoint and Lazy Writer

(Source: sqlxdetails.com)

CHECKPOINT‘s goal is to shorten database recovery time (e.g. after a server crash), therefore increasing availability. It makes data files not lagging too much behind the transaction log. Recovery of every single database starts at the last checkpoint and rolls-forward all the transaction log records from that point on. If checkpoint was long time ago, it could be that amount of log to roll forward is huge and it can take considerable time. During that time your database is unavailable, affecting availability (you have longer downtime). More frequent checkpoints means less amount of log needed to roll-forward, resulting in faster recovery. Too frequent checkpoints make buffer pool caching less efficient and can negatively affect performance. E.g. you don’t want to checkpoint every second. Frequency of automatic checkpoint is determined by recovery interval option and the log generation rate, and usually is around every 1 minute. Some TSQL commands also do a checkpoint inside, like BACKUP DATABASE command for example. Such checkpoints are called “Internal”. And we can do a manual checkpoint by invoking a CHECKPOINT tsql command.
LAZY WRITER purpose is to release the buffer pool memory (for pages cached in buffer pool) when memory pressure occurs. When more memory is needed (e.g. for bringing in new pages to the cache), lazy writer responds to a memory pressure releasing the “coldest” pages from the buffer pool, and makes more memory available for new pages to come in. And that is normal to see that lazy writer occasionally “makes a room”. But if lazy writer constantly has a lots of work to do, starting to purge pages that are not old at all (you see ‘Buffer Node:Page Life Expectancy’ perfmon counter stays below DataCacheSizeInGB/4GB *300) – you have a problem with buffer cache memory. You do not want pages flow though the buffer cache like a rapid river. You want them to stay there and be reused, read/written and rewritten again in-memory and not the disk which is slow compared to memory, as long as possible. Low page life expectancy nullifies buffer caching purpose, affecting performance.
They both write in-memory pages to the data files on the disk. But Which pages, when, and do they release memory or not – there is the difference!
CHECKPOINT writes only dirty pages pages to the disk (dirty = changed in memory since the last checkpoint, not yet written/checkpointed to disk), making them “clean”. Checkpoint does not release any memory, the pages STAY in memory, they are not removed from the buffer pool!
LAZY WRITER looks for least recently used (“cold” = least recently read or written, not accessed in recent time) pages in the buffer pool, and releases the memory taken by them. Lazy writer releases both dirty and clean pages. Clean pages can be released without writing to disk, but dirty pages must first be written to the disk (“flushed” to the disk and become “clean”) and then buffer pool memory can be released. So, total number of pages that lazy writer releases can be higher than the number of pages lazy writer writes to the data files, because “clean” pages can be released without writing them to disk. The final result of the lazy writer is less buffer pool memory used, therefore more memory available for the fresh pages in the buffer pool.

There is another difference: checkpoint process is more efficient in writing to the disk because it can group subsequent pages into larger disk IOs, e.g. 128KB IO size. It internally usesWriteFileGather Windows API function. For details, see here and here. Lazy writer can only write 8K pages. Therefore checkpoint disk throughput is much better than lazy writer’s.

No comments:

Post a Comment