Knowing SQL Server: Accelerated Database Recovery

When a server crashes, we must bring the database back online quickly. It could take seconds, minutes, or even longer. Accelerated Database Recovery (ADR) can assist with this.

Why was Accelerated Database Recovery created?

Because commit or rollback a long-running transaction takes a lot of time. Since SQL Server ensures data consistency, you cannot get rid of transactions that are pending to commit or pending to rollback; they must be done at any cost .

Starting with SQL Server 2019, Accelerated Database Recovery (ADR) allows the recovery of a database in a constant time regardless of the database size, so long-running transactions are no longer a problem. First, we need to know how the standard recovery process works. Here is a quick look at this:

There are three phases:

Analysis

Redo

Undo

1. Analysis Phase

It starts scanning the log file from the last checkpoint: Checkpoint is the process of flushing modified pages from memory to disk, ensuring that the pages’ log records are written to disk beforehand.

Reading the log file from left to right, the following is retrieved:

LSN (Log Sequence Number) of the oldest dirty page

List of uncommitted transactions

List of modified pages after the checkpoint.

2. Redo Phase

Starts locating the oldest active transaction LSN. Information about the oldest active transaction is saved in the checkpoint log record.

Committed transactions are persisted to disk (data file) using the list of modified pages after the checkpoint. This phase also acquires locks of operations already written to disk before the recovery process. Your database will be available for queries at the end of this phase.

3. Undo Phase

Now, we have the pages already locked by the Redo phase and the list of uncommitted transactions: This phase scans the log in reverse and undoes (rollback) uncommitted transactions: Any user that queries the pages of these transactions is blocked until the transaction is completely undone. At the end of this phase, your database should be fully recovered 

How does Accelerated Database Recovery solve the problem of long-running transactions?

By keeping the versions of rows every time a transaction modifies them.

The phases for ADR are the same as for the Standard Recovery, with some differences:

Pages are not locked in the Redo phase unless the database is a secondary replica or there are unresolved distributed transactions.

IN ADR  process it follows same three steps but each step get divided in sub phases

Analysis phase

Sub phase 1 -     Read the all transactions from log file

Sub phase 2 -     Of reconstructing the SLOG (system log stream) and copying log records for non-versioned operations.

Redo phase

Sub phase 1-       Redo from SLOG (oldest uncommitted transaction up to last checkpoint). Redo is a fast operation as it only needs to process a few records from the SLOG.

Sub phase 2 -     Redo from transaction log starts from last checkpoint (instead of oldest uncommitted transaction).

Undo phase - use SLOG to undo non-versioned operations and persisted   version store (PVS) with logical revert to perform row level version-based undo.

Additionally, these new data structures support ADR:

Persisted Version Store (PVS)

Secondary Log Stream (Slog)

Aborted Transaction Map (ATM)

The phases of ADR may look more complex than Standard recovery, but they are worth it for gaining faster recoveries, so let’s explore each phase.

1. Analysis phase

Locates the last checkpoint in the transaction log:

Then, these data structures are reconstructed:

SLog

Reconstructing Slog

Secondary Log Stream (Slog) is retrieved from the transaction log and reconstructed in memory:

What is Slog?

It is a data structure that keeps track of the non-versioned operations of SQL Server, such as:

Metadata about page allocation.

Data Definition Language statements such as CREATE, ALTER.

Locks over tables and indexes.

Slog data is initially stored in memory but is saved to the transaction log every time a checkpoint occurs.

Reconstructing ATM

The aborted Transaction Map (ATM) is reconstructed in memory from the transaction log:

Aborted transactions after the last checkpoint are also added to the ATM. The ATM is saved to the transaction log at every checkpoint.

2. Redo Phase

It performs the following:

Redo operations from Slog. It uses the Slog to flush the non-versioned operations to disk:

Recover the Persistent Version Store (PVS)

The number of operations should be minimal, so this operation ends almost instantly.

The portion between the oldest active transaction and the last checkpoint is redone using the Slog that is already in memory:

No data is flushed to disk as only logical operations, such as acquiring coarse-grained locks, are done.

Recover the Persistent Version Store (PVS)

The Persistent Version Store data is recovered from the last checkpoint and sent to the user database (disk). The data rows are also recovered in the Redo phase along with the PVS.

Only the log records after the last checkpoint need to be recovered. Therefore, this operation can be completed almost constantly due to regular checkpoints done by the database engine.

What is the PVS?

It is a data structure that stores the versions of each row that a transaction modifies. This also raises a performance issue because a query has to traverse the version chain in the PVS, which is an expensive task. If the modified version is small enough, the earlier version is stored together with the current version on the same page.

3. Undo Phase

It does these:

Undo operations from Slog. It rolls back all non-versioned operations using the Slog: Generally, only a few non-versioned operations must be undone, so this phase ends almost instantly.

Mark uncommitted transactions All uncommitted transactions are marked as ABORTED and added to the Aborted Transaction Map (ATM):

ADR looks impressive; what is the catch?

There is not a catch. However, there are some caveats:

ADR is not recommended when there are several long-running transactions as they increase the space required for PVS.

It is unsuitable for databases with a high transaction abort rate because it puts a lot of pressure on the cleaning process.

Disabling ADR is not supported in Azure SQL and Azure SQL Managed Instance.

You will need more space in the database to store the PVS.

ADR is turned off by default in SQL Server 2019.

Knowing SQL Server

Saturday, January 4, 2025

Accelerated Database Recovery

No comments:

Post a Comment

Blog Archive