Friday, July 24, 2015

Recovery using Logging

(Based on Paul S Randal articles)              

           Logging exists to support a variety of operations in SQL Server. It allows a backup copy of the transaction log to be taken so that a database can be restored and the transaction log backups replayed to bring the database to a specific point in time with transactional consistency. And it supports features that rely on reading the transaction log, such as replication, database mirroring, and change data capture.
The majority of uses of logging involve a mechanism called recovery. Recovery is the process of having the changes described in log records replayed or reverted in the database. Replaying log records is called the REDO (or roll forward) phase of recovery. Reverting log records is called the UNDO (or roll back) phase of recovery.
               The simple form of recovery is when a single transaction is canceled, in which case it is undone and there is no net effect on the database. The most complex form is crash recovery—when SQL Server crashes (for whatever reason) and the transaction log must be recovered to bring the database to a transactionally consistent point. This means that all transactions that were committed at the time of the crash must be rolled forward to ensure their effects are persisted in the database. And all in-flight transactions that had not committed at the time of the crash must be rolled back to ensure their effects are not persisted in the database.
               So how does recovery know what to do? All recovery processes depend on the fact that each log record is stamped with a log sequence number (LSN). A log sequence number is an ever-increasing, three-part number that uniquely defines the position of a log record within the transaction log. Each log record in a transaction is stored in sequential order within the transaction log and contains the transaction ID and the LSN of the previous log record for the transaction.
               For the simple case of a single transaction being rolled back, the recovery mechanism can easily and quickly follow the chain of logged operations from the most recent operation back to the first operation and undo the effects of the operations in the opposite order from which they occurred. The database pages that were affected by the transaction are either still in the buffer pool or on disk. In either case, the image of the page that is available is guaranteed to be one where the effect of the transaction is reflected on the page and must be undone.
               During crash recovery, the mechanism is more complicated. The fact that database pages are not written to disk when a transaction commits means that there is no guarantee that the set of database pages on disk accurately reflects the set of changes described in the transaction log—either for committed or uncommitted transactions. However, all database pages have a field in their page header (a 96-byte portion of the 8192-byte page that contains metadata about the page) that contains the LSN of the last log record that affected the page. This allows the recovery system to decide what to do about a particular log record that it must recover:
  • For a log record from a committed transaction where the database page has an LSN equal to or greater than the LSN of the log record, nothing needs to be done. The effect of the log record has already been persisted on the page on disk.
  • For a log record from a committed transaction where the database page has an LSN less than the LSN of the log record, the log record must be redone to ensure the transaction effects are persisted.
  • For a log record from an uncommitted transaction where the database page has an LSN equal to or greater than the LSN of the log record, the log record must be undone to ensure the transaction effects are not persisted.
  • For a log record from an uncommitted transaction where the database page has an LSN less than the LSN of the log record, nothing needs to be done. The effect of the log record was not persisted on the page on disk and as such does not need to be undone.

               Crash recovery reads through the transaction log and ensures that all effects of all committed transactions are persisted in the database, and all effects of all uncommitted transactions are not persisted in the database.

                One of the uses of a checkpoint operation is to reduce the amount of time that crash recovery takes. By periodically flushing out all dirty pages to disk, the number of pages that have changed because of committed transactions but whose images are not on disk is reduced. This, in turn, reduces the number of pages that need to have REDO recovery applied during crash recovery.

Transaction Logs - II

(Based on Paul S Randal articles)

          The physical architecture of the transaction log, it is split internally into smaller chunks called virtual log files (or VLFs). These are simply an aid to easier internal management of the transaction log. When a VLF becomes full, logging automatically proceeds to use the next VLF in the transaction log. You might think that eventually the transaction log will run out of space, but this is where the transaction log is so different from data files. The transaction log is really a circular file—as long as the log records at the start of the transaction log have been truncated (or cleared). Then when logging reaches the end of the transaction log, it wraps around to the start again and begins overwriting what was there before.

A log record is no longer needed in the transaction log if all of the following are true:
·         The transaction of which it is part has committed.
·         The database pages it changed have all been written to disk by a checkpoint.
·         The log record is not needed for a backup (full, differential, or log).
·         The log record is not needed for any feature that reads the log (such as database mirroring or replication).

               A log record that is still needed is called active, and a VLF that has at least one active log record is also called active. If they are all inactive, the VLF is marked as truncated. When a VLF is truncated, it is not overwritten or zeroed in any way—it is just marked as truncated and can then be reused. This process is called log truncation. Log truncation never changes the physical size of the transaction log. Active VLFs make up the logical log—the portion of the transaction log that contains all the active log records. The database itself knows where crash recovery should start reading log records within the active portion of the log—the start of the oldest active transaction in the log, the MinLSN (this is stored in the database boot page).

The check whether log truncation can take place under either of the following circumstances:
·         When a checkpoint occurs in the SIMPLE recovery model or in other recovery models when a full backup has never been taken. (This implies that a database will remain in a pseudo-SIMPLE recovery model after being switched out of SIMPLE until a full database backup occurs.)
·         When a log backup completes.

               When log truncation cannot occur, the VLFs cannot be truncated and eventually the transaction log has to grow (or another transaction log file be added). Excessive transaction log growth can cause performance problems through a phenomenon known as VLF fragmentation. Removing VLF fragmentation can sometimes lead to a dramatic improvement in the performance of log-related activities.

There are two common issues that can prevent log truncation:
·         A long-running active transaction. The entire transaction log since the first log record from the oldest active transaction can never be truncated until that transaction commits or aborts.
·         Switching to the FULL recovery model, taking a full backup, and then never taking any log backups. The entire transaction log will remain active, waiting to be backed up by a log backup.


               If the transaction log does grow to capacity and cannot grow any further, then error 9002 will be reported and you will need to take steps to provide more space, such as growing the log file, adding another log file, or removing any impediment to the log being truncated.

Transaction Logs - I

(Based on Paul S Randal articles)          

          Transaction log backups are only possible in the FULL or BULK_LOGGED recovery models. A transaction log backup contains all the transaction log records generated since the last log backup (or full backup that starts a log backup chain) and is used to allow the database to be recovered to a specific point in time (usually the time right before a disaster strikes). This means they are incremental, unlike differential backups, which are cumulative. Since these are incremental, if you want to restore the database to a particular point in time, you need to have all the transaction log records necessary to replay database changes up to that point in time. These are contained in the log backup chain. A log backup chain is an unbroken series of log backups that contain all the transaction log records necessary to recover a database to a point in time. A chain starts with a full database backup, and continues until something breaks the chain, thus preventing more log backups being taken until another full (or differential) backup is taken.
               Operations that break the log backup chain include switching to the SIMPLE recovery model, reverting from a database snapshot, and forcibly clearing the log using the WITH NO_LOG or TRUNCATE_ONLY options (which are not available in SQL Server 2008). It is inadvisable to break the log backup chain, as it forces another (potentially large) full backup to be taken. Log backups are also required to help manage the size of the transaction log. In the FULL or BULK_LOGGED recovery models, the log will not clear until a log backup has been performed. If the log cannot clear, the log will grow until it runs out of space.
               There is a special case in logging that improves performance by allowing some operations to run as minimally-logged operations, where only the page allocations are logged, not the actual insertion of data. This can improve the performance of operations such as bulk loads and index rebuilds. The first log backup following a minimally-logged operation will also contain some data.

               Crash recovery is only possible if the transaction log is intact. As well as the space occupied by the log records themselves, a transaction will also reserve space in the transaction log for any potential log records required if the transaction were to be canceled and required to roll back. This accounts for the behavior you may observe where, say, a transaction that updates 50MB of data in the database may actually require 100MB of transaction log space.

               When a new database is created, the transaction log is essentially empty. As transactions occur, log records are written sequentially to the transaction log, which means there is no performance gain from creating multiple transaction log files. The transaction log will use each log file in turn. Log records for concurrent transactions can be interspersed in the transaction log. Remember that log records for a single transaction are linked by their LSNs, so there is no need for all log records for a transaction to be grouped together in the log. LSNs can almost be thought of as a timestamp.

              

Migration Checklist summary

Pre-Migration Checklist
1.       Analyze the disk space of the target server for the new database, if the disk space is not enough add more space on the target server
2.       Confirm the data and log file location for the target server
3.       Collect the information about the Database properties (Auto Stats, DB Owner, Recovery Model, Compatibility level, Trustworthy option etc)
4.       Collect the information of dependent applications, make sure application services will be stopped during the database migration
5.       Collect the information of database logins, users and their permissions. (Optional)
6.       Check the database for the Orphan users if any
7.       Check the SQL Server for any dependent objects (SQL Agent Jobs and Linked Servers)
8.       Check, if the database is part of any maintenance plan
9.       Database Migration Checklist

These are the steps you would go through to make the change.
1.       Stop the application services
2.       Change the database to read-only mode (Optional)
3.       Take the latest backup of all the databases involved in migration
4.       Restore the databases on the target server on the appropriate drives
5.       Cross check the database properties as per the database property script output, change the database properties as per the pre migration- checklist
6.       Execute the output of Login transfer script on the target server, to create logins on the target server you can get the code from this technet article: http://support.microsoft.com/kb/246133.
2.       Check for Orphan Users and Fix Orphan Users
3.       Run the DBCC UPDATEUSAGE command against the migrated database when upgrading to a newer version of SQL Server.
4.       Rebuild Indexes (Optional) As per the requirement and time window you can execute this option.
5.       Update index statistics
6.       Recompile procedures

7.       Start the application services, check the application functionality and check the Windows event logs.

DB Migration (Server to server)

(Original post here:
http://yrushka.com/index.php/sql-server/database-recovery/sql-server-migration-from-one-server-to-another-detailed-checklist/
)

I. Steps to do on the current Production server – [Server A]
  1. Check SQL Server properties – These must stay the same on the future server. 
use master
GO
select SERVERPROPERTY ('Collation')
select SERVERPROPERTY ('Edition')
select SERVERPROPERTY ('InstanceName')
select SERVERPROPERTY ('ProductVersion')
select SERVERPROPERTY ('ProductLevel')
select SERVERPROPERTY ('ServerName')
  1. Check global configuration settings – These must stay the same on the future server. 
use master
go
sp_configure 'show advanced options' , 1
reconfigure
go
sp_configure
go
  1. Check Databases Integrity
It is necessary in order to be sure that restored backups on migrated server will be valid.
                        DBCC CHECKDB ('DATABASE_NAME')
  1. Note down model database growing options. This setting is important for every new created database. My strong advice is that you never leave the default setting because it can be a big cause for poor performance. The default setting is 1 MB per data file and 10% per log file – imagine what it means for a database to allocate MB after MB if it needs to allocate say 50 MB or grow a log file that is 10 GB in size.
  2. Note down physical path for master, model, msdb, tempdb databases data and log files.  
use master
go
select name, filename from sys.master_files where database_id in (1,2,3,4) order by database_id
  1. Note down physical path for mssqlsystemresource data and log files.
SQL 2008: It should be in :\Program Files\Microsoft SQL Server\MSSQL10_50.\MSSQL\Binn\
For SQL 2005: same as master database location.
use master
go
select name, filename,dbid from sys.master_files where database_id in (32767)
  1. Perform Full backup on master and msdb

II. Steps to do on the future Production server – [Server B] 
  1. Install same edition/version of SQL Server on the new machine.
Keep in mind the following when installing: 
o    Make sure the new server has mounted the same drive letters where system databases reside on old SQL Server. Otherwise, after the restore of master SQL will not be able to start because SQL will want to start the rest of system databases on physical paths from [Server A].
o    If you want to restore system databases then you need to keep the same edition (Standard, Entreprise, Business Intelligence) and version (ex: 2008 RTM, 2008 SP1, 2008 R2 RTM etc.). You cannot restore a backup of a system database (master, model, msdb) on a server build that is different from the build on which the backup was originally performed. An workaround would be, to replace manually the system databases files copied from [Server A] to [Server B]. This can be done only if the migrated server can be stopped. If not, then you must follow the procedure with “restore database….”
o    If you want to restore only user databases, then you can install the same or upper edition/version. If you install a lower edition/version than the current Production one, you might not be able to restore some of the databases having higher builds features which won’t be supported on lower builds.
o    Keep the same collation when installing SQL Server.
o    Create a Domain/Windows user account that will be used to start SQL Services.
  1. Backup system databases – master, model, msdb – and put in a safe location… in case you want to recover them.
  2. Stop SQL Server. Copy all system databases files and add them to the safe location.
  3. Create the physical locations for master, model, msdb, tempdb, mssqlsystemresource databases noted down at steps I.5 & I.6
  4. Give to the user running SQL Full security rights on the new folders created.
  5. Copy system databases backups made on [Server A] to [Server B] (step I.7).
  6. Stop SQL Server.
  7. Copy existing model, msdb and mssqlsystemresource database files from the installed location on [SERVER B] to the new created ones (step II.4). Afterwards you will be able to change the location for these databases. For now SQL will need old locations in order to load msdb, model, tempdb and mssqlsystemresource.
  8. Start SQL Server in single user mode. From an elevated cmd (started with administrator rights), access the right folder where sqlservr.exe executable is located and execute below commands. Normally you will find it at “C:\Program Files\Microsoft SQL Server\MSSQL[InstanceName]\MSSQL\Binn”
There where cases when other services like SQL Agent or SQL Reporting took the single one connection and prohibited the administrator to restore master. This error is given:
Reason: Server is in single user mode. Only one administrator can connect at this time.
Before starting the SQL in single user mode, make sure you have stopped every SQL service and there is no one that will make a default connection to it (application, reporting). 
cd [drive]:\folder_name
sqlservr.exe -c m

Restore master database from the backup copied from [SERVER A] using a NEW cmd line started with administrator permissions or using a DAC (Dedicated Administrator Connection) connection.

-- connect to SQL
sqlcmd -SMACHINENAME\SERVERINSTANCE -E
-- make sure you are connected to right server:
SELECT @@servername
GO
RESTORE DATABASE master FROM DISK = '[Drive]:\Backup_path\MASTER_.bak' WITH REPLACE;
GO

  1. Exit the cmd prompt and Start SQL from Services (in multi-user)
  2. Now the system databases (except master) will be loaded from the new created paths. Check new files location for system databases. Master will point to the right location. msdb, tempdb and model must be changed. If you will restore msdb you can specify at that time, the new location for the files. So here I will show how to move model and tempdb. For msdb is the same.
select * from sys.master_files
use master
go

Alter database tempdb modify file
(name=tempdev, filename='[drive]:\new_location\tempdb.mdf')
Alter database tempdb modify file
(name=tempdev2, filename='[drive]:\new_location\tempdev2.ndf')
Alter database tempdb modify file
(name=templog, filename='[drive]:\new_location\templog.ldf')

Alter database model modify file
(name=modeldev, filename='[drive]:\new_location\model.mdf')
Alter database model modify file
(name=modellog, filename='[drive]:\new_location\modellog.ldf')
12.    Stop SQL Server and move the model files from old location to new location. tempdb will be re-created on the new specified location at every SQL restart so no need to move the files.
13.    Start SQL and make sure that the system database point to the right locations.
select * from sys.master_files
14.     [OPTIONAL] Restore msdb database. I personally encountered problems running Database mail after msdb restore on SQL 2008 R2. I ended up leaving the msdb orginal files and migrated all jobs by scripting, created msdb.dbo.cdc_jobs table (because cdc was in place as well) and re-configured database mail.
15.    Remove folders created at step II.4.
16.    Test some SQL users for connection.
17.    Linked Servers will not work because their credentials where encrypted with Service master key from originated server. In order to fix this you need to backup service master key from [Server A], give permission on the file to your user and copy to [Server B] to restore it.
18.    Change the server name in system tables. @@servername variable will point to old server. It must be changed.
-- check servername with ID = 0. The queries will return old server name [SERVER A]. [SERVER B] will not be found. 
SELECT @@servername
EXEC sp_helpserver 'SERVER B'
EXEC sp_helpserver 'SERVER A'
SELECT srvname FROM sysservers where srvid = 0

-- DELETE old references to old servername.
EXEC sp_droplinkedsrvlogin 'SERVER A', null
EXEC sp_dropserver  'SERVER A'

-- ADD new server name: [SERVER B]
EXEC sp_addserver [SERVER B] , 'local' ,'duplicate_OK'
EXEC sp_addlinkedsrvlogin 'SERVER B', 'true'

-- RESTART SQL. [SERVER B] will replace the old server name.
SELECT @@servername
EXEC sp_helpserver 'SERVER B'
SELECT srvname FROM sysservers where srvid = 0

On versions below SQL 2008, you have to update also the originating server name from sysjobs as well. On higher versions, the reference in sysjobs is made by server_id which is always 0.
SELECT * FROM msdb.dbo.sysjobs
UPDATE msdb.dbo.sysjobs SET originating_server = @@SERVERNAME WHERE originating_server <> @@SERVERNAME

III. Remaining Steps to do on the current Production server – [Server A]
  1. Perform Full backups for all user databases. It is a good idea to use backup compression and backup verification (RESTORE VERIFYONLY)
  2. Perform Differential backups.
  3. Copy all backups on [SERVER B].
  4. Restore Full backups and differential backups on [SERVER B] with ‘NO RECOVERY’ option.
  5. Perform Transactional backups.
  6. Copy all tran backups on [SERVER B] and restore them using with ‘Recovery’ option.
  7. And a last step, after you restore all user databases, keep in mind to change databases owners to ‘sa’ user, otherwise you will receive errors of executing sys schema objects like these ones:
The EXECUTE permission was denied on the object ‘’, database ‘mssqlsystemresource’, schema ‘sys’. (.Net SqlClient Data Provider)
USE Database_Name
EXEC sp_changedbowner 'sa'

GO

Few things about backup

If you are using sql 2008 then you can split backup into different file.
BACKUP DATABASE [Northwind] TO
DISK = 'D:\Northwind_file1.bak',
DISK = 'D:\Northwind_file2.bak',
DISK = 'D:\Northwind_file3.bak',
DISK = 'E:\Northwind_file4.bak'
WITH INIT , NOUNLOAD , NAME = 'Northwind backup', NOSKIP , STATS = 10, NOFORMAT

TRUNCATE_ONLY
               If you get a "stuck" transaction, and your log grows continually and you're unable to shrink, TRUNCATE_ONLY comes in very handy. But, if you get a “stuck” transaction and the log grows, truncate only isn't going to truncate the active portion of log space held open by the stuck transaction.  You would have to commit/rollback/kill the transaction to allow that space to truncate out.  At that point, a log backup would clear the space allowing it to be reused. 
               Forcing someone to switch to SIMPLE recovery to truncate the log contents pretty much guarantees that they know they are breaking their ability to perform a point in time recovery, since that is not possible with SIMPLE recovery. BACKUP LOG WITH TRUNCATE_ONLY is a dangerous command: it empties out the contents of your SQL Server’s transaction log without really backing it up.  Database administrators sometimes run this command right before shrinking their log file with a DBCC SHRINKFILE command, thereby freeing up drive space.
               Microsoft recommends that instead of truncating logs, you switch to simple recovery mode instead.  That way you don’t generate logs you won’t be using, and you won’t incur performance impacts from repeatedly filling and truncating the logs.  You also remove the need to regularly back up the transaction log.

Internal process while taking backup
1.       Checkpoint – to make sure that pages on disk are up to date with current changes
2.       Mark the log – backup began here
3.       Read from data files – writes pages directly from disk – even though they might be changing WHILE the backup is occurring therefore backup is considered an inconsistent set of pages (NOT a problem – the log will know the transactional consistency)
4.       Mark the log – backup ended here
5.       Copy the transaction log – backup includes transaction log to make consistent. No point making the data

          Misconception about full backups is that they only contain data. Both full backups and differential backups also contain some transaction log records so that the restored component (database, file, or filegroup) can be made transactionally consistent. Consider an example transaction that inserts a record into a table with a single nonclustered index. The transaction is split into two parts internally: a record insertion into a data page in the table and then the insertion of the required record in an index page in the nonclustered index. If the backup process just happens to read the nonclustered index page before the record insertion, but reads the table data page after the record insertion, then the database represented by just the data in the backup is transactionally inconsistent.
              
            The data contained in a full backup is not necessarily all of the contents of all data files. The backup will only contain the allocated pages from the data files. For example, a single-file database may be 100GB but only contain 15GB of allocated data. In that case, the full backup will only contain the 15GB of allocated data plus the necessary transaction log.