If you missed any of the earlier posts in my DR series, you can check them out here:
- 31 Days of Disaster Recovery
- Does DBCC Automatically Use Existing Snapshot?
- Protection From Restoring a Backup of a Contained Database
- Determining Files to Restore Database
- Back That Thang Up
- Dealing With Corruption in a Nonclustered Index
- Dealing With Corruption in Allocation Pages
- Writing SLAs for Disaster Recovery
Disaster Recovery Resolutions
- Ensure that every database is being backed up.
- Prioritize backups. Investigate backup failures as a top priority.
- Verify that all of your databases are using the checksum page verification option.
Select name, page_verify_option_desc From sys.databases;
- Use the WITH CHECKSUM option for all database backups and restores.
- See: Use All the Checksums.
- Test your backups. Preferably, automate restoring them to a different server.
- Test your recovery plan. Anyone who might need to carry it out should practice it regularly against different scenarios.
- Create alerts and send notifications for the following errors:
- 823 — Error reading page at the OS level.
- 824 — Error reading page at the SQL Server level.
- 825 — Error reading page but was successful on retry.
- 829 — Page has been marked as restore pending.
- Blog post to follow explaining why.
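As a rough illustration of two of the resolutions above (checksum-verified backups and alerts on errors 823, 824, 825, and 829), here is a minimal T-SQL sketch. It is not a complete solution. The database name SalesDb, the backup path, and the operator name DBA Team are all hypothetical placeholders, and the alert portion assumes that SQL Server Agent is running and that the operator already exists with a working email address.

```sql
-- Part 1: back up with checksums, then verify the backup with checksums.
-- [SalesDb] and the D:\Backups path are hypothetical; substitute your own.
BACKUP DATABASE [SalesDb]
    TO DISK = N'D:\Backups\SalesDb.bak'
    WITH CHECKSUM, INIT;

RESTORE VERIFYONLY
    FROM DISK = N'D:\Backups\SalesDb.bak'
    WITH CHECKSUM;
GO

-- Part 2: create one SQL Server Agent alert per error number and
-- notify an operator by email. 'DBA Team' is a hypothetical operator
-- that must already exist (see msdb.dbo.sp_add_operator).
USE msdb;
GO

DECLARE @errors TABLE (message_id int PRIMARY KEY);
INSERT INTO @errors (message_id) VALUES (823), (824), (825), (829);

DECLARE @message_id int, @alert_name sysname;

DECLARE error_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT message_id FROM @errors;

OPEN error_cursor;
FETCH NEXT FROM error_cursor INTO @message_id;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @alert_name = N'Error ' + CAST(@message_id AS nvarchar(10));

    -- Skip error numbers that already have an alert with this name.
    IF NOT EXISTS (SELECT 1 FROM msdb.dbo.sysalerts WHERE name = @alert_name)
    BEGIN
        EXEC msdb.dbo.sp_add_alert
            @name = @alert_name,
            @message_id = @message_id,
            @severity = 0,                     -- must be 0 when @message_id is set
            @enabled = 1,
            @delay_between_responses = 300,    -- seconds between repeat notifications
            @include_event_description_in = 1; -- include the error text in the email

        EXEC msdb.dbo.sp_add_notification
            @alert_name = @alert_name,
            @operator_name = N'DBA Team',
            @notification_method = 1;          -- 1 = email
    END

    FETCH NEXT FROM error_cursor INTO @message_id;
END

CLOSE error_cursor;
DEALLOCATE error_cursor;
```

RESTORE VERIFYONLY is only a quick sanity check; it does not replace the full test restores recommended above. The five-minute delay between responses is an arbitrary choice to keep a burst of repeated errors from flooding the inbox, so tune it to your environment.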