SQLU VLDB Week – Integrity Checks
Welcome back to day four of VLDB week at SQL University. Today, we will talk about running integrity checks, a.k.a. DBCC CHECKDB. In actuality, there are several DBCC CHECK commands for checking database integrity, not just CHECKDB. CHECKDB is the main one, the granddaddy of all DBCC CHECK commands. It is the one that performs the full gamut of integrity checks on a database.
Almost everything I know about CHECKDB I learned from Paul Randal (blog|@PaulRandal), either in one of the many SQLSkills.com classes I have attended or by reading material he has written on his blog. This is especially true for CHECKDB on a VLDB. I would be remiss if I didn’t call out that just about everything I share on this page, I learned from Paul.
With integrity checks on VLDBs, we run into the same kinds of challenges we see with just about everything else. They take too much time. They use too many resources. I have also heard from many people that they are concerned about running out of disk space because of the database snapshot that CHECKDB creates. Well, fear not! As with everything else we have discussed this week, we have some tips for you.
Run CHECKDB on a Restored Backup
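This is my preferred method for handling integrity checks on a VLDB, if you have the infrastructure to support it. Database mirroring and log shipping do not transfer corruption if it exists, because they ship log records rather than data pages, so a clean CHECKDB on a mirror or log shipping secondary tells you nothing about the primary. A database backup will transfer corruption. If we restore a full backup that was taken of a corrupt database, the restored database will be corrupt as well, which is exactly why checking the restored copy tells us about the source.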
I used to perform operations for a VLDB that had two replica servers used by downstream consumers of the data (ETL) and ad hoc reporting users who had data freshness SLAs that could not be met by our data warehouse. The replica servers were load balanced, and users connected to a virtual name. We could move them to one server or the other as needed or let the load be balanced across both. The plan I wrote was to remove one of the replica servers from load balancing, drop replication, restore the full backup to it, run CHECKDB on it, rebuild replication, and then add the server back into load balancing. With a little effort, it’s a process that can be automated to run over a weekend when usage of the replicas is lower.
The downside is that copying the backup file, restoring it, and running CHECKDB takes a long time, which can delay the discovery of corruption.
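As a rough sketch of the restore-and-check step, the T-SQL below restores a full backup under a different name and then runs a full CHECKDB against the copy. The database name, backup path, logical file names, and target paths are all made up for illustration; substitute your own.

```sql
-- Hypothetical names throughout: SalesVLDB_Check, the backup share,
-- the logical file names, and the target drive paths are assumptions.
RESTORE DATABASE [SalesVLDB_Check]
FROM DISK = N'\\backupshare\SalesVLDB\SalesVLDB_full.bak'
WITH
    MOVE N'SalesVLDB_Data' TO N'E:\Check\SalesVLDB_Check.mdf',
    MOVE N'SalesVLDB_Log'  TO N'F:\Check\SalesVLDB_Check.ldf',
    REPLACE,      -- overwrite the copy left behind by the previous run
    STATS = 10;   -- report restore progress every 10 percent

-- Full integrity check against the restored copy, not production.
-- NO_INFOMSGS suppresses informational output; ALL_ERRORMSGS reports every error found.
DBCC CHECKDB (N'SalesVLDB_Check') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

Because the check runs against the restored copy, any errors it reports point back to the database the backup was taken from.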
Run with the PHYSICAL_ONLY option
Running CHECKDB WITH PHYSICAL_ONLY checks the physical structure of the pages and record headers, and the consistency of the allocations. This check is limited, but among the things it detects are torn pages, checksum failures, and some hardware failures that can threaten the integrity of your database. This mode skips many of the checks that the full CHECKDB runs, so it can run a lot faster, but its limited scope leaves plenty of room for corruption to go undetected. It’s still better than nothing.
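For reference, a minimal example of the physical-only check looks like this. The database name is a placeholder.

```sql
-- SalesVLDB is a hypothetical database name.
-- PHYSICAL_ONLY limits the run to page/record header and allocation checks.
DBCC CHECKDB (N'SalesVLDB') WITH PHYSICAL_ONLY, NO_INFOMSGS, ALL_ERRORMSGS;
```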
Break it up into Smaller Parts
This is where I recommend referring to Paul Randal’s post CHECKDB From Every Angle: Consistency Checking Options for a VLDB. Paul lays everything out and explains how the process can be broken apart.
There are two approaches you can take here. One approach is to break CHECKDB apart according to filegroups and run DBCC CHECKFILEGROUP on different filegroups each night. Another approach is to break it apart into specific buckets of tables using DBCC CHECKTABLE along with running DBCC CHECKALLOC and DBCC CHECKCATALOG. If you are considering one of these approaches, you should read Paul’s post referenced above to ensure you are covering everything that needs to be covered.
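To give a feel for what a single night’s work might look like under each approach, here is a bare-bones sketch. The database, filegroup, and table names are invented for illustration, and how you rotate filegroups or assign tables to buckets is up to you. This is not Paul’s script, just the individual commands the two approaches are built from.

```sql
-- Hypothetical names: SalesVLDB, FG_Archive, dbo.Orders, dbo.OrderLines.
USE [SalesVLDB];

-- Approach 1: check one filegroup tonight, a different one tomorrow night.
DBCC CHECKFILEGROUP (N'FG_Archive') WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- Approach 2: allocation and catalog checks, plus tonight's bucket of tables.
DBCC CHECKALLOC WITH NO_INFOMSGS, ALL_ERRORMSGS;   -- current database
DBCC CHECKCATALOG WITH NO_INFOMSGS;                -- current database
DBCC CHECKTABLE (N'dbo.Orders') WITH NO_INFOMSGS, ALL_ERRORMSGS;
DBCC CHECKTABLE (N'dbo.OrderLines') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

In practice you would run one approach or the other, not both, and drive the filegroup or table list from a schedule rather than hard-coding it.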