I was inspired to write this introductory series when I alpha-tested a new free tool from Idera Software being launched today. Idera has created a tool for running and managing database integrity checks. This tool is ideal for people who want to simplify running and managing integrity checks of their databases. This free tool from Idera is SQL Integrity Check. Before I continue on, I want to thank Idera for inspiring and for sponsoring this blog series, Introduction to Integrity.
The word “integrity” has multiple meanings in the SQL Server world. One meaning deals with the quality of data stored within SQL Server. One form of integrity is maintained through proper data typing, constraints (default, check, unique, and key), and data validation processes. The other definition of integrity deals with the logical and physical consistency of the underlying structures of a database. Logical and physical consistency is the much more complex of the two, and this is the subject we are going to tackle in this series.
Reading Integrity Check Output
SQL Server provides a DBCC command for checking the integrity of a database. There are a variety of DBCC commands available, but the only command we are concerned with for this topic is DBCC CHECKDB. DBCC CHECKDB will check the internal structures, metadata, and data in a database.
In its simplest form, it can be executed by calling DBCC CHECKDB (<Database Name>);. For example, if I wanted to check the integrity of tempdb, I would execute the following:
DBCC CHECKDB (tempdb);
The output of DBCC CHECKDB can be confusing even for a clean result with no corruption. DBCC CHECKDB outputs a lot of information, most of it informational unless corruption is found. Fortunately, the command has optional arguments that allow us to filter out the informational messages and focus on the error messages, if any are returned. There are three options that you should commit to memory: NO_INFOMSGS, ALL_ERRORMSGS, and TABLERESULTS.
- NO_INFOMSGS: filters out all informational messages in the output.
- ALL_ERRORMSGS: returns all error messages. Prior to SQL Server 2008 SP1, CHECKDB only returns the first 200 error messages by default. Beginning with SQL Server 2008 SP1, running the command outside of SQL Server Management Studio (SSMS) returns all error messages by default and this argument has no effect. In SSMS with SQL 2008 SP1 and newer, SSMS limits the output to 1000 error messages unless the argument is used.
- TABLERESULTS: returns the information in a neat, easy-to-read grid format rather than free-form text.
The image below is sample output in table format taken from Idera’s free tool SQL Integrity Check.
The sample database shown above can be downloaded for your own practice here: RecoveringFromCorruption.zip (13.9 MB). The command I would execute to see the limited output for this sample database is:
DBCC CHECKDB (PFSCorruption) WITH NO_INFOMSGS, ALL_ERRORMSGS, TABLERESULTS;
To see the full supported list of arguments see the SQL Server Books Online entry for DBCC CHECKDB: http://technet.microsoft.com/en-us/library/ms176064.aspx.
If you look at the Books Online page listed above, you will see that one of the arguments I listed was not listed. The TABLERESULTS option is not documented for this command (it is documented for a few DBCC commands) and is therefore not officially supported. The argument works for almost all DBCC commands though, and it is safe to use.
When you encounter corruption, the output can be somewhat daunting (and a little scary) at first. The key is to know where to focus your attention. If you look at the Level column, you will see that it returns different values for different messages. The Level correlates directly to the severity of the error message. The higher the Level, the more severe and more important the message is. Your first tip is to focus on the messages with the highest Level of 16.
If you are using SQL Integrity Check that I mentioned above, you can simply click on the Level column header to sort the output by Level. Click it again to sort in the opposite direction. If running the command manually, you will need to sort through the messages manually to find the most severe messages.
The MessageText column (or Message column if using SQL Integrity Check) for the Level 16 error messages will contain just about everything you need to know to figure out what kind of corruption you have, how wide-spread the corruption is, and how best to deal with it with minimal or no data loss. The message will tell which pages in which files have corruption in the format of (<file number>:<page number>). In the image above, you see three different pages listed in the Level 16 messages: (1:1), (1:6), and (1:7). The messages are referring to page numbers 1, 6, and 7 in file number 1.
You can query sys.database_files or sys.master_files to see which file holds the corrupted page. These particular messages indicate a problem with only one page though. If you read the messages carefully, it says that page (1:1) has incorrect values for pages (1:6) and (1:7). Page (1:1) is the only corrupt page detected in this example. This brings me to the second tip.
Once you know which messages to focus on, identify what is corrupted. Determine the object that is corrupted and that will tell you what kind of corruption you have. The messages can tell you a wide range of ID values depending on what is corrupted. Focus on the IDs that it gives you and narrow it down to exactly what is corrupted.
For the above example, the only key IDs the messages gave us were the page IDs. The first 8 pages of a data file are system pages commonly referred to allocation pages. The pages actually contain more than just allocation information, but for consistency with standard terminology, we will use the term allocation pages.
The first 8 pages of a file are:
- Page 0: File header page
- Page 1: First Page Free Space (PFS) page
- Page 2: First Global Allocation Map (GAM) page
- Page 3: First Shared Global Allocation Map (SGAM) page
- Page 4: No longer used
- Page 5: No longer used
- Page 6: First Differential Changed Map (DCM) page
- Page 7: First Bulk Changed Map (BCM) page
Pages 1, 2, 3, 6, and 7 also repeat throughout the file. We’ll get into that more when we talk about how to fix allocation page corruption.
If it reports the Object ID, this will tell you which table contains the corruption. Along with the Object ID, it should report an index ID as well. If you are dealing with a corrupt index or heap, the index ID will tell you what type of index contains the corruption. The type of index dictates how you will handle the corruption. We will discuss how to handle corruption in clustered indexes or heaps (index ID 1 or 0) and nonclustered indexes (greater than 1) later on.
The last two common integrity issues we will talk about are found in the wording of the error messages and not by the ID values being reported. We will look at issues when the value stored in a table is out of range for its data type, also known as data purity errors. Lastly, we will look at chain linkage problems where pointers from one page are pointing to an incorrect page. For this particular error, the error message clearly calls out that there are possible chain linkage problems.
Before we start talking about how to deal with the common integrity issues, I have one final tip for reading the CHECKDB output, and it is a very important one. Even if you forget the other two tips, please commit this one to memory.
The final tip is to ignore the RepairLevel column (Repair Level in SQL Integrity Check). Do not consider using the repair options for CHECKDB until absolutely all other avenues have been exhausted. The repair options should be your absolute last resort only. If you’re not an expert at dealing with corruption, you should enlist the help of an expert before running any repair on the database. The repair options delete data pages in order to get rid of corruption, and once deleted, they cannot be recovered except by manually re-entering the data.
To summarize the tips:
- Limit and simplify the DBCC CHECKDB output by using NO_INFOMSGS, ALL_ERRORMSGS, and TABLERESULTS
- Focus in on the error messages with a Level 16 severity
- Read the error messages carefully
- Check the IDs reported in the error messages to determine exactly what objects and indexes are affected by the error
- Ignore the repair options and do not consider using repair until it is your only remaining option
One final thought on integrity checks: if you not already doing them, start now. Right now! Don’t put it off.