Shortly after writing the whitepaper, I was asked to consult on a problem they were having. There wasn’t a lot of specifics other than the processes hitting the server were running slower than normal. I took a look at the configuration of the server and the activity on the server and made several recommendations. I’d like to talk about one of those recommendations in this post: using index hints.
I have seen people get burned by using index hints many times. The obvious problem is when an index that is being used in a hint gets dropped or renamed, the query suddenly starts failing. If you tell a query to use a specific index, and that index does not exist, the query doesn’t have a default behavior to which it can fail back. It fails outright. The query cannot be compiled if the index in the hint is missing and it does not make it past compilation. Index hints are not just recommendations like the name “hint” implies. They are enforced and will use that index or the query won’t run at all. This wasn’t the issue I was witnessing, however.
Another problem is when people think they have outsmarted the query optimizer and know which index it should use. If you know better, why not tell it which index to use and save it the time of trying to figure it out? The problem is that index usage is very situational. Is that truly the right index to use? Like most things in SQL Server, it depends. It may be the right index to use for that specific parameter. Or it may be the right index to use for the current volume or cardinality of data. Will it still be the right index 3 months from now? Will it be the right index when the table has 4 times as much data? Will it still be the right index if the data skews heavily so that some parameters have a low number of records and some values have a large number of records?
Quite often, the answer is “no”. I have met DBAs and developers who love to use query hints, including index hints. Hinting is generally a dangerous thing to do because you tie the hands of the query engine into always performing the query in a specific way whether that way is fast or not. Let’s take a look at a real world example I dealt with just last week.
When I investigate the “slow performance” of a production SQL Server, with no idea of what specifically was running slow, one thing I look for are queries consuming a lot of resources. They had captured a very short SQL trace of activity on the server shortly after the slowness issue was raised, and I combed through that to find the biggest queries. One of the top queries for CPU time and for IO was using an index hint. I did a little testing of the query to see if the index hint was actually helping or not.
The query in question was actually a very simple query:
-- Get Min and Max BatchItemID based on Min and Max BatchID's SELECT @MinBatchItemID = MIN(ID), @MaxBatchItemID = MAX(ID) FROM dbo.BatchItem WITH (index="ix_Batch_BatchItem") WHERE Batch BETWEEN @MinBatchID AND @MaxBatchID;
For testing purposes, I remove the variables and compare the query with and without the index hint:
SELECT MIN(ID) , MAX(ID) FROM dbo.BatchItem WITH (index="ix_Batch_BatchItem") WHERE Batch between 0 and 200000000; SELECT MIN(ID) , MAX(ID) FROM dbo.BatchItem WHERE Batch between 0 and 200000000;
The query with the index hint takes 96 seconds and has the following I/O stats:
Without the index hint, it takes 246 ms (< ¼ second) and has the following I/O stats:
Note that with the index hint, scan count is 21 instead of 2, Logical reads is 15,336,546 instead of 13,172, and total pages read from disk are 13,023,603 (physical reads + read-ahead reads) instead of 3. That’s 390 times faster and 1164 times fewer reads without the index hint. If I ran both version of the queries repeatedly, physical reads dropped to 0 for the query without the hint, but the query with the index hint always had to read some large amount of pages from disk indicating that it wasn’t maintaining that large amount of data in cache for very long.
The recommendation I made was simple: stop using index hints. If you have a query that is not using the optimal plan, don’t just take a short cut and use a hint. Figure out why it’s not getting the optimal plan and fix that. Index hints don’t address the core problem. They address a symptom, and they do it in a way that is permanent whether it is good or not. If you have a query that is performing bad or using a lot of resource and it has an index hint, or any query hint for that matter, test it without the hint, and you may get a quick performance win.