SQLU VLDB Week – Intro to VLDBs, part 2
Welcome to another exciting week of SQL University postings. This week, we will be talking about working with VLDBs, or Very Large DataBases. Opinions vary greatly on what constitutes a VLDB and how to work with one. Last month, I posted some polls to get feedback from the community on VLDBs. Let’s take a look at those results first.
What is the smallest database that qualifies as a VLDB?
The minimum size with the most votes was 5 TB, garnering more than 1/3 of the votes. A few people think that 100 GB qualifies as a VLDB.
5 TB (35%, 7 Votes)
1 TB (25%, 5 Votes)
500 GB (20%, 4 Votes)
100 GB (15%, 3 Votes)
750 GB (5%, 1 Votes)
250 GB (0%, 0 Votes)
50 GB (0%, 0 Votes)
Total Voters: 20
What is the largest database you have worked with?
I was a little surprised by the votes in this poll. I expected this one to be fairly close to the first poll. In the first poll, 12 people thought that a database had to be 1 TB or larger to be a VLDB, but only 6 people in this poll said they have worked with databases of that size.
1 TB to 5 TB (25%, 3 Votes)
Less than 100 GB (17%, 2 Votes)
100 GB to 500 GB (17%, 2 Votes)
500 GB to 1 TB (17%, 2 Votes)
5 TB to 10 TB (8%, 1 Votes)
20 TB to 50 TB (8%, 1 Votes)
More than 50 TB (8%, 1 Votes)
10 TB to 20 TB (0%, 0 Votes)
Total Voters: 12
What are your biggest challenges with working with VLDBs?
For this poll, I asked you to choose your top 3 challenges in working with VLDBs. Based on the types of questions and issues I hear about or deal with, these results came out pretty much how I expected. We are going to use these results as our itinerary for the rest of this week. We will take a look at the top 4 challenges from the poll to determine why they are challenging and offer some strategies for dealing with each of them.
Index maintenance (79%, 11 Votes) — Tuesday’s Lesson
Backups (57%, 8 Votes) — Wednesday’s Lesson
Integrity checks (50%, 7 Votes) — Thursday’s Lesson
Archiving/purging data (36%, 5 Votes) — Friday’s Lesson
Disk/disk space management (29%, 4 Votes)
Partitioning management (21%, 3 Votes)
Inefficient queries (14%, 2 Votes)
Disk performance (7%, 1 Votes)
Total Voters: 14
If you want to get a head start, you can take a look at the session I did at SQL Saturday #68 in Olympia, WA on Saturday, April 9, 2011. The files from the session can be downloaded below. I have also recorded the demo performed during the session. Because of some technical difficulties, the recording was made after the session rather than during the live demo.
Session Files: StrategiesForWorkingWithVLDBs.zip (637 KB)