Contents

Analyzing Tables for Compression in SQL Server

Contents

This is mostly because I can’t find the link to this Microsoft Technet Article easily enough. Like most things SQL server there is a lot nuance to compressing tables and indexes in your databases and I’m going to just touch on the first pass look I’m performing.

The first step is to look at the usage patterns on the table. The technet article splits this into two queries that I have combined into one single query below. What I am looking for first is how much the table/index is updated: more updates will mean more CPU overhead as pages/rows are decompressed before being updated. If the server doesn’t have many spare CPU cycles to spare I could be helping to create a bottleneck at the CPU. Secondly: how many scans happen on the table/index? More scans means it’s more likely that we’ll get a performance benefit when we go to read the the object from disk as we will do less physical reads.

SELECT o.name AS [Table_Name]
  , x.name AS [Index_Name]
  , i.partition_number AS [Partition]
  , i.index_id AS [Index_ID]
  , x.type_desc AS [Index_Type]
  , i.leaf_update_count * 100.0/ ( i.range_scan_count + i.leaf_insert_count + i.leaf_delete_count + i.leaf_update_count + i.leaf_page_merge_count + i.singleton_lookup_count) AS [Percent_Update]
  , i.range_scan_count * 100.0/ ( i.range_scan_count + i.leaf_insert_count + i.leaf_delete_count + i.leaf_update_count + i.leaf_page_merge_count + i.singleton_lookup_count) AS [Percent_Scan]
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) i
  JOIN sys.objects o ON o.object_id = i.object_id
  JOIN sys.indexes x ON x.object_id = i.object_id
    AND x.index_id = i.index_id
WHERE ( i.range_scan_count + i.leaf_insert_count + i.leaf_delete_count + leaf_update_count + i.leaf_page_merge_count + i.singleton_lookup_count) != 0
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND o.name = '';

This is important information but if you are looking at enabling compression there is probably a more immediate consideration as well: you need to conserve space on your data drives. The stored procedure below will give you an idea of the space savings benefit for either ROW or PAGE level compression. The data types in the index or table will have a large impact on how much SQL is able to squeeze data down. This ultimately (in tandem with the usage patterns determined above) drive your decision to compress your data. If you have spare CPU cycles to burn and you can reclaim a large amount of space in your database (or even disk drive if needed), then compression is going to an important tool in your toolbox. We recently reclaimed nearly 200 GB in our data warehouse by compressing our largest table (260 GB down to 56 GB). The table was rarely updated but scanned often so in addition to gaining back significant database space we’ve also sped up the ETL processes that access that table as a bonus!

EXEC sp_estimate_data_compression_savings ''
  , ''
  , NULL
  , NULL
  , '';

The technet article is great and goes into greater depth on what’s happening under the hood when you compress tables as well as providing more detailed examples of compression decision scenarios and really is recommended reading if you are currently looking to implement compression.