Finding Unused Databases on Your SQL Instance

We’ve been re-arranging some deck chairs at work (deploying a new backup tool in our SQL environment). As we’ve worked our way through that we’ve had to update or re-engineer some of our processes for automatic restores to use the new tool. One of those restores, though, is for a database that no one could recall actually using. After asking around for a month we didn’t have any better idea of whether the database was actually used or not. Being a risk-averse (not to mention polite) group we wanted to be really sure that no one was using it. Thus the inspiration for this post was born: we needed a way to audit database access and confidently say one way or the other whether our mystery database was in use.

We needed something quick and lightweight, and after a quick spin through the SQLHelp channel on Slack I was pointed in the direction of extended events. Extended events: SQL combined with XML, my forever nemesis. Thanks to some help from Thomas Schutte (b), particularly his blog post on getting started with XML in SQL, I got there in the end. So here is some useful code if you need to audit database use on a server.

First we need a simple extended event session. I picked the sql_batch_completed event so I knew I would catch users as they changed database context on the server and executed code. I kept the data in the ring buffer since I’ll be checking it often to pull out the relevant data and store it in a table.

CREATE EVENT SESSION [UserContextAudit]
ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    ( ACTION ( sqlserver.database_id
             , sqlserver.database_name
             , sqlserver.session_id
             , sqlserver.session_nt_username
             , sqlserver.sql_text
             , sqlserver.username )
      WHERE sqlserver.session_id > 100 )
ADD TARGET package0.ring_buffer
WITH ( MAX_MEMORY = 1024 KB
     , STARTUP_STATE = ON );
GO

ALTER EVENT SESSION [UserContextAudit]
ON SERVER STATE = START;

Generating extended events is easy peasy, especially when using the wizard in SSMS. The terrible (for me) part is parsing the results from XE, which are stored as XML. I got about 65% of the way there but struggled to get the data points I needed out of the XML. Thomas’s post above pointed me in the direction of the second CROSS APPLY I needed to get the missing XML nodes into my temp table. Once you have a good query to extract the data from the XE session you just dump the results into a database. I’m running this script once a minute via a SQL Agent job to preserve the information.
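The collector script below writes to a small tracking table. Here’s a minimal sketch of that table (the column types are assumptions based on how the script uses them):

-- Assumed structure for the tracking table the collector writes to.
CREATE TABLE AuditDB.audit.usertracking
( username     NVARCHAR(128) NOT NULL
, databasename NVARCHAR(128) NOT NULL
, lastdate     DATETIME2     NOT NULL
, CONSTRAINT PK_usertracking PRIMARY KEY CLUSTERED ( username, databasename ) );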

SET QUOTED_IDENTIFIER ON;

DECLARE @XML AS XML;

SELECT @XML = ( SELECT TOP 1 CAST(xet.target_data AS XML) AS XEData
                FROM sys.dm_xe_session_targets AS xet
                JOIN sys.dm_xe_sessions AS xe ON ( xe.address = xet.event_session_address )
                WHERE xe.name = 'UserContextAudit'
                AND xet.target_name = 'ring_buffer' );

SELECT ce.event_node.value('@timestamp[1]', 'DATETIME2') AS EventDate
     , ca.action_node.value('@name[1]', 'VARCHAR(200)') AS ActionName
     , ca.action_node.value('(value/text())[1]', 'VARCHAR(MAX)') AS ActionValue
INTO #temp
FROM ( VALUES ( @XML )) xx ( XMLData )
CROSS APPLY xx.XMLData.nodes('//RingBufferTarget/event') ce(event_node)
CROSS APPLY ce.event_node.nodes('./action') ca(action_node);

SELECT DISTINCT a.EventDate
, a.ActionValue AS username
, b.ActionValue AS databasename
INTO #temp2
FROM #temp a
JOIN #temp b ON b.EventDate = a.EventDate
JOIN #temp c ON c.EventDate = b.EventDate
WHERE a.ActionName = 'username'
AND b.ActionName = 'database_name'
AND c.ActionName = 'sql_text';

UPDATE upd
SET lastdate = d.EventDate
FROM AuditDB.audit.usertracking upd
JOIN #temp2 d ON d.username = upd.username
AND upd.databasename = d.databasename
WHERE upd.lastdate < d.EventDate;

INSERT INTO AuditDB.audit.usertracking ( username
     , databasename
     , lastdate )
SELECT username
     , databasename
     , MIN(EventDate)
FROM #temp2 i
WHERE NOT EXISTS ( SELECT *
                   FROM AuditDB.audit.usertracking a
                   WHERE a.username = i.username
                   AND a.databasename = i.databasename )
GROUP BY username
     , databasename;

DROP TABLE #temp;
DROP TABLE #temp2;

Run that XE session and the SQL Agent collector job for a month or so and you should have a solid idea of which databases are actually being used on your server. There is a non-zero chance that a database might only be referenced in cross-database queries, but if you think that’s possible the sql_text from the XE session can be analyzed to look for three-part names that reference the database that way.
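If you want to check for that, here is a rough sketch (run inside the collector script before #temp is dropped; 'MysteryDB' is a placeholder for the database in question):

-- Flag any captured sql_text that references the suspect database by name.
SELECT t.EventDate
     , t.ActionValue AS sql_text
FROM #temp AS t
WHERE t.ActionName = 'sql_text'
AND ( t.ActionValue LIKE '%MysteryDB.%'          -- MysteryDB.schema.object
   OR t.ActionValue LIKE '%[[]MysteryDB].%' );   -- [MysteryDB].schema.object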

Maintaining Table Data with Partitions

Recently I was reviewing the growth of some of our third party vendor databases and discovered that one vendor had these handy, enormous audit tables that hadn’t been touched since they were created. Records were constantly being added and, as a double surprise (SURPRISE), there was no way via the application to ensure these tables were getting the cleanup they needed. After some back and forth (triple SURPRISE) it fell to us to manage this data. We met with the business owners, determined a reasonable retention policy and set out to create some (relatively) simple SQL Agent jobs that would clean up the data nightly and keep things tidy.

The single biggest problem we ran into was that, being audit tables, they were heavily used, and deleting records could cause the application to grind to a halt while SQL took out locks to remove records (especially in the early days as we tried to purge the really old records). Eventually we got most of the purging working through proper scheduling and batch deleting. But one table held out as a problem child. The solution that worked on all the other tables simply could not delete records faster than they were being created (without grinding the system to a standstill). Enter our last and only hope: partition switching! We had to schedule some downtime to rebuild the table, but once the table was rebuilt we were able to remove records from the table months at a time by switching them into a secondary table, which we could truncate as soon as the data was switched into it. And best of all, the blocking (if any) caused by the schema lock from the partition switch was a once-a-month deal! Sample code follows.

DO NOT JUST RUN IT ON YOUR PRODUCTION SERVERS: STRANGE DBAs GOOFING ON THE INTERNET DISTRIBUTING CODE IS NO BASIS FOR A SYSTEM OF DATA MANAGEMENT.

*ahem* Now that that’s out the way:

First you’ll need to create a partition function and partition scheme to rebuild your table across. Build your partition function first and then your partition scheme. With partition functions make sure to create empty partitions on either side of the series that will hold no data (if at all possible). The partition scheme below keeps the table all on the same file group (primary) but can be altered if needed. In our case we are changing a vendor database as little as possible!
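Something along these lines (the function and scheme names, the monthly boundary values and the DATETIME partitioning column are placeholders for illustration):

-- Monthly RANGE RIGHT boundaries; pick the outer boundaries so the first and last
-- partitions stay empty.
CREATE PARTITION FUNCTION pf_AuditByMonth ( DATETIME )
AS RANGE RIGHT FOR VALUES ( '20150101', '20150201', '20150301', '20150401'
                          , '20150501', '20150601', '20150701', '20150801' );
GO

-- Keep every partition on PRIMARY so we change the vendor database as little as possible.
CREATE PARTITION SCHEME ps_AuditByMonth
AS PARTITION pf_AuditByMonth
ALL TO ( [PRIMARY] );
GO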

Once your partition scheme and function are built, it’s time to schedule a maintenance window: the table will be offline while you rebuild it across the partition scheme. Additionally, any other indexes will need to be rebuilt to include the partitioning column and be built across the partition scheme as well, or partition switching will not work. Thankfully my super busy table only has a primary key. I’ll drop it and recreate it on the partition scheme:
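A sketch of that rebuild with a hypothetical table and key (the partitioning column has to become part of the clustered key):

-- Hypothetical audit table, key and column names.
ALTER TABLE dbo.AuditLog DROP CONSTRAINT PK_AuditLog;
GO

ALTER TABLE dbo.AuditLog
ADD CONSTRAINT PK_AuditLog
PRIMARY KEY CLUSTERED ( AuditLogID, AuditDate )
ON ps_AuditByMonth ( AuditDate );
GO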

Voila! Now we can bring the application back online and allow it to keep adding an ungodly number of audit records to the table. In the SQL Agent we add a job with three job steps. The first adds future partitions to the partition function and partition scheme so we always have some empty partitions on that end of the structure. The second finds old partitions, switches them out and then drops the data. The third step removes old, empty partitions from the partition scheme and function. But before any of that you’ll need to create a table with the same index structure as your partitioned table (sans partitioning). Every index on the partitioned table must be recreated on this secondary table, where we’ll actually truncate the data. I cheat and create the table by selecting the top 1 record from my partitioned table, truncating it and then rebuilding the needed indexes on top of that:
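The cheat, sketched out with the same hypothetical names:

-- Clone the structure, empty it, then rebuild the same indexes (without partitioning).
-- Everything lives on PRIMARY, which matches the source partitions as switching requires.
SELECT TOP ( 1 ) *
INTO dbo.AuditLog_Switch
FROM dbo.AuditLog;

TRUNCATE TABLE dbo.AuditLog_Switch;

ALTER TABLE dbo.AuditLog_Switch
ADD CONSTRAINT PK_AuditLog_Switch
PRIMARY KEY CLUSTERED ( AuditLogID, AuditDate );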

Step 1, dynamically adding new partitions:
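A sketch, assuming the monthly pf_AuditByMonth/ps_AuditByMonth objects from earlier:

-- Find the current highest boundary and split a new month off the top so there is
-- always at least one empty partition ahead of the incoming data.
DECLARE @NextBoundary DATETIME;

SELECT @NextBoundary = DATEADD(MONTH, 1, MAX(CAST(prv.value AS DATETIME)))
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv ON prv.function_id = pf.function_id
WHERE pf.name = 'pf_AuditByMonth';

ALTER PARTITION SCHEME ps_AuditByMonth NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pf_AuditByMonth() SPLIT RANGE ( @NextBoundary );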

Step 2, switching out old partitions:
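Another sketch, with the retention cutoff as a placeholder for whatever you agreed on with the business:

-- Switch every partition that is entirely older than the cutoff into the staging
-- table, then throw the rows away.
DECLARE @Cutoff DATETIME = DATEADD(MONTH, -6, GETDATE()); -- retention placeholder
DECLARE @CutoffPartition INT = $PARTITION.pf_AuditByMonth(@Cutoff);
DECLARE @PartitionNumber INT = 2; -- partition 1 is the empty one at the low end

WHILE @PartitionNumber < @CutoffPartition
BEGIN
    ALTER TABLE dbo.AuditLog
    SWITCH PARTITION @PartitionNumber TO dbo.AuditLog_Switch;

    TRUNCATE TABLE dbo.AuditLog_Switch; -- the switch target must be empty each time

    SET @PartitionNumber += 1;
END;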

And lastly, removing old crusty partitions:
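A final sketch, which only merges a boundary away once the low-end partitions are actually empty:

-- Remove the oldest boundary value from the function (and its mapping in the scheme)
-- once the partitions around it hold no rows.
DECLARE @OldestBoundary DATETIME;

SELECT @OldestBoundary = MIN(CAST(prv.value AS DATETIME))
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv ON prv.function_id = pf.function_id
WHERE pf.name = 'pf_AuditByMonth';

IF NOT EXISTS ( SELECT 1
                FROM sys.partitions AS p
                WHERE p.object_id = OBJECT_ID('dbo.AuditLog')
                AND p.index_id IN ( 0, 1 )
                AND p.partition_number <= 2
                AND p.rows > 0 )
BEGIN
    ALTER PARTITION FUNCTION pf_AuditByMonth() MERGE RANGE ( @OldestBoundary );
END;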

If you’ve got out-of-control tables that no one else is willing to tend to, this will do the trick. Remember, though, that altering your vendor database without their consent may void your support contract, among any number of other considerations. As with anything else you find on the internet: test it and see if it works for you before deploying to production. A huge hat tip to Kendra Little (b|t), whose introduction to partitioning made any of this possible for my own problem database. I hope that if nothing else I’ve added another avenue for someone else to find her great breakdown of partitioning.

Partial Backups With Dell Litespeed

We currently use Dell’s Litespeed for SQL Server for our SQL backup compression and encryption needs. It has an intuitive GUI, it manages log shipping and we are generally pretty happy with it. I’ve got a couple of problem databases, though, where I don’t want the vanilla out-of-the-box full backups. A couple of our data warehouse databases are now over a terabyte in size and continuing to grow. We’re implementing partitioning to help with index and statistics maintenance as well as backups. The downside is that there is no GUI for partial backups with Litespeed, so I’ve had to fumble through it with the help of their support. Here then are the working scripts for taking partial backups (and restores) of databases using the Litespeed stored procedures. Our database is in simple recovery mode and we’ll be backing up a read-only filegroup as well as all our active files.

First, back up your read-only filegroup. You only need to back this up once. Or at least until an end user has you set it to read-write and updates records. When that happens, be sure to set the filegroup back to read-only and then back it up again.
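A sketch of that first backup (the database name, filegroup name and path are placeholders; double-check the parameter list against your Litespeed version):

-- Back up only the read-only filegroup.
EXEC master.dbo.xp_backup_database
  @database = N'BigWarehouseDB'
, @filegroup = N'ARCHIVE'
, @filename = N'X:\Backups\BigWarehouseDB_ARCHIVE_RO.bkp'
, @returndetails = 1;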

The xp_backup_database procedure also has a @read_write_filegroups parameter we’ll use later. It’s important to note that if you include it here with a value of 0 (which seems an intuitive thing to do) you will essentially take a full backup rather than a backup of just the intended filegroup. Including @returndetails is optional but will instruct Litespeed to return a row of information about the backup as it was processed.

Once you have your read-only files backed up you can back up the read-write file groups daily (or hourly or whatever works for your SLA) with the following code. Notice we’ve removed the specific filegroup parameter and added the @read_write_filegroups parameter.
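Something like this (same placeholder names as above):

-- Full backup of just the read-write filegroups.
EXEC master.dbo.xp_backup_database
  @database = N'BigWarehouseDB'
, @filename = N'X:\Backups\BigWarehouseDB_RW_FULL.bkp'
, @read_write_filegroups = 1
, @returndetails = 1;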

Once you’ve taken a full backup you can do differential backups of the read-write file groups as well by including an additional @with:
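A sketch; I’m assuming @with passes the DIFFERENTIAL option straight through, so verify against the Litespeed documentation:

-- Differential backup of the read-write filegroups.
EXEC master.dbo.xp_backup_database
  @database = N'BigWarehouseDB'
, @filename = N'X:\Backups\BigWarehouseDB_RW_DIFF.bkp'
, @read_write_filegroups = 1
, @with = N'DIFFERENTIAL'
, @returndetails = 1;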

Restore scripts look very similar to the backup scripts. If you do not have differential backups of your read-write file groups change ‘NORECOVERY’ to ‘RECOVERY.’
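Restoring the full backup of the read-write filegroups might look like this (same placeholders; the xp_restore_database parameter names here are my assumption, so verify them):

-- Restore the read-write filegroups, leaving the database ready for differentials.
EXEC master.dbo.xp_restore_database
  @database = N'BigWarehouseDB'
, @filename = N'X:\Backups\BigWarehouseDB_RW_FULL.bkp'
, @read_write_filegroups = 1
, @with = N'NORECOVERY';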

Restoring a read-write differential backup (RECOVERY should be changed to NORECOVERY if there are multiple differentials to apply):
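Sketched with the same assumptions:

-- Apply the read-write differential and bring the read-write filegroups online.
EXEC master.dbo.xp_restore_database
  @database = N'BigWarehouseDB'
, @filename = N'X:\Backups\BigWarehouseDB_RW_DIFF.bkp'
, @read_write_filegroups = 1
, @with = N'RECOVERY';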

Finally, the read-only filegroup can be restored once the database has been brought online after all backups of the read-write filegroups have been restored:
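Again a sketch; I’m assuming the @filegroup parameter works the same way on the restore side as it does for backups:

-- Bring the read-only filegroup back from its one-time backup.
EXEC master.dbo.xp_restore_database
  @database = N'BigWarehouseDB'
, @filegroup = N'ARCHIVE'
, @filename = N'X:\Backups\BigWarehouseDB_ARCHIVE_RO.bkp'
, @with = N'RECOVERY';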

Updating Tables with Read-Only Partitions

I learned an important thing today. I’ve started partitioning our data warehouse and we’ve moved into our testing phase. The partitions are in place, the archive is read-only and once a month the automated partition maintenance kicks off. It all seemed to be going swimmingly until…

A couple of times our data warehouse folks have found that an update to a record on the read-write partition would fail with the following error:

Msg 652, Level 16, State 1, Line 1 The index “PK_STG_PHX_LN_ACCT_INT_OPT” for table “dbo.STG_PHX_LN_ACCT_INT_OPT” (RowsetId 72057594415022080) resides on a read-only filegroup (“ARCHIVE”), which cannot be modified.

It was generated by this lovely bit of ETL-software-generated SQL, which looked something like this:
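-- (Reconstructed example: the column names and values here are hypothetical; the real
-- statement came from the ETL tool. The point is that ETL_VERS_END_DTM is nowhere in
-- the WHERE clause.)
UPDATE dbo.STG_PHX_LN_ACCT_INT_OPT
SET INT_OPT_CD = 'B'
WHERE STG_ROW_ID = 12345;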

The table is partitioned on a column (ETL_VERS_END_DTM) that indicates whether the record is current or not, so *all* records being updated are expected to be on one end of the partitioned table. Since the partitioning column was not included in the WHERE clause of the update statement, SQL generated an execution plan that could not rule out the update touching rows in read-only partitions of the table, and the statement failed.

The fix is simple: include the partitioning column in the WHERE clause. In our case we know that records being updated will always have a datetime value of ‘99990101 00:00:00.000’ (because consultants) and they are partitioned accordingly, so the update becomes:
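-- (Same hypothetical columns as above.) With the partitioning column in the WHERE
-- clause, SQL can eliminate the read-only partitions up front.
UPDATE dbo.STG_PHX_LN_ACCT_INT_OPT
SET INT_OPT_CD = 'B'
WHERE STG_ROW_ID = 12345
AND ETL_VERS_END_DTM = '99990101 00:00:00.000';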

And SQL is able to determine that the query will not touch a read-only partition (and since the partitioning column’s value is not being changed, there is no risk of rows moving into a read-only partition).

In a currently strange case of SQL magic (otherwise known as SQL internals): setting ETL_VERS_END_DTM to the expected value (i.e. its current value) but not including it in the WHERE clause also produces an execution plan that allows SQL to proceed and succeed, despite not eliminating the possibility that the row lives in a read-only partition before it begins. I’ll update this later with a link to any explanation that I find as I try to wrap my head around it.

UPDATE: In the simplest terms (for folks like me), the query originally ended up with a shared rowset for both the read and the update, so when it encountered the read-only partition SQL had to bail. Including the end date in the update statement prevents SQL from using a shared rowset under the covers, so when it gets to the update portion its rowset includes only rows from the read-write partitions. A link to Paul White’s post on how rowset sharing causes the error is here.

Many thanks to Aaron Bertrand (b|t) and Paul White (b|t) for helping me to understand what was going on over on Stack Exchange. Not only was I able to wrap my head around some non-intuitive internal processes but it reaffirmed why I think the SQL Family is so great.

Starting Big

I’ve long imagined that my first blog post would be about the importance and scheduling of backups (very) or maintaining SLAs (very important, and they should be reviewed at least once a year). Instead I believe I am going to eat the proverbial elephant, blogging my process for slicing and dicing my company’s shiny new data warehouse into maintainable pieces.

About a year ago our onsite consultants started to come online and build out our structure. I was only tangentially attached to the project, in that they knew they’d need some SQL Servers at some point. I started to do some research into VLDBs (while it started out “small,” the new policy was to delete nothing ever again) and quickly discovered the usefulness of partitioning (both for maintenance tasks and ETL processes). I of course forwarded on the links, and after a brief discussion was told that it could be addressed later. In my naivete this of course sounded reasonable and I continued on with the million other projects I was eyeball deep in.

Woe to you who is not sure enough to stand your ground! Nine months later I learned that I’d be dissecting their largest databases table by table to get even a reasonable maintenance plan going. After a mere nine months backups were already up to two hours and it was time to prove that a partitioned database could alleviate the inevitable backup and index maintenance apocalypse.

So that’s where this series is starting: me, my planning and then table by table through our two largest databases (P.S. I won’t be posting table by table).

[Image: cartoon of a charging elephant]

I am beginning to rethink my methodology for eating an elephant…

SQL Saturday #446 & RedGate SQL in the City

So I’m missing out on the Summit 15 keynote this morning as I was sadly paged last night. Between that and the first timers meetup and welcome reception I was completely unable to get up at an appropriate time today. While my work colleague gets ready I wanted to jot down a few impressions from the last few days.

SQL Saturday #446: Portland, OR

This was my third SQL Saturday in Portland and the event continues to be fantastic for a number of reasons. It’s right before Summit, so a number of Summit speakers present in Portland that Saturday, which means you can open up your Summit schedule if you get in on the sessions early. It’s a great networking event (I think there were around 400 SQL Family that showed up?). I attended a great session on auditing in SQL Server from Colleen Morrow (b|t). The really short and skinny: in SQL 2008 Standard you can only audit at the instance level, but in 2008 Enterprise and in SQL 2012 and above you can drill down into specific databases with your auditing. It all makes use of extended events so it’s lightweight. It’s configurable in filtering, retention and the handling of auditing errors (lose the audit info, cancel the transaction that triggered it, etc.). She also shared a plan for centralizing the collection and ingestion of the audit logs in one location for reporting purposes. Scripts and slide deck here.

SQL in the City

This was my second SQL in the City and it was even better than my previous experience: new products like ReadyRoll and InstantClone really speak to some pain points at our organization and I’m looking forward to seeing whether we can fully implement them to make it all better. Or at least mostly better. Or at least version our databases: I hear that’s a good start!

I’m super excited and it’s time to hop on a train to start hitting Summit sessions. If you found this post useful let me know!