Updating Tables with Read-Only Partitions

I learned an important thing today. I’ve started partitioning our data warehouse and we’ve moved into our testing phase. The partitions are in place, the archive is read-only and once a month the automated partition maintenance kicks off. It all seemed to be going swimmingly until…

A couple of times our data warehouse folks have found that an updated to a record on the read-write partition would fail with the following error:

Msg 652, Level 16, State 1, Line 1 The index “PK_STG_PHX_LN_ACCT_INT_OPT” for table “dbo.STG_PHX_LN_ACCT_INT_OPT” (RowsetId 72057594415022080) resides on a read-only filegroup (“ARCHIVE”), which cannot be modified.

Generated by this lovely ETL software generated SQL:

UPDATE DW_DBNAME.dbo.DW_TABLE
SET ETL_JOB_SEQ_NUM = 1027140 ,
ETL_IUD_CD = 'D' ,
ETL_UPD_DTM = CONVERT(DATETIME, '03/16/2016 01:01:17')
WHERE ETL_VERS_START_DTM = CONVERT(DATETIME, '12/27/2014 00:26:24.137')
AND ACCT_NO = '5001194157'
AND ACCT_TYPE = 'ILN';

The table is partitioned on a column  (ETL_VERS_END_DTM)  that indicates if the record is current or not so *all* records being updated are expected to be on one end of the partitioned table. Since the partitioned column was not included in the WHERE clause of the update statement SQL generated an execution plan that indicated it would not be inconceivable the update statement could affect rows in read-only partitions of the table and then failed.

The fix is simple: include the partitioning column in the where clause. In our case we know that records being updated will always have a datetime value of ‘99990101 00:00:00.000’ (because consultants) and they are partitioned accordingly.

UPDATE DW_DBNAME.dbo.DW_TABLE
SET ETL_JOB_SEQ_NUM = 1027140 ,
ETL_IUD_CD = 'D' ,
ETL_UPD_DTM = CONVERT(DATETIME, '03/16/2016 01:01:17')
WHERE ETL_VERS_START_DTM = CONVERT(DATETIME, '12/27/2014 00:26:24.137')
AND ACCT_NO = '5001194157'
AND ACCT_TYPE = 'ILN'
AND ETL_VERS_END_DTM > '2115-01-01';

And the SQL is able to determine the query will not touch a read-only partition (and since the partition column value is not being updated there is no risk of movement to a read-only partition).

In a currently strange case of SQL Magic (otherwise know as SQL internals): setting  ETL_VER_END_DTM to the expected value (i.e. it’s current value) but not including it in the where clause also produces an execution plan that allows SQL to proceed and succeed despite not eliminating the possibility it exists in the read-only partition before it begins. I’ll update this later with a link to any explanation that I find as I try to wrap my head around it.

UPDATE: In the simplest terms (for folks like me) the query originally ended up with a shared rowset for the both the read and the update when it encountered the read-only partition SQL had to bail. Including the end date in the update clause prevents SQL from using a shared rowset under the covers so when it gets to the update portion its rowset includes only rows from the read-write partitions. A link to Paul White’s post on how rowset sharing causes the error is here.

Many thanks to Aaron Bertrand (b|t) and Paul White (b|t) for helping me to understand what was going on over on Stack Exchange. Not only was I able to wrap my head around some non-intuitive internal processes but it reaffirmed why I think the SQL Family is so great.

Starting Big

I’ve long imagined that my first blog would be about the importance and scheduling of backups (very) or maintaining SLAs (very important and they should be reviewed at least once a year). Instead I believe I am going to eat the proverbial elephant blogging my process for slicing and dicing my company’s shiny new data warehouse into maintainable pieces.

About a year ago our onsite consultants started to come online and build out our structure. I was only tangentially attached to the project in that they knew they’d need some SQL servers at some point. I started to do some research into VLDBs (while it’s started out “small” the new policy was to delete nothing ever again) and quickly discovered the usefulness of partitioning (both for maintenance tasks and ETL processes. I of course forwarded on the links and after a brief discussion was told that it could be addressed later. In my naivete of course this sounded reasonable and I continued on with the million other projects I was eyeball deep in.

Woe to you who is not sure enough to stand your ground! 9 months later when I learned that I’d be dissecting their largest databases table by table to get even a reasonable maintenance plan going. After a mere 9 months backups are already up to two hours and it was time to prove that a partitioned database could alleviate the inevitable backup and index maintenance apocalypse.

So that’s where this series is starting: me, my planning and then table by table through our two largest databases (ps I won’t be posting table by table).

Cartoon of a charging elephant

I am beginning to rethink my methodology for eating an elephant…