Keeping Your Codebase Current With Your Vendor’s Changes

Yep, it’s kind of as gross as it sounds. But there’s a good chance you have implemented a very nice CI/CD pipeline for the part of a vendor database that you are allowed to customize. Unfortunately you don’t always know when the vendor might log in to apply a hotfix or patch, and you may have dependencies on objects that you don’t own. Automation to the rescue:

Previously on this site I outlined how you could audit your SQL instance using free/open source tools. I ended up leveraging that same process to run a daily check for unexpected changes in the production database and (thanks to this blog post from Jason Sider) creating a GitHub issue to ensure time gets set aside to review those changes and pull them into our local codebase.

The details are pretty simple so I’ll leave the specific implementation to you, dear reader, but in a nutshell you’ll need the following resources:

  • an automation user with access/permissions to the appropriate GitHub repos in order to create an issue,
  • an automation tool like Octopus Deploy or something similar.

The entire process is:

  1. Query for unexpected changes (whatever that means in your environment). In our case it’s any change made to the production database that wasn’t made by our automation tools. Any other changes should (at the least) be reviewed to see if they are legitimate DDL.
  2. If unexpected changes exist for the last $timeFrame, bundle them together into the issue description and submit it to GitHub.
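The two steps above can be sketched in miniature. Everything here (field names, the label, the default time frame) is illustrative rather than lifted from our actual implementation; the actual submission is a `POST` to GitHub's REST API issues endpoint (`POST /repos/{owner}/{repo}/issues`) authenticated with the automation user's token, shown only as a comment:

```python
import json
from datetime import datetime, timedelta, timezone

def build_issue_payload(changes, time_frame_hours=24):
    """Bundle unexpected DDL changes into a single GitHub issue payload.

    `changes` is a list of dicts like
    {"object": "dbo.SomeTable", "event": "ALTER_TABLE", "login": "vendor_user"}
    (hypothetical shape for illustration).
    """
    since = datetime.now(timezone.utc) - timedelta(hours=time_frame_hours)
    lines = [f"Unexpected schema changes since {since:%Y-%m-%d %H:%M} UTC:", ""]
    for c in changes:
        lines.append(f"- `{c['object']}`: {c['event']} by {c['login']}")
    return {
        "title": f"Review {len(changes)} unexpected schema change(s)",
        "body": "\n".join(lines),
        "labels": ["schema-drift"],
    }

# The payload would then be POSTed with the automation user's token:
#   POST https://api.github.com/repos/<org>/<repo>/issues
#   Authorization: token <automation-user-PAT>
payload = build_issue_payload(
    [{"object": "dbo.Invoice", "event": "ALTER_TABLE", "login": "vendor_admin"}]
)
print(json.dumps(payload, indent=2))
```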

If you get fancy (or are already fancy) this may trigger other useful workflows. Our repos are also tied into our Kanban board, and a GitHub issue will automatically create a work item for the DBA team, so it really raises the visibility of what goes into maintaining the vendor side of the codebase and keeping internal projects in sync with those changes.

Please note: you should have a general-use git login that can be used by automation tools. This prevents processes that depend on a user access token from breaking when that user leaves the organization and their access to the repos goes away. Generate access tokens for the automation user and store those secrets in a secure, shared location for anyone working on automation (along with the login credentials etc. for this shared user).

Adding SQL Docs to Jekyll

Back in August I talked about how our organization was using Jekyll to publish our internal docs. This fall I took it a step further and worked out how to publish content generated by Redgate’s SQL Doc. SQL Doc is a handy tool that allows you to easily add metadata to your database. It’s great: you essentially comment every table, column, view, schema etc. and then commit those changes to the code repository itself. Of course if you want to share that documentation with someone who doesn’t have access to the source code (or maybe doesn’t want to flip through SQL files looking for metadata updates even if they did), it needs to be extracted and made usable somewhere.

SQL Doc has several report generation options, and since we were using markdown for the Jekyll site that seemed a good choice. The problem in this case is that the export wasn’t really built with the idea that I might want to insert it into another site: it simply takes all the objects from the instance you point it at and creates an entire site/directory structure around them. So we need to (1) strip out all the directory references and other code we won’t be using and (2) insert any code needed by the Jekyll site to make it useful. The following PowerShell script mostly does that job (I am sure there are still improvements to be made): just point it at a fresh export of the documentation and it will delete the existing files and regenerate the docs for inclusion. (One note: it is pretty site specific in that it assumes the site is built on the Documentation theme.)
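For reference, the front matter the script stitches onto the top of each page looks roughly like this (the database name, permalink, and summary below are invented for illustration; index pages additionally get a data_dictionary tag):

```yaml
---
title: "MyDatabase - customers - Documentation"
tags: [MyDatabase]
sidebar: db_sidebar
permalink: mydatabase_tables_customers.html
last_updated: Oct 21, 2020
summary: Documentation for customers in MyDatabase.
---
```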

<#
Author: Josh Smith
Created: 2020-09-03
This script will parse through SQL Doc generated markdown files to create Jekyll-friendly
db documentation.
Should be run from the root directory of the documentation repository.
#>
function createHeader { param ($subject, $link, $page)
    $title = "$subject - $page - Documentation"
    $summary = "Documentation for " + $page.ToLower() + " in $subject."
    $header = "---
title: ""$title""
tags: ["
    if ($page.IndexOf("index.html") -gt -1){
        $header = $header + "data_dictionary, "
    }
    $header = $header + "$subject]
sidebar: db_sidebar
permalink: $link
last_updated: Oct 21, 2020
summary: $summary
---
"
    return $header
}
function removeImageFolder { param ($imagelineIn)
    $imgPos = $imagelineIn.IndexOf('Images') + 6
    $imageLine = $imagelineIn.Substring(0, $imagelineIn.IndexOf('(..') + 1) + '/images/dbmodel'
    if ($imageLineIn.IndexOf('Images', $imgPos) -gt -1){
        $imageLine = $imageLine + (removeImageFolder $imageLineIn.Substring($imgPos))
    }
    else {
        $imageLine = $imageLine + $imageLineIn.Substring($imgPos, $imageLineIn.IndexOf(".", $imgPos) - $imgPos).ToLower()
        if ($imageLineIn.IndexOf('Images', $imgPos + 1) -lt 1){
            $imageLine = $imageLine + $imageLineIn.Substring($imageLineIn.IndexOf(".", $imgPos))
        }
    }
    if ($imageLine.IndexOf("Index.png") -gt 0){ $imageLine = $imageLine -replace "Index.png", "index.png" }
    return $imageLine
}
function removeAnchor {param ($anchorlineIn)
    $anchorLine = $anchorlineIn.Substring(0, $anchorlineIn.IndexOf("<a name="))
    $anchorLine = $anchorLine + $anchorlineIn.Substring($anchorlineIn.IndexOf("a>") + 2, $anchorlineIn.Length - ($anchorlineIn.IndexOf("a>") + 2))
    return $anchorLine
}
function insertLink {param ($linklineIn, $linkfolders)
    $newLink = ''
    foreach($f in $linkfolders){ $newLink = $newLink + $f + '/' }
    $newLink = $newLink + $linkLineIn.Substring($linklineIn.LastIndexOf('(') + 1, $linklineIn.LastIndexOf('.md') - ($linklineIn.LastIndexOf('(') + 1))
    $newLink = $newLink -replace '/', '_'
    $linkLine = $linklineIn.Substring(0, $linklineIn.LastIndexOf('(') + 1) + $newLink + '.html)'
    return $linkLine.ToLower()
}
function processMDFile {param ($content, $database, $processFolderList, $fileName, $filler, $workingFolder)
    $mdOutput = ''
    $newLine = ''
    $firstLine = $true
    $prevBlank = $true
    foreach($l in $content){
        if ($l.IndexOf('####') -lt 0 -or $l.IndexOf('[Database]') -lt 0 -or $l.IndexOf('---') -ne 0){
            if ($l.Trim().Length -eq 0 -and $prevBlank -eq $true){
                # do nothing
            }
            elseif ($l.Trim().Length -eq 0){
                $newLine = $l.Trim()
            }
            else {
                $prevBlank = $false
                $firstLine = $false
                $newLine = $l -replace $filler, ''
                $newLine = processText $newLine
                if ($newLine.IndexOf(".md") -ge 0) {
                    if ($newLine.Substring(0,9) -eq '[Project]'){
                        $newLine = projectNavigation $newLine $database $processFolderList
                    }
                    else {
                        if ($newLine.IndexOf("|") -eq 0){
                            $cells = $newLine.Split("|")
                            $newLine = ""
                            foreach($c in $cells){
                                $tempText = $c
                                if ($tempText.IndexOf(".md") -gt 0) {$tempText = insertLink $tempText $processFolderList}
                                if ($c.Length -gt 0) {$newLine = $newLine + "| $tempText "}
                            }
                            $newLine = "$newLine |"
                        }
                        else {
                            $newLine = insertLink $newLine $processFolderList
                        }
                    }
                }
            }
        }
        if ($firstLine -eq $false -and $prevBlank -eq $false){
            if ($newLine.Length -eq 0){$prevBlank = $true}
            $mdOutput = $mdOutput + $newLine + "`r`n"
        }
    }
    return $mdOutput
}
function projectNavigation{param ($projectLine, $projectDB, $sourceFolders)
    $newPL = "> "
    $p1 = $projectLine.IndexOf($projectDB)
    do {
        if ($p1 -gt -1){
            $p2 = $projectLine.IndexOf("]", $p1)
        }
        # no link exists
        if ($p1 -lt 0 ){
            $p2 = $projectLine.LastIndexOf(')')
            $newPL = $newPL + $projectLine.Substring($p2 + 3)
            $p2 = -1
        }
        if ($p2 -gt -1){ $text = ($projectLine.Substring($p1 - 1, $p2 - ($p1 - 2))).Trim() }
        else { $text = ($projectLine.Substring($p1)).Trim() }
        if ($text.Substring($text.Length - 1, 1) -eq ']'){
            # potential link: parse and replace
            $text = $text.Substring(1) -replace ']', ''
            $folderPosition = $sourceFolders.IndexOf(($text -replace " ", "_"))
            $i = 0
            $fPath = ''
            do {
                if ($i -gt 0){ $fPath = $fPath + '\' + $sourceFolders[$i] }
                $i = $i + 1
            } while ($i -le $folderPosition)
            $fileName = $projectLine.Substring($p2 + 2, $projectLine.IndexOf(")", $p2) - ($p2 + 2)) -replace '../', ''
            $tPath = $startFolder.ToLower() + $fPath + '\' + $fileName.ToLower()
            if ((Test-Path $tPath) -eq $true){
                $URL = $projectDB.ToLower() + ($fPath.ToLower() -replace '\\', '_') + '_' + ($fileName.ToLower() -replace ".md", ".html")
                $newPL = $newPL + "[$text]($URL) > "
            }
            else {
                $tPath = $startFolder + $fPath + "\" + $sourceFolders[$i-1] + ".md"
                if ((Test-Path $tPath) -eq $true){
                    $URL = $projectDB.ToLower() + "_" + ($fPath.ToLower() -replace '\\', '_') + '_' + ($sourceFolders[$i - 1].ToLower() -replace ".md", "") + ".html"
                    $newPL = $newPL + "[$text]($URL) > "
                }
                else { $newPL = $newPL + $text + " > " }
            }
        }
        else{ $newPL = "$newPL $text" }
        if ($p2 -gt -1){ $p1 = $projectLine.IndexOf("[", $p2) }
    } while ($p2 -gt -1)
    return $newPL
}
function processText {param($textInput)
    if ($textInput.IndexOf("Images") -ge 0){ $textInput = removeImageFolder $textInput }
    if ($textInput.IndexOf("<a name=""#") -ge 0){ $textInput = removeAnchor $textInput }
    $textInput = $textInput -replace "#foreignkeys", "#foreign-keys" # fix foreign-key links
    $textInput = $textInput -replace "_Dev", "" # remove _dev appended to db names
    return $textInput
}
$databaseName = '' # the known name of the database
$sourceFolder = 'C:\' # path to the root of the SQL Doc export folder
$imageSource = $sourceFolder + 'Images'
$imageDestination = 'images\dbmodel'
$destination = 'pages\dbs\models\' + $databaseName
$fillString = 'JS_' # our development dbs have our initials appended to the front of the db name: we want to strip this out
# move image files
Get-ChildItem -Path $imageSource | Copy-Item -Destination $imageDestination -Recurse -Force
$files = Get-ChildItem $imageDestination
# jekyll is running in linux so case matters: lowercase all image file names
foreach($f in $files){
    if ($f.Name -cne $f.Name.ToLower()){
        Rename-Item $f.FullName $f.Name.ToLower()
    }
}
#find root directory for just the database:
$startFolder = (Get-ChildItem -Path $sourceFolder -filter 'user_databases' -Recurse).FullName
$startFolder = (Get-ChildItem $startFolder | Where-Object {$_.Mode -eq 'd----'}).FullName
# get the files we will reformat
$files = Get-ChildItem -Path $startFolder -Recurse -Filter *.md
# clear out old files in case we are deleting pages from doc:
Get-ChildItem -Path $destination -Filter *.md | Remove-Item
# pull apart and rebuild files!
foreach ($f in $files){
    $workingFile = $f.FullName
    $file = $f.Name
    $type = ($f.Directory).Name
    $folderList = $workingFile.Substring($workingFile.ToUpper().LastIndexOf($databaseName.ToUpper()), $workingFile.Length - $workingFile.ToUpper().LastIndexOf($databaseName.ToUpper()))
    $folderList = $folderList -replace '.md', ''
    $fileName = $folderList.Substring($folderList.LastIndexOf('\') + 1).ToLower()
    $folderList = $folderList.Substring(0, $folderList.LastIndexOf('\'))
    $folderList = $folderList -replace '\\', ','
    $folderList = $folderList -replace '_Dev', '' # account for possible _dev appended to db along with initials
    $folderCollection = $folderList.Split(",")
    $content = $null
    $result = $null
    # pull the content out of the working file:
    $content = Get-Content -Path $workingFile
    # process the content
    $result = processMDFile $content $databaseName $folderCollection $fileName $fillString $startFolder
    # ensure correct formatting of file names for linux
    $link = ($folderList.ToLower() -replace ",", "_") + "_$fileName.html"
    # set file 'topic'
    if ($fileName -eq "index"){ $topic = $folderCollection[$folderCollection.Length - 1] }
    else { $topic = $fileName }
    # format the header for the new doc
    $curHeader = createHeader $databaseName $link $topic
    $result = $curHeader + $result
    if ((Test-Path $destination) -eq $false){ New-Item -ItemType Directory -Force -Path $destination }
    $outPath = $destination + "\" + ($link -replace ".html", ".md")
    Set-Content -Path $outPath -Value $result -Force
}
#todo: update db_sidebar with link to documentation home
$sidebarOut = Get-Content _data/sidebars/db_sidebar.yml -Raw
$dbmodelList = Get-ChildItem pages/dbs/models/
$sidebarOut = $sidebarOut.Substring(0, $sidebarOut.LastIndexOf('folderitems:') + 12) + "`r`n`r`n"
foreach($i in $dbmodelList){
    $dbName = $i.Name
    $fileName = $i.Name.ToLower()
    $newEntry = "  - title: $dbName
    url: /$fileName" + "_index.html
    output: web, pdf`r`n"
    $sidebarOut = $sidebarOut + $newEntry
}
$sidebarOut = $sidebarOut.Substring(0, $sidebarOut.Length - 1)
Set-Content -Path _data/sidebars/db_sidebar.yml -Value $sidebarOut -Force
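Each pass through that loop appends a sidebar entry shaped roughly like this (the database name is invented for illustration):

```yaml
  - title: MyDatabase
    url: /mydatabase_index.html
    output: web, pdf
```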

Auditing Your SQL Instance on the Cheap

I had some great questions after my SQL 101 session at PASS Summit last week about auditing access and actions on SQL Server. I’ve recovered enough that I can now coherently share what I’ve set up for some free1 rudimentary auditing of SQL Server.

We use Redgate’s SQL Monitor to keep tabs on our SQL environment. We started initially with this extended events session to detect database changes for a monitoring alert. We quickly realized that we probably wanted a record of what triggered the alert and made a few changes. Instead of just letting SQL Monitor query the extended events session we deployed tables to our local helper db to store any detected changes. We then updated the SQL Monitor alert to check the table itself rather than the extended events sessions directly for recent events. In addition to having a record of the changes stored for research (we clean up the records after a month or so) it also gave us the chance to tune them to be less noisy. For example we don’t particularly need to be alerted when our deployment process makes changes: we have a record of it if we need to determine what happened but generally we expect those changes and don’t need to hear about them (at least not from the monitoring process).

Once we’d done that it seemed natural to make the same modifications to a similar monitoring alert for security changes. Of course you don’t have to run a monitoring tool to use these (although it does generally make it easier). A local SQL agent job could be run to check the table and send email alerts etc. The other question that comes along with security events is “Who is supposed to have access?” So lastly we implemented a once a day poll of the instance and database principals that could serve as a point in time reference of who had access.

/* We collect the results daily and haven't been too concerned
about lost history if a particular instance fails occasionally */
CREATE TABLE ##userQuery (
serverName VARCHAR(255)
, databaseName VARCHAR(255)
, databaseRoleName VARCHAR(255)
, memberName VARCHAR(255)
, loginType VARCHAR(50)
, status VARCHAR(50)
);
DECLARE @theTime DATETIME = GETDATE();
/* dynamic SQL for collecting DB permissions for each instance using Aaron Bertrand's sp_foreachdb
but you'd be better off incorporating the new procedure: */
EXEC master.dbo.sp_foreachdb @command = N'USE [?];
INSERT INTO ##userQuery (serverName
, databaseName
, databaseRoleName
, memberName
, loginType
, status)
SELECT @@SERVERNAME AS serverName
, DB_NAME() AS databaseName
, AS DatabaseRoleName
, ISNULL(, ''No members'') AS MemberName
, CASE DP2.type
WHEN ''U'' THEN ''WinLogin''
WHEN ''S'' THEN ''SQLLogin''
WHEN ''G'' THEN ''WinGroupLogin''
END AS LoginType
, CASE WHEN IS NULL THEN ''Orphan'' ELSE ''N/A'' END AS [status]
FROM [?].sys.database_role_members AS DRM
RIGHT OUTER JOIN [?].sys.database_principals AS DP1 ON DRM.role_principal_id = DP1.principal_id
LEFT OUTER JOIN [?].sys.database_principals AS DP2 ON DRM.member_principal_id = DP2.principal_id
LEFT OUTER JOIN master.sys.syslogins l ON l.sid = DP2.sid
WHERE DP1.type = ''R''
AND <> ''dbo''
AND DP2.type IN (''G'', ''U'', ''S'')'
, @exclude_list = N'tempdb,model'
, @print_dbname = 1;
/* union all the db permissions with the instance permissions
and stuff it into the local table */
INSERT INTO dbo.secPoll (serverName
, databaseName
, databaseRoleName
, memberName
, loginType
, status
, dateDetected)
SELECT serverName
, databaseName
, databaseRoleName
, memberName
, loginType
, status
, @theTime AS dateDetected
FROM ##userQuery
WHERE memberName NOT LIKE '##%'
UNION ALL
SELECT @@SERVERNAME AS serverName
, 'N/A' AS databaseName
, CASE WHEN b.sysadmin = 1 THEN 'sysadmin'
WHEN b.securityadmin = 1 THEN 'securityadmin'
WHEN b.serveradmin = 1 THEN 'serveradmin'
WHEN b.setupadmin = 1 THEN 'setupadmin'
WHEN b.processadmin = 1 THEN 'processadmin'
WHEN b.diskadmin = 1 THEN 'diskadmin'
WHEN b.dbcreator = 1 THEN 'dbcreator'
WHEN b.bulkadmin = 1 THEN 'bulkadmin'
END AS ServerRole
, AS MemberName
, CASE WHEN a.type_desc = 'SQL_LOGIN' THEN 'SQLLogin'
ELSE 'WinLogin'
END AS LoginType
, CASE WHEN a.is_disabled = 1 THEN 'Disabled'
WHEN a.is_disabled = 0 THEN 'Enabled'
WHEN a.is_disabled IS NULL THEN 'Unknown'
END AS Status
, @theTime AS dateDetected
FROM sys.server_principals AS a
JOIN master.dbo.syslogins AS b ON a.sid = b.sid
ORDER BY serverName
, memberName;
DROP TABLE ##userQuery;

The natural conclusion of gathering metrics is to drive informed action and decision making, so we started gathering up all the drift and security events via SSIS and plunking them in our central database so we could make them easily available for review by others. If we start to see an excessive amount of failed logins on a SQL instance we notify our info sec team. They can then review the security audit report for that SQL instance to get a sense of what is happening and whether they need to take any action (or whether we need to tune our alert better). Similarly when they get requests to access data sources they can research who else has been granted permissions to the same instance or database, as well as check what other data sources the user might have access to.
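As a sketch of the kind of review this enables (the instance names, event shape, and threshold below are all invented for illustration; our real alerting lives in the monitoring tool), flagging instances with an excessive failed-login count is just a group-and-filter:

```python
from collections import Counter

def instances_to_escalate(failed_logins, threshold=50):
    """Given centrally collected (instance, login) failed-login events,
    return instances whose failure count exceeds the alert threshold."""
    counts = Counter(event["instance"] for event in failed_logins)
    return sorted(name for name, n in counts.items() if n > threshold)

# 60 failures on one instance, 3 on another: only the first crosses 50.
events = [{"instance": "SQLPROD01", "login": "app_user"}] * 60 \
       + [{"instance": "SQLPROD02", "login": "svc_etl"}] * 3
print(instances_to_escalate(events))  # ['SQLPROD01']
```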

This is all a good way to implement something with no budget, but if auditing (or security auditing specifically) is important to your organization then the next step is to start researching more robust tooling!

1. You do get what you pay for.

Centralize Your Documentation to Learn Git

Source control has taken over, and the recommendation is that pretty much anyone in IT should probably be using it. Beyond application development, any time you have text documents that you need shared editing access to, source control is something you might want to consider. Git is a widely used source control system and it is… difficult, which is why it’s not too hard to find content like this:

And for some it will always be somewhat frustrating (and that’s ok). But even at its most frustrating, for me it’s worth it because of how easy it is to collaborate with my team on code and files. This winter I found myself championing the use of git to help centralize our documentation. We’d been using a mix of SharePoint directories, pages and file shares to keep track of various SLAs, environment documentation and kbs about the applications we support and maintain. It was a mess. At the same time we’d also been sharing more PowerShell scripts around to get work done. It was time to start using source control to keep everyone up to date. Adoption was not great though: folks would grab the files and then never pull to get updates, or they’d paste changes into email or chat to share with specific people when they found they needed them.

So we took a slightly different tack: we’d move all our team documentation into a git repository and use it to publish an internal website, making it accessible to those in the organization who just needed the information. Jekyll is a Ruby library (? I’m still not a web developer; thankfully I just can’t hack it) that takes a collection of (mostly) markdown files and generates a static website. Within 24 hours we had an initial repository that anyone on the team could pull and add their information to. For the information to be useful (i.e. published) it had to be pushed to the repository, reviewed and merged into the main branch. Adoption started off relatively slow but as new documentation was required or older docs needed updating it got used.

There were the inevitable learning pains, and a few of us had to muddle our way through dumb merge conflicts. As git champion I borked the repo at least once, but it was getting used by the entire team and each bump in the road made the rest of the journey that much smoother. So now not only have we seen better engagement with our team code repositories for sharing PowerShell and the like, but 95% of our documentation has been migrated from the various nooks and crannies where it was hidden to a central location where it is easily found by anyone in the org. If you’re looking at a scattered or disorganized documentation system and/or a team that needs to be introduced to source control, you could do worse than to start by tackling both at the same time.

Just remember: commit and push often!

Backing Up Cosmos DB (Azure Data Factory Edition)

Now that ADF is a trusted service I wanted to document the state of my current solution since I’ve been able to dump the hack-y PowerShell script I put together. I haven’t been able to get the level of abstraction I’d really like to see but overall I think I’m pretty happy with the solution (and I still get to include a hack-y PowerShell script). My solution consists of

  • a control pipeline,
  • a notification pipeline and
  • 1 pipeline for every Cosmos DB service I want to work with. (This is because I wasn’t able to figure out a way to abstract the data source connection for the Copy Data task.)

We’ll start at the backup pipeline and then back out to the larger solution:

Screen shot of the CosmosDB pipeline tasks

The backup pipeline is relatively simple: I set a date string that ends up being part of the backup filename, then use the Copy Data task to pull the contents of the Cosmos DB container into blob storage. We then perform cleanup of the backup files (we are keeping 2 weeks of backups) and clean up the logs generated by the delete step. If either of the two steps fail (backup or backup cleanup) we use an Execute Pipeline task to kick off a notification pipeline. (ADF doesn’t natively send emails but it was very easy to leverage a logic app to send any necessary notifications: see this mssqltips article for details on how to set that up.) This pipeline is parameterized with the name of the Cosmos DB service, database name and container name. These are used to create the folder structure for the backup in blob storage (formatted as [service name][database name][container name]_[YYYYMMDD_HHMM].json). This was where I couldn’t seem to abstract away the Cosmos DB service itself: I have 1 pipeline per known Cosmos DB service. Any time I need to add a new service I define a single data source and clone an existing pipeline to point at the new data source.

Each backup pipeline is called by my Dynamic Backup control pipeline which has two tasks:

The screen shot of tasks from the dynamic backup control pipeline

Lookup Backuplist grabs a json formatted list of Cosmos DB services, along with any associated databases and containers, from a blob storage data source. This is where I had to return to a hack-y PowerShell script. There isn’t a handy way for ADF to get a token to talk to Cosmos DB services and query for a list of databases or containers. Sure, I could try to do it via REST calls and the Webhooks action, but that was going to be a huge headache, plus our Cosmos DB is already IP restricted so I wasn’t 100% sure if things would fail because I had written them terribly or if the communication just wasn’t going to happen. So I now have a PowerShell script that I run manually every so often that will pick up new databases and containers deployed to known services since the last time I checked and add them to the list. This was mostly just a bit of CYA: when I went to deploy the ADF backups I discovered there was a new container in one of the databases that no one had thought to notify me of. While I’m not necessarily in the loop for new databases or containers on existing Cosmos DB services, it seems a safe bet that I’ll be informed of any new Cosmos DB services themselves. Here is the PS script:

<#
Author: Josh Smith
Created: 2020-03-02
Purpose: Will pull all databases and containers from known CosmosDB resources.
Cosmos db services must be known and added to the array below (paired with the
associated resource group).
#>
<# add cosmos db resource and groups here as needed: #>
$cosmosDBServices = @( @{resource = '<cosmos db service 1>'; group = '<resource group name>'}, `
@{resource = '<cosmos db service 2>'; group = '<resource group name>'})
$connectAz = Read-Host "Did you already connect to Azure [Y/N]?"
# Sadly the need to authenticate with Azure means this script needs to be run manually.
if ($connectAz -eq 'N'){Connect-AzAccount -subscriptionName "<name of your Azure subscription>"}
$dbResourceType = "Microsoft.DocumentDb/databaseAccounts/apis/databases"
$containerResourceType = "Microsoft.DocumentDb/databaseAccounts/apis/databases/containers"
$BackupStorageName = "<Azure storage account name>"
$StorageResourceGroupName = "<resource group for azure storage>"
$apiVersion = "2015-04-08"
$finalJson = ""
foreach($c in $cosmosDBServices)
{
    $ResourceName = $c.resource + "/sql/"
    $resourceGroup = $
    $databases = Get-AzResource -ResourceType $dbResourceType -ApiVersion $apiVersion `
        -ResourceGroupName $resourceGroup -Name $ResourceName | `
        Select-Object -ExpandProperty Properties | Select-Object id
    foreach($d in $databases)
    {
        $databaseName = $
        $db = $ResourceName + $databaseName
        $containers = Get-AzResource -ResourceType $containerResourceType `
            -ApiVersion $apiVersion -ResourceGroupName $ResourceGroup `
            -Name $db | Select-Object -ExpandProperty Properties | Select-Object id
        foreach($con in $containers)
        {
            $container = $
            $con = $null
            $json = $null
            $conObject = New-Object -TypeName psobject
            $conObject | Add-Member -MemberType NoteProperty -Name ContainerName -Value $container
            $conObject | Add-Member -MemberType NoteProperty -Name DatabaseName -Value $databaseName
            $conObject | Add-Member -MemberType NoteProperty -Name ResourceName -Value $ResourceName.Substring(0, $ResourceName.Length - 5)
            $json = $conObject | ConvertTo-Json
            $finalJson = $finalJson + $json + ",
" # I'm a sucker for lazy line breaks
        }
    }
}
# lose the last line break and comma then output:
$finalJson = $finalJson.Substring(0, $finalJson.Length - 2)
$finalJson | Out-File -FilePath '.\CDBBackup.json' -Force
# connect to the blob storage and then push the file up:
$storageAccount = Get-AzStorageAccount -ResourceGroupName $StorageResourceGroupName `
-Name $BackupStorageName
$storageContext = $storageAccount.Context
Set-AzStorageBlobContent -File '.\CDBBackup.json' -Container 'backuplist' `
-Blob 'CDBBackups.json' -Context $storageContext -Force
# don't need that file hanging around here:
Remove-Item .\CDBBackup.json

Which will give me output like this:

{
    "ContainerName" : "myFirstContainer",
    "DatabaseName" : "myFirstDatabase",
    "ResourceName" : "cosmos-db-service-1"
},
{
    "ContainerName" : "mySecondContainer",
    "DatabaseName" : "myFirstDatabase",
    "ResourceName" : "cosmos-db-service-1"
},
{
    "ContainerName" : "myFirstContainerMarkII",
    "DatabaseName" : "mySecondDatabase",
    "ResourceName" : "cosmos-db-service-2"
}

A screenshot of a single Switch task

For each returned item in the backup list I simply run it through a Switch task based off the name of the Cosmos DB service from my collection. If none of the names match we simply don’t do anything with it (the default case). Once I find a service that is “known” (i.e. I have a data source for it) I can call the pipeline I deployed for that service via an Execute Pipeline task and pass the database name and container in as parameters (along with the service name for consistency’s sake, although it could be hard coded into the pipeline; I am certain I’ll find a way to make this more dynamic in the future).
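The Switch logic amounts to a lookup with a no-op default. A rough analogue (the service list and "pipeline" stand-ins are invented; in ADF the match arms are Execute Pipeline tasks, not functions):

```python
def route_backup(item, known_services):
    """Mimic the ADF Switch: match on the service name and invoke the
    matching service's backup pipeline; unknown services fall through
    to the default case (do nothing)."""
    service = item["ResourceName"]
    pipeline = known_services.get(service)  # one pipeline per known service
    if pipeline is None:
        return None                         # default case: no-op
    return pipeline(item["DatabaseName"], item["ContainerName"], service)

# Stand-in "pipeline" that just records what it was asked to back up.
known = {"cosmos-db-service-1":
         lambda db, con, svc: f"backup {svc}/{db}/{con}"}

print(route_backup({"ResourceName": "cosmos-db-service-1",
                    "DatabaseName": "myFirstDatabase",
                    "ContainerName": "myFirstContainer"}, known))
print(route_backup({"ResourceName": "unknown-service",
                    "DatabaseName": "x", "ContainerName": "y"}, known))
```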

So that’s where we are at: ideally either ADF will get some built-in methods to pull back metadata about the service (webhooks and REST really sounds like a problem that only needs to be solved once) or I’ll find a way to run that PowerShell script in an automated fashion to populate the list.

Also a very appreciative shout out to Cathrine Wilhelmsen (b|t): without her Christmastime ADF blog marathon I would have spent much longer fumbling around than I ultimately did.

DB Deployments: Using A Custom Filter

So I’ve been rolling out the deployment pipelines as I was able, and I ran into an issue with our third one: we have a helper database that lives on all of our SQL instances, and while we’d like to eventually keep any customization in source control, we only want the dbo schema deployed automatically via the pipeline. The Redgate “Deploy From Package” step has a handy option to supply an alternative filter for the SQL Compare engine, so it seemed really straightforward: I fired up SQL Compare, generated a unique filter file that I named DeployFilter.scpf and threw that file name into the job step!


Unfortunately I got stuck here because this does not work. The nuget packages don’t end up in the working directory for the deployment and I couldn’t find any documentation about where they might end up. I ran a few deployments so I could watch the working folder as the build ran to see what showed up, with no luck. I looked for articles etc. that might have mentioned configuring the path (to no avail). Eventually I hit up the Redgate channel on the SQL Slack. Alex Yates (b|t) had the answer in his head all along: the previous step (self-deploy package) unpacks the nuget package on the tentacle and I have to get it to tell me where the package is (and then append the path to the filter within the package itself). This is all a very long way of saying that if you want to use a custom filter (a) it should be in source control with the rest of your code and (b) the variable is #{Octopus.Action[<name of package deployment step>].Output.Package.InstallationDirectoryPath}\<path to filter file in nuget package>. And I am putting it here for the next person who finds themselves googling how to make it work (which may still be me: my buffer is notoriously overworked).

DB Deployments Part 3: Time For A Pipeline!

All right: previously I configured TeamCity, created templates in the DBA Project directory that will be used by any child projects, and then set up Octopus Deploy. Today I’ll talk a little about deploying an actual project in TeamCity. Navigating to the page for the DBA Projects folder I can find the list of subprojects, which includes the handy “Create subproject” button. The project can be built out manually, but as I have added my GitHub information I can also simply browse my organization’s repositories to immediately link the TeamCity project to the appropriate source code. All I need to do after TeamCity verifies the repository is name the first build configuration. The build will inherit the build steps from the template I configured previously. The first settings I want to enter are the specific VCS settings. In the sidebar for the project there is a link for VCS Roots; it indicates there is 1 VCS root configuration. I navigate to that tab and click Edit on the root that is there. TeamCity automatically appended #refs/heads/master to the VCS name, and being the tidy person I am I remove that and regenerate the VCS root ID so they match. We’re going to edit a few other settings as well:

  • I want to ensure the default branch is “refs/heads/develop”
  • I also add “+:refs/heads/*” to the branch specification. This filters the branches the VCS root will see to those matching the pattern; in this case any branch that starts with “refs/heads/”

Lastly, it pulled in my GitHub account details to access the repository, but that should be changed to a dedicated GitHub account used solely by TeamCity. Depending on how your organization is set up there may be a little extra work for this step. Our company enforces two factor authentication, which means for any automated account to access repos independently we need to use ssh rather than https to connect to the repo. It also means we need to create an ssh key. This article is a good walkthrough of setting up an ssh key and adding that key to the repository (if you want to go that route), although we simply added the ssh key to the account for our TeamCity user so it’s valid across all the projects it will access. The Fetch URL needs to be updated to the ssh URL. Under Authentication Settings the method should be changed from Password to one of the key options. In our setup we uploaded the key to the TeamCity server because for one reason or another (we never really were able to figure out why) I was unable to use the Custom Private Key option. I added the key to the DBA projects folder so that it could be used in all subprojects. Once these settings are configured I can test them with the “Test connection” button and save if successful.

The template the dev build is based on is configured to be the development build and requires no changes, so I can start adding builds based on the templates and configuring them as needed. First up is the master build: on the project home page I click “Create build configuration.” This time I am going to select the option to do it manually. I give it a valid name and then select the appropriate template (in this case DBA Build Template) in the drop-down box for “Based on Template.” Once created we need to add a VCS root for the build: under the Version Control Settings we click “Attach VCS root” and select the existing VCS root used in the default dev build. Next we move down to Triggers and update the trigger settings that were inherited. The only change to the trigger for this build is to change the branch filter to “+:master,” which will filter out any branches except the master branch. The last bit is to ensure the build steps are customized as needed. Thankfully nothing actually needs to change except our deploy step: remove the Deployment Environment(s) value and set the Release Channel value to Master (this will be configured on the Octopus Deploy side).

To set up a build for pull requests I need to make a few extra changes. I manually create one more build configuration and name it [database name] Merge Build. This time I use the Merge Build Template I created. I need to attach a VCS root, but in this case I am going to end up creating a duplicate: I start by selecting the same VCS root and then make some changes. I leave the default branch alone but I change the branch specification to “+:refs/pull/(*/merge)”; this filters out everything except the temporary branches created by pull requests in Github. At the bottom near the save button Team City has a helpful warning that I am changing an object used by more than one build configuration and gives me the option to save it for just this build configuration (creating a copy of the VCS root). I update the name of the VCS root (appending “[Merge Build]”), regenerate the VCS root ID and save. I update the branch filter on the trigger to “+:*, -:<default>”. (If I don’t add -:<default> the merge builds also seem to run when I push changes to my default branch—develop.) Lastly, I don’t want pull requests to trigger deployment of code, so I disable steps 4 & 5 in the build steps to ensure we ignore Octopus Deploy for this build.

And that’s my current setup. I’ll gladly tell you that the process was a bit frustrating initially, but with some help (so much thanks to Hamish Watson) I was able to get my head wrapped around it. As much as I knew that it could shrink the turnaround time from development to deployment, I still found myself surprised at how quickly I was able to turn around fixes we discovered we needed the week after I had this set up. I am pretty sure there are things that I could have done better, but I definitely have the time to focus on that now. If you’re an Ops person on the fence about getting started with deployment automation, I hope this walkthrough was helpful!

DB Deployments Part 2: Configuring Octopus Deploy

Where Team City will test and validate your code, Octopus Deploy will manage where that code is deployed. Again there are SQL Change Automation tools we’ll be using to ensure our database code is deployed consistently across our environments. You’ll need to ensure you have a service account (or MSA/gMSA eventually) for the application as well as a home for the SQL database it needs. Download the install media and deploy it to your server. We have Octopus deployed side by side with Team City (just on different ports).

Once installed, under Configuration you can add users and groups to roles as needed. We’ve created groups for the DBA and BI teams so they can log in and see the status of their deployments, and added folks to the Administrators group as appropriate. (A quick note: while there is a small-teams license for Octopus, all users in the app are essentially admins at that tier. You’ll very likely want to purchase full licensing to take advantage of role-based permissions.)

Screen shot of the Octopus Deploy Infrastructure page. It shows 4 environments and 1 deployment target.

A quick peek at our Octopus Deploy Infrastructure

The next piece that needs configuring is Infrastructure. We need to define agents that will do deployments (called deployment targets, or tentacles) and define deployment environments. Again, we don’t expect our deploy agents to be very busy to start with, so we have a single agent that also runs on the app server (just like Team City). Clicking Add Deployment Target, Octopus will walk you through installing the agent on a new server: it was pretty painless. At this time we’ll also create a named role for the tentacle that indicates what it might be used for (this is useful if your deployment process spans different servers and you need more than one agent to deploy your code). We only ever deploy to SQL Servers, so again a single agent and role is sufficient for now. Once we have a target we can define environments and assign a target that will handle work in each environment. If your environments are highly segmented you may need a target (not on the app server) that is on the right network with the right connection rights (to whichever environment) and that the app server can communicate with to do deployments. I have 4 environments: DBA Dev, DBA Test, DBA Stage and DBA Prod, which match up with the deployments I will want Octopus to do.

Screenshot of the Octopus Deploy Variable Sets menu

Octopus Deploy Variable Set: DBA Deployment Locations

Next we add some objects to the Library tab. Under Variable Sets I am going to define a set of variables called DBA Deployment Locations. These are standardized variables that we can include in Octopus projects: I’ve defined the database name and SQL instance, as well as a connection string based on those two values. Secondly, under Step Templates I browse the library to download and install the Redgate step templates for creating and deploying releases. Additionally I grab the SQL – Execute Script template. I’ve also added two lifecycles that control deployment behavior: DBA Dev, which deploys automatically to both the DBA Dev and DBA Test environments, and DBA Production, which automatically deploys to DBA Stage and is linked to DBA Prod. Lastly you’ll note there is a Packages section: this is where Octopus will display packages that have been passed to it. Now we are ready to create our first project!
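
The effect of those variables is simple token substitution with environment-scoped values. A tiny illustrative sketch (the variable set name matches the post, but the database/server values below are hypothetical, and real Octopus scoping is much richer than a per-environment dictionary):

```python
import re

# Hypothetical environment-scoped values, mirroring a "DBA Deployment
# Locations" variable set; the names and servers here are made up.
SCOPED_VALUES = {
    "DBA Dev":  {"DB Name": "HelperDB_Dev", "SQL Instance": "sqldev01"},
    "DBA Prod": {"DB Name": "HelperDB",     "SQL Instance": "sqlprod01"},
}

# A connection-string variable built from the other two values.
TEMPLATE = "server=#{SQL Instance};database=#{DB Name};integrated security=true"

def expand(template, environment):
    """Replace #{Name} tokens with the value scoped to the environment."""
    values = SCOPED_VALUES[environment]
    return re.sub(r"#\{([^}]+)\}", lambda m: values[m.group(1)], template)

print(expand(TEMPLATE, "DBA Dev"))
# server=sqldev01;database=HelperDB_Dev;integrated security=true
```

This is why overloading the DB Name and SQL Instance variables per environment is enough to retarget every step that uses the connection string.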

Projects can be put into groups and I’ve continued the grouping of DBA vs BI projects here as well. Once all that is in place we can create a project to handle specific deployments. Under the Projects tab there is an Add Project button. The project needs to have the same name as the Team City project for ease of interoperability. I give it a description and assign it to a group and a lifecycle. Once on the project page I expand Variables from the navigation pane on the left and click Library Sets. Then in the upper left I click “Include Library Sets,” choose the “DBA Deployment Locations” set I just created and click Save. I need to create some project variables as well so I can assign specific values to each variable based on the environment the tentacle is connecting to. The really nice thing is that I can overload existing variables I may have added from library sets, so I recreate my DB Name and SQL Instance variables. I have different database names and SQL Servers for different environments and I input them all here. Variable scoping is very useful and available just about everywhere in the UI, and a nice feature is that you can preview variable values at specific process steps to ensure they are being set properly throughout the deployment process.

I then create two channels to handle the releases and assign a lifecycle to each: Develop gets the default DBA Dev lifecycle and Master is associated with the DBA Production lifecycle. The last bit is to create an actual deployment process. After clicking Process on the left navigation pane under the project we can start adding steps. I have six (although they don’t all run in every environment).

    1. The first step will create the database if it does not exist on the SQL instance already. I select “SQL – Execute Script” from the templates available and configure it. I only have one choice of target roles (my single tentacle that is configured) and I select that and then provide the necessary components.
      • Connection string generated based on the SQL Instance variable
        server=#{SQL Instance};database=master;integrated security=true
      • The SQL script itself which can again be customized with the Octopus variables:
        IF NOT EXISTS (SELECT *
                       FROM   sys.databases
                       WHERE  name = '#{DB Name}')
            CREATE DATABASE [#{DB Name}];
      • I check the box marking this step as required for every deployment. If the database isn’t there the rest of the deployment will surely fail.
    2. If the database does exist (particularly in staging or prod) I want to back it up before I potentially change it. I add another Execute SQL step that backs up the database. When configuring the step I expand the Conditions section, select “Run for only specific environments” and include only DBA Stage and DBA Prod. I also require this step before any further steps can be run. The SQL script runs our local backup procedure for the database contained in the variable #{DB Name}.
    3. The next step passes the deployment package to the tentacle. Under Package Details we leave the Package Feed as Octopus Server (built in) and give it the name of the project for the Package ID. (I also un-check everything under the Features section as I know we are not using them.) I also ensure the step is required.
    4. This step is the actual deployment from the nuget package. The step template is Redgate – Deploy From Package. I need to give it a few things for it to work:
      • It needs to know which step grabs the package it needs so we point it at step 3.
      • It needs the SQL Instance variable for the target SQL instance as well as the target database name. Optionally we could configure a user name and password but our build agent service account will have the appropriate permissions on the SQL instance.
      • I do give it SQL Compare options (IgnoretSQLt, DecryptEncryptedObjects1)
      • Again this step cannot be skipped.
    5. If the deployment fails outside of production we’d like to get an email. I configure an email step that is limited to all environments except DBA Prod: super easy since the environment conditions have a “skip environments” option. I also limit the step to run only if a previous step has failed.
    6. If the deployment fails in production we want someone to be paged: I have a second email step that sends an email to a PagerDuty address set up to alert the DBA on call and is restricted to the DBA Prod environment only. Otherwise it is the same as step 5.
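
Summarizing the conditions on those six steps as a table can make the process easier to reason about. The step names and simplified run-conditions in this sketch are mine, not real Octopus configuration (in particular, Octopus evaluates “run on failure” per step rather than via a flag like this):

```python
# Hypothetical sketch of the six-step process above.
STEPS = [
    ("create database if missing",  lambda env, failed: True),
    ("backup database",             lambda env, failed: env in ("DBA Stage", "DBA Prod")),
    ("acquire package",             lambda env, failed: True),
    ("redgate deploy from package", lambda env, failed: True),
    ("email on failure",            lambda env, failed: failed and env != "DBA Prod"),
    ("page on-call DBA",            lambda env, failed: failed and env == "DBA Prod"),
]

def plan(environment, failed=False):
    """List which steps would run for a given environment / failure state."""
    return [name for name, runs in STEPS if runs(environment, failed)]

print(plan("DBA Dev"))
print(plan("DBA Prod", failed=True))
```

Running `plan("DBA Dev")` shows the backup and notification steps dropping out, while a failed production deployment pulls in the paging step instead of the email step.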


Now that the project is configured I can go back to Team City and create a build project and start triggering builds. But that is a post for another day.

1. If you have encrypted objects in your database you’ll need this second option, and you’ll need to grant the deploying user SA on the SQL instance in question; otherwise your deployment will fail because SQL Compare won’t be able to compare the encrypted objects in order to validate the deployment. This goes for Team City as well.

DB Deployments Part 1: Configure Team City

The first piece of the database DevOps puzzle is automating the build. Team City is a build server that developers use to compile and test their code. It consists of a central server and a number of agents. Agents are secondary machines that pull code from source control, run tests, compile binaries and assemble deployment packages. Depending on your environment you may only need one machine. Our deployments are currently infrequent enough (mostly because they are not needed often, but also because they are manual, so we avoid them unless absolutely necessary) that we only need one agent, and it runs on the same machine as the central server. If we get busier and notice that builds are stacking up in a queue we can add more machines to install agents on that will take on that work as needed.

You will need a service account for the build server and agents. You can create specific users or possibly even a single gMSA to apply to the appropriate services1. Installation is relatively simple: download the installation files from JetBrains and run through the install. Do that and then come back.

A portion of the Team City Administration Navigation
Team City Admin Navigation

Ready? So the first thing that needs setting up is Authentication: we enabled Microsoft Windows domain authentication as well as the token-based authentication module (we like PowerShell on our team). We also kept the built-in authentication (we have admin user credentials stored in a secret server in case we need them). Once that’s sorted you can create some groups to manage access/permissions and then start adding the users that will access the build server. In our case it was the DBA team, the BI team and our manager: all folks that will want or need to either create new projects or just follow up on the results of a build. You’ll also want to configure the mail notifier so that users can be notified when their builds fail (eventually watching a build’s progress will become less exciting—or at least that’s what they tell me). Lastly you’ll want to visit the Plugins page and click the “Browse plugins repository” button. This will allow you to browse for and install both the Octopus Deploy Integration and the Redgate SQL Change Automation plugins. We’ll be making use of both of these in our builds.

You’ll want to give some thought to how you are going to organize the space: I chose to create team folders in the root directory for the DBA and BI teams to hold our respective projects, since there was a non-zero chance that we’d have different enough configurations that it made sense to separate them out. Thankfully, since 90% of our configurations will be the same, it is very easy to set up a template that can be rolled out and configured for each project. I only needed a single template in the parent directory DBA Projects to create the three distinct builds we wanted for our process:

  • builds based on changes to develop,
  • builds based on changes to master,
  • and builds based on pull requests (from develop into master).

The template consists of five steps and trigger for running it:

  1. Generate Build Number: Our build processes all take place in SSMS, and build numbers and other artifacts that might be generated by Visual Studio or other IDEs are a mystery to me. We decided that our build numbers would be meaningful in as much as they would tell you when the code was built (for example, 2020.01.21.1404 was a package built at 2:04 PM on January 21, 2020). The first step runs a simple PowerShell script that sets the build number that will be used by the rest of the Team City process.
  2. Build Database: This step uses Redgate’s SQL Change Automation to ensure the database is deployable as written. A temporary database is created3 from scratch by the Team City agent. This agent’s service account must have dbcreator rights (at a minimum), and if your database contains encrypted objects the service account will need SA so that SQL Compare can correctly compare the encrypted objects to your source code. If you are using DLM Dashboard you can include that information here and Team City will ensure it is updated. We’ve configured this step to point to our development instance and ensured that our unit testing objects are not considered part of the deployment by adding the SQL Compare option “IgnoretSQLt.” If any of our code doesn’t compile this step will fail and the person committing the code will be notified (based on our configured notification rules).
  3. Run tSQLt: Honestly the best part of automating all of this was the fact that we could roll out tSQLt unit tests (via the SQL Test plugin for SSMS) to provide extra validation of our code. This step builds the database again (this time from the nuget package produced by the previous step) and then runs all the unit tests we’ve included in our source code. Again, we’ve configured it to use a real SQL Server rather than localdb.
  4. Push Package to Octopus Deploy: If the previous steps were successful Team City will toss the package over to Octopus Deploy. Octopus Deploy does need to be installed by this point as you will need to generate an API key for Team City to communicate with Octopus Deploy. We’ve configured this step to pass the package over and overwrite any existing packages on the Octopus side.
  5. Deploy Release: This automates the release creation on the Octopus Deploy server. For this to work for us the project name in Octopus must be the same as the one in Team City, as the step passes its own project name over to identify the project receiving the release, along with our custom build number. By default the template specifies “DBA Dev” as the Octopus environment for the release. This safely enables some automation on the Octopus side (our template won’t ever accidentally deploy code to production).
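
The build-number step above is just a timestamp. The real step is a short PowerShell script, but a Python sketch of the same idea looks like this (the ##teamcity buildNumber service message is Team City’s documented mechanism for a build step to set the number):

```python
from datetime import datetime

def build_number(now=None):
    """Timestamp build number, e.g. 2020.01.21.1404 for 2:04 PM on Jan 21, 2020."""
    now = now or datetime.now()
    return now.strftime("%Y.%m.%d.%H%M")

stamp = build_number(datetime(2020, 1, 21, 14, 4))
# Team City reads the new number from a service message on stdout.
print(f"##teamcity[buildNumber '{stamp}']")
```

Because the number sorts chronologically, the newest package is always the highest version, which keeps the Octopus release ordering sensible too.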

Lastly the template trigger is configured to monitor the project’s VCS (version control system) root for changes that would indicate we should test a build. Ours is set to trigger on each check-in and include several check-ins in the build if they are from the same person. Additionally the default for the trigger in our template only looks at the develop branch and specifically ignores master.

Whew! That feels like a lot for just one article. The next post will cover the Octopus Deploy install and environment so they can work in tandem.

1. I believe a gMSA would be best here but your mileage may vary. Likely you’ll have to deploy with a traditional service account and then come back and update the service to use the MSA/gMSA. For our use case permissions are pretty binary (either the service(s) will have permissions or they won’t)2.
2. Also please note that much of what follows is that path we took due to some trial and error and may not be the single most efficient implementation. We’re all human and we’re doing our best: your mileage may vary!
3. Best practice is to deploy to a full SQL Server instance that matches the edition of SQL Server that will be deployed to in production. While you can use localdb, your builds could fail if your database code includes objects not supported by localdb or, worse, if the version of localdb doesn’t match your actual environment.

Database Deployments for Non-Developers

Ever since my first disaster of a manual deployment as a DBA I’ve been on the DevOps train. Maybe I didn’t know exactly what it was called, but surely, I said to myself, this is a solved problem? How do other organizations find the changes that need to be deployed from their dev/test environment into production? How can I ensure that we have a process that has the best chance of succeeding when we go to deploy? And most importantly: can I get out of logging in after hours to manually run scripts?

This, dear reader, is what brought me to Redgate originally: the SQL Compare tool was exactly what I was looking for in early 2014. Working backwards, I started requiring that we wrap tooling around our SQL deployments. It started small (and slow): prior to deployments developers needed to meet with me to go over the change report to identify production-ready changes they wanted deployed. But it ended big: last year the DBA team was out of the deployment game altogether since the development team had implemented SQL Change Automation steps in Team City and Octopus Deploy. But not so fast: what about the database development not done by the development team? I am sad to report that the DBA team owns its own data repository, helper databases on every SQL instance and the occasional one-off database that ends up under our care. Additionally, our BI team fully owns the development of our data warehouse and staging databases. Few of these were in source control, and in the case of our helper DB there was nothing in place to ensure it was up to date across our environment. For various reasons we decided we needed to own our deployment tools, and we recently found ourselves with our own Team City and Octopus Deploy server. The next couple of blog posts will document the places where I had to do some quick learning/troubleshooting to get it all working right, since I have not been in the development world since 2003, having been solely a production DBA since my return to tech in 2013. Hopefully, if you are doing database development—but aren’t what folks would consider a traditional developer—this guide will prove helpful for standing up and taking advantage of what all this automation is able to offer you!

  1. Configuring Team City
  2. Configuring Octopus Deploy
  3. Building a deployment pipeline