Centralize Your Documentation to Learn Git

Source control has taken over, and the recommendation is that pretty much anyone in IT should probably be using it. Beyond application development, source control is worth considering any time you have text documents that you need shared editing access to. Git is a widely used source control system and it is… difficult, which is why it's not too hard to find content like this:

And for some it will always be somewhat frustrating (and that's ok). But even at its most frustrating, for me it's worth it because of how easy it makes collaborating with my team on code and files. This winter I found myself championing the use of git to help centralize our documentation. We'd been using a mix of SharePoint directories, pages and file shares to keep track of various SLAs, environment documentation and KBs about the applications we support and maintain. It was a mess. At the same time we'd also been sharing more PowerShell scripts around to get work done. It was time to start using source control to keep everyone up to date. Adoption was not great though: folks would grab the files and then never pull to get updates, or they'd paste changes into email or chat to share with specific people when they found they needed those changes.

So we took a slightly different tack: we'd move all our team documentation into a git repository and use it to publish an internal website, making the information accessible to anyone in the organization who needed it. Jekyll is a Ruby library (? I'm still not a web developer, thankfully; I just can't hack it) that takes a collection of (mostly) markdown files and generates a static website. Within 24 hours we had an initial repository that anyone on the team could pull and add their information to. For the information to be useful (i.e. published) it had to be pushed to the repository, reviewed and merged into the main branch. Adoption started off relatively slow, but as new documentation was required or older docs needed updating it got used.
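Under the hood each page is nothing fancy; a doc in the repository is just a markdown file with a small YAML front-matter block that Jekyll reads when it generates the site. A minimal sketch (the title and contents here are made up):

```markdown
---
title: App X SLA
layout: default
---

# App X SLA

* Owner: DBA team
* Response target: 4 business hours
```

Commit a file like this, get it merged to the main branch, and it shows up as a page on the internal site.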

There were the inevitable learning pains: a few of us had to muddle our way through dumb merge conflicts, and as git champion I borked the repo at least once. But it was getting used by the entire team, and each bump in the road made the rest of the journey that much smoother. So now not only have we seen better engagement with our team code repositories for sharing PowerShell and the like, but 95% of our documentation has been migrated from the various nooks and crannies where it was hidden to a central location where it is easily found by anyone in the org. If you're looking at a scattered or disorganized documentation system and/or a team that needs to be introduced to source control, you could do worse than to start by tackling both at the same time.

Just remember: commit and push often!
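The day-to-day loop that made this work is small enough to fit in a few commands. A minimal sketch of the branch, review and merge flow (file names and commit messages are made up):

```shell
# Work through the branch -> review -> merge flow in a scratch repo.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email "dba@example.com"
git config user.name "DBA Team"

# First version of a doc lands on the main branch.
echo "# App X SLA" > sla.md
git add sla.md
git commit -q -m "Add initial SLA doc"
git branch -M main

# Edits happen on a branch...
git checkout -q -b update-sla
echo "Response target: 4 business hours" >> sla.md
git commit -q -am "Document response-time SLA"

# ...and after review get merged back to main for publishing.
git checkout -q main
git merge -q --no-ff -m "Merge reviewed SLA update" update-sla
git log --oneline
```

The merge at the end is what a pull request does for you once a teammate has reviewed the change.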

DB Deployments: Using A Custom Filter

So I've been rolling out the deployment pipelines as I was able, and I ran into an issue with our third one: we have a helper database that lives on all of our SQL instances, and while we'd like to eventually keep any customization in source control, we only want the dbo schema deployed automatically via the pipeline. The Redgate "Deploy From Package" step has a handy option to include an alternative filter to use with the SQL Compare engine, so it seemed really straightforward: I fired up SQL Compare, generated a unique filter file that I named DeployFilter.scpf and threw that file name into the job step!


Unfortunately I got stuck here because this does not work. The nuget packages don't end up in the working directory for the deployment, and I couldn't find any documentation about where they might end up. I ran a few deployments so I could watch the working folder as the build ran to see what showed up, with no luck. I looked for articles that might have mentioned configuring the path (to no avail). Eventually I hit up the Redgate channel on the SQL Slack. Alex Yates (b|t) had the answer in his head all along: the previous step (self-deploy package) unpacks the nuget package on the tentacle, and I have to get it to tell me where the package is (and then append the path to the filter within the package itself). This is all a very long way of saying that if you want to use a custom filter (a) it should be in source control with the rest of your code and (b) the variable is #{Octopus.Action[<name of package deployment step>].Output.Package.InstallationDirectoryPath}\<path to filter file in nuget package>. And I am putting it here for the next person that finds themselves googling how to make it work (which may still be me: my buffer is notoriously overworked).
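Concretely, with the package step named "self-deploy package" (as ours was) and the DeployFilter.scpf file checked in at the root of the nuget package, the value handed to the Deploy From Package step's filter field would look like this (your step name and file path will vary):

```
#{Octopus.Action[self-deploy package].Output.Package.InstallationDirectoryPath}\DeployFilter.scpf
```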

DB Deployments Part 2: Configuring Octopus Deploy

Where Team City will test and validate your code, Octopus Deploy will manage where that code is deployed. Again there are SQL Change Automation tools we'll be using to ensure our database code is deployed consistently across our environments. You'll need to ensure you have a service account (or MSA/gMSA eventually) for the application as well as a home for the SQL database it needs. Download the install media and deploy it to your server. We have Octopus deployed side by side with Team City (just on different ports).

Once installed, under Configuration you can add users and groups to roles as needed. We've created groups for the DBA and BI teams so they can log in and see the status of their deployments, and added folks to the Administrators group as appropriate. (A quick note: while there is a small-teams license for Octopus, all users in the app are essentially admins unless you purchase licensing: you'll very likely want to purchase a license to take advantage of role-based permissions.)

Screen shot of the Octopus Deploy Infrastructure page. It shows 4 environments and 1 deployment target.

A quick peek at our Octopus Deploy Infrastructure

The next piece that needs configuring is Infrastructure. We need to define agents that will do deployments (called deployment targets or tentacles) and define deployment environments. Again, we don't expect our deploy agents to be very busy to start with, so we have a single agent that also runs on the app server (just like Team City). Clicking Add Deployment Target, Octopus will walk you through installing the agent on a new server: it was pretty painless. At this time we'll also create a named role for the tentacle that indicates what it might be used for (this is useful if your deployment process spans different servers and you need more than one agent to deploy your code). We only ever deploy to SQL Servers, so again a single agent and role is sufficient for now. Once we have a target we can define environments and assign a target that will handle work in each environment. If your environments are highly segmented you may need a target (not on the app server) that sits on the right network with the right connection rights to whichever environment, and that the app server can communicate with to do deployments. I have 4 environments: DBA Dev, DBA Test, DBA Stage and DBA Prod, matching the deployments I will want Octopus to do.

Screenshot of the Octopus Deploy Variable Sets menu

Octopus Deploy Variable Set: DBA Deployment Locations

Next we add some objects to the Library tab. Under Variable Sets I define a set of variables called DBA Deployment Locations. These are standardized variables that we can include in Octopus projects. I've defined the database name and SQL instance, as well as a connection string based on those two values. Secondly, under Step Templates I browse the library to find and install the Redgate step templates for creating and deploying releases. Additionally I grab the SQL – Execute Script template. I've also added two lifecycles that control deployment behavior: DBA Dev, which deploys automatically to both the DBA Dev and DBA Test environments, and DBA Production, which automatically deploys to DBA Stage and is linked to DBA Prod. Lastly you'll note there is a Packages section: this is where Octopus will display packages that have been passed to it. Now we are ready to create our first project!
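As a sketch, the DBA Deployment Locations set ends up holding something like the following; the instance and database names here are placeholders, not our real values:

```
DB Name        = DBAUtility
SQL Instance   = SQLDEV01
Connect String = server=#{SQL Instance};database=#{DB Name};integrated security=true
```

Projects that include the set get all three, and can override DB Name and SQL Instance per environment while the connection string keeps composing itself from whatever values are in scope.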

Projects can be put into groups, and I've continued the grouping of DBA vs BI projects here as well. Once all that is in place we can create a project to handle specific deployments. Under the Projects tab there is an Add Project button. The project needs to have the same name as the Team City project for ease of interoperability. I give it a description and assign it to a group and a lifecycle. Once on the project page I expand Variables from the navigation pane on the left and click Library Sets. Then in the upper left I click "Include Library Sets," choose the "DBA Deployment Locations" set I just created and click Save. I need to create some project variables as well, so I can assign specific values to a variable based on the environment the tentacle is connecting to. The really nice thing is that I can overload existing variables I may have added from library sets, so I recreate my DB Name and SQL Instance variables. I have different database names and SQL Servers for different environments and I input them all here. The scoping for variables is very useful and available just about everywhere in the UI. Another nice feature is that you can preview variable values at specific process steps to ensure they are being set properly throughout the deployment process.

I then create two channels to handle the releases and assign a lifecycle to each: Develop gets the default DBA Dev lifecycle and Master is associated with the DBA Production lifecycle. The last bit is to create the actual deployment process. After clicking Process on the left navigation pane under the project we can start adding steps. I have six (although they don't all run in every environment).

    1. The first step will create the database if it does not exist on the SQL instance already. I select “SQL – Execute Script” from the available templates and configure it. I only have one choice of target role (my single configured tentacle), so I select that and then provide the necessary components.
      • Connection string generated based on the SQL Instance variable
        server=#{SQL Instance};database=master;integrated security=true
      • The SQL script itself, which can again be customized with the Octopus variables:
        IF NOT EXISTS (SELECT *
                       FROM sys.databases
                       WHERE name = '#{DB Name}')
            CREATE DATABASE [#{DB Name}];
      • I check the box marking this step as required for every deployment. If the database isn’t there, the rest of the deployment will surely fail.
    2. If the database does exist (particularly in staging or prod) I want to back it up before I potentially change it. I add another Execute SQL step that backs up the database. When configuring the step I expand the Conditions section, select “Run for only specific environments” and include only DBA Stage and DBA Prod. I also require this step before any further steps can be run. The SQL script runs our local backup procedure for the database contained in the variable #{DB Name}.
    3. The next step passes the deployment package to the tentacle. Under Package Details we leave the Package Feed as Octopus Server (built in) and give it the name of the project for the Package ID. (I also un-check everything under the Features section as I know we are not using them.) I also ensure the step is required.
    4. This step is the actual deployment from the nuget package. The step template is Redgate – Deploy From Package. I need to give it a few things to work:
      • It needs to know which step grabs the package it needs so we point it at step 3.
      • It needs the SQL Instance variable for the target SQL instance as well as the target database name. Optionally we could configure a user name and password but our build agent service account will have the appropriate permissions on the SQL instance.
      • I do give it SQL Compare options (IgnoretSQLt, DecryptEncryptedObjects1)
      • Again this step cannot be skipped.
    5. If the deployment fails outside of production we’d like to get an email. I configure an email step that is limited to all environments except DBA Prod: super easy, since the Environments condition has a “skip environments” option. I also limit the step to run only if a previous step has failed.
    6. If the deployment fails in production we want someone to be paged: I have a second email step that sends an email to a PagerDuty address set up to alert the DBA on call and is restricted to the DBA Prod environment. Otherwise it is the same as step 5.
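For what it's worth, the step 2 script is little more than a parameterized call to the local backup routine; dbo.DatabaseBackup below is a stand-in for whatever backup procedure your environment actually uses:

```sql
-- Back up the target database before the deployment can change it.
-- dbo.DatabaseBackup is a placeholder for your local backup procedure.
EXEC dbo.DatabaseBackup @Database = N'#{DB Name}';
```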


Now that the project is configured I can go back to Team City and create a build project and start triggering builds. But that is a post for another day.

1. If you have encrypted objects in your database you’ll need this second option, and you’ll need to grant the deploying user SA on the SQL instance in question; otherwise your deployment will fail, because SQL Compare won’t be able to validate the deployment if it can’t compare the encrypted objects. This goes for Team City as well.

Backing Up A Cosmos DB with the Cosmos DB Migrator Tool

Cosmos DB really is an amazing datastore, and even better (you might be thinking): Microsoft handles the backups for you. Which is true. They take backups every four hours and keep the last two. If you need anything recovered from the database you’d better hope that you notice within that window *and* get a ticket open with Microsoft to get it fixed. This being the case, Microsoft helpfully recommends that in addition to the default backups that come with the Cosmos DB service you export your data to a secondary location as needed to meet your organization’s SLA. Data Factory to the rescue, right? Again, almost.

Unfortunately, if you are restricting access to your Cosmos DB service based on IP address (a reasonable security measure) then Data Factory won’t work as of this writing, as Azure Data Factory doesn’t operate like a trusted Azure service and presents an IP address from somewhere in the data center where it is spun up. Thankfully they are working on this. In the meantime, however, the next best thing is to use the Cosmos DB migration tool (scripts below) to dump the contents to a location where they can be retained as long as needed. Be aware that in addition to the RU cost of returning the data, if you bring these backups back out of the data center where the Cosmos DB lives you’ll also incur egress charges on the data.

The script reads from a custom JSON file that contains the Cosmos DB service(s), as well as the databases and collections that need to be backed up. This file will have the read-only keys to your Cosmos DB services in it, so it should be encrypted on disk in order to limit the number of people who can access it.

"_comment" : ["This is a sample object that should be modified for deployment.",
"Connect strings will need to be inserted and correct service, database and collection",
"names included as well as setting the database backup flag to true as needed."],
"service" : {
"name" : "<name of your cosmos db service>",
"connectString" : "<read-only connect string here>",
"databases" : [ {"name" : "database1",
"backupFlag" : true,
"collections" : [{"name" : "collection1"},
{"name" : "collection2"}]
, {"name" : "database2",
"backupFlag" : false,
"collections" : [{"name" : "collection1"}]
{"name" : "database2",
"backupFlag" : true,
"collections" : [{"name" : "collection1"}]
"service" : {
"name" : "<second cosmos db service>",
"connectString" : "<second service read-only key>",
"databases" : [ {"name" : "database1",
"backupFlag" : false,
"collections" : [{"name" : "collection1"}]

Once the config file is in place, the following PowerShell will read the file and back up the appropriate services, databases and collections (and remove any old backups that are no longer needed).

<#
This script calls the Cosmos DB migration tool (dt.exe) with the correct parameters
based on a list of databases that need to be backed up. It depends on a JSON param
file that contains the list of Cosmos DB services that have databases requiring backup.
This script has a couple of dependencies:
(1) dt.exe (the Cosmos DB migration tool, along with the associated files/dlls
    in the compiled folder) needs to be locally available.
(2) A configured JSON file listing the Cosmos services and databases that require backups.
    Care should be taken (encrypt the file and provide access to the keys to a limited
    set of users) as the read-only keys for the Cosmos DB services are stored here.
#>
$retentionDays = 14
$backupList = 'C:\temp\CosmosBackup.json' # point these at the appropriate folders
$backupPath = 'C:\temp'
$pathToExe  = 'C:\temp\drop\dt.exe'

$backups = Get-Content -Raw -Path $backupList | ConvertFrom-Json
foreach ($s in $backups.service) {
    $sName = $s.name
    Write-Output "Processing service $sName..."
    $cosmosConnectString = $s.connectString
    foreach ($d in $s.databases) {
        $database = $d.name
        if ($d.backupFlag -eq $true) {
            Write-Output "  Backing up collections in $database..."
            foreach ($c in $d.collections) {
                $collection = $c.name
                Write-Output "    Backing up collection $collection."
                <# configure export arguments; use our own variable name
                   to avoid clobbering PowerShell's automatic $args #>
                $connectString = "$cosmosConnectString;Database=$database"
                $date = Get-Date
                $dateString = $date.ToString('yyyyMMdd') + '_' + $date.ToString('HHmm')
                $targetFile = "$collection`_$dateString.json"
                $dtArgs = "/ErrorLog:$backupPath\backups\$sName\$database\$collection`_$dateString`_log.csv /OverwriteErrorLog"
                $dtArgs += " /ErrorDetails:Critical /s:DocumentDB /s.ConnectionString:$connectString"
                $dtArgs += " /s.ConnectionMode:Gateway /s.Collection:$collection /s.Query:`"SELECT * FROM c`""
                $dtArgs += " /t:JsonFile /t.File:$backupPath\backups\$sName\$database\$targetFile /t.Prettify /t.Overwrite"
                <# now we are all configured: run the collection backup #>
                Start-Process -FilePath $pathToExe -ArgumentList $dtArgs -Wait
            }
        }
        else {
            Write-Output "  Skipping $database; backupFlag <> true."
        }
    }
}
<# remove old logs and backups #>
$purgeDate = (Get-Date).AddDays(-1 * ($retentionDays + 1))
Get-ChildItem -Path $backupPath -Recurse -Include *.json -Exclude *CosmosBackup.json |
    Where-Object { $_.CreationTime -lt $purgeDate } | Remove-Item
Get-ChildItem -Path $backupPath -Recurse -Include *.csv |
    Where-Object { $_.CreationTime -lt $purgeDate } | Remove-Item

While this is not ideal, if you need to start backing up your Cosmos DBs immediately it will do the trick until Microsoft finishes incorporating Data Factory into their trusted services.

[Edited to add 10/3/2019:] Just yesterday it looks like MS updated their timeline for adding the needed functionality to ADF.

T-SQL Tuesday #101: My Essential SQL Server Tools

My road to tools has been both short and long. Short because I started out (just a few years ago) not thinking I needed much more than a connection to the database and my own brain. Long because I also had not yet had the misfortune of discovering how much pain good tooling can eliminate. It started one crisp winter day as a bright-eyed junior DBA, when I was informed I would be supporting an internal deployment that evening: the first that had occurred since I started about 5 months prior. Our internal databases are relatively small and mostly simple, so none of this was particularly frightening.

“They’ll give you a deployment script to run on the database. Make sure you take a backup so you can roll back if you need to, but it shouldn’t take more than a few minutes of your evening. Good luck,” they told me. With an update like that what could go wrong? Pretty much most of the things.

The developer provided me with a script. I dutifully took the database backup, ran the script and waited for him to confirm the update had applied successfully.

“Something’s wrong. Hang on a sec. I forgot to include some changes in the script.” Within a few minutes I had a second script to run in addition to the first, which I quickly ran against the database. A pause.

“No, it’s still not working as expected. Um, let me check some things…” they sent me. This went on for 30 minutes, then an hour. I couldn’t make the call to roll back and cancel the upgrade, and the developer was researching what he might have missed in his change script. I asked often if he had any way to tell me the total list of objects he expected to have changed for the deployment to work (“Who needs a staging environment?” past me complained). A “simple” deployment that I had expected to spend maybe 30 minutes on (including logging in and waiting around for the start time) dragged on to close to 2 hours. Eventually it was completed and we went our separate, frustrated ways for the evening. The next morning I started researching tools for diffing databases and found Redgate’s SQL Compare. Within a week I had approval to purchase their toolbelt, and the next time there was a deployment we scheduled in time for the developers to review the differences between their test environment and production to come up with a deployment script. It took a little longer, but I was finally able to convince the development team they needed their own tooling to manage their database code. After about 18 months we had them using ReadyRoll to manage database migrations, and now, a few years later, we are finally getting to the point where we can use a deployment tool like Octopus Deploy so none of us DBAs have to get up at all.

Besides SQL Compare, the other tool that I really notice when I don’t have it (absence makes my heart so much more sad) is SQL Prompt. Having to work on a machine without it makes me feel like an idiot now that I’m used to having easy access to a library of code snippets (and automatic code formatting: tabs forever!).

It’s not essential (yet), but recently I’ve been trying to use PowerShell more when I can, and I quickly discovered dbatools, which does a lot of the heavy lifting and provides immediate access to a lot of management tasks. My favorite is database migration: we used to manually copy backups from one server to another and restore them, or detach a database, copy the mdf/ldf to the new server and re-attach, which was a pain: I can never remember the exact syntax, and I end up having to RDP to the server to move files: ugh. But with a single line of PowerShell I can take a backup (or backups) to a network location and restore them without leaving my own desktop, and I’ve barely scratched the surface of what I can do with it: highly recommended!
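As an example, the whole backup-and-restore migration dance collapses into a single dbatools call; the server, database and share names here are made up:

```powershell
# Back up AppDb on SQL01 to a network share and restore it onto SQL02,
# all without leaving your own desktop.
Copy-DbaDatabase -Source SQL01 -Destination SQL02 -Database AppDb `
    -BackupRestore -SharedPath '\\fileserver\SQLBackups'
```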

Deploying Images & Clones in SQL Clone

I’ve been taking notes hoping to share some lessons learned from our deployment of Red Gate’s SQL Clone, but they kept it so deliciously simple, and the documentation and worked examples for PowerShell are difficult to improve upon (also we aren’t doing anything too terribly fancy yet either). My only gotcha was learning that if you don’t use the Redgate extension (.SQB) for backups, SQL Clone won’t recognize that the backup was compressed and encrypted with SQL Backup (we’ve wrapped an existing backup procedure around the SQL Backup proc that preserved our old naming conventions when we switched to the new backup tool). The workaround was simple: replace the backup file extension prior to creating the image from the backup. If you think that you could benefit from cloning your databases to replicate (and mask) production data, I encourage you to check it out. In the meantime I’ll see if I can find a use case or workaround that needs some more documentation as we roll it out for our test group.
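For reference, the workaround amounts to a one-time rename before creating the image; the path below is illustrative:

```powershell
# SQL Clone keys off the .SQB extension to detect a SQL Backup file,
# so rename our conventionally-named backup before creating the image.
$backup = 'E:\Backups\AppDb_FULL_20190101.bak'
Rename-Item -Path $backup -NewName ([System.IO.Path]::GetFileNameWithoutExtension($backup) + '.SQB')
```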