Adding SQL Docs to Jekyll

Back in August I talked about how our organization was using Jekyll to publish our internal docs. This fall I took it a step further and worked out how to publish content generated by Redgate’s SQL Doc. SQL Doc is handy tool that will allows you to easily add meta-data to your database. It’s great: you essentially comment every table, column, view, schema etc. and then commit those changes to the code repository itself. Of course if you want to share that documentation with someone who doesn’t have access to the source code (or maybe doesn’t want to flip through SQL files looking for the updates to metadata to read it even if they did) it needs to be extracted made usable somewhere. SQL Doc has several report generation options and since we were using markdown for the Jekyll site that seemed a good choice. The problem in this case is that the export wasn’t really built with the idea that I might want to insert it into another site. It simply takes all the objects from the instance you have it pointed at and creates an entire site/directory structure around that. So we need to (1) strip out all the directory references or other code we won’t be using and (2) insert any of the code needed by the Jekyll site to make it useful. The following PS script mostly (I am sure there are still improvements to be made) does that job: just point it at a fresh export of the documentation and it will delete all the existing files and regenerate the docs for inclusion. (One note that it is pretty site specific in that we assumes that the site is built on the Documentation theme.)

Author: Josh Smith
Created: 2020-09-03
This script will parse through SQL Doc Generated markdown files to create Jekyll friendly
db documentation.
Should be run from the root directory of the documentation repository.
function createHeader { param ($subject, $link, $page)
$title = "$subject - $page - Documentation"
$summary = "Documentation for " + $page.ToLower() + " in $subject."
$header = "---
title: ""$title""
tags: ["
if ($page.IndexOf("index.html") -gt -1){
$header = $header + "data_dictionary, "
$header = $header + "$subject]
sidebar: db_sidebar
permalink: $link
last_updated: Oct 21, 2020
summary: $summary
return $header
function removeImageFolder { param ($imagelineIn)
$imgPos = $imagelineIn.IndexOf('Images') + 6
$imageLine = $imagelineIn.Substring(0, $imagelineIn.IndexOf('(..') + 1) + '/images/dbmodel'
if ($imageLineIn.IndexOf('Images', $imgPos) -gt -1){
$imageLine = $imageLine + (removeImageFolder $imageLineIn.Substring($imgPos))
else {
$imageLine = $imageLine + $imageLineIn.Substring($imgPos, $imageLineIn.IndexOf(".", $imgPos) - $imgPos).ToLower()
if ($imageLineIn.IndexOf('Images', $imgPos + 1) -lt 1){
$imageLine = $imageLine + $imageLineIn.Substring($imageLineIn.IndexOf(".", $imgPos))
if ($imageLine.IndexOf("Index.png") -gt 0){ $imageLine = $imageLine -replace "Index.png", "index.png" }
return $imageLine
function removeAnchor {param ($anchorlineIn)
$anchorLine = $anchorlineIn.Substring(0, $anchorlineIn.IndexOf("<a name="))
$anchorLine = $anchorLine + $anchorlineIn.Substring($anchorlineIn.IndexOf("a>") + 2, $anchorlineIn.Length - ($anchorlineIn.IndexOf("a>") + 2))
return $anchorLine
function insertLink {param ($linklineIn, $linkfolders)
$newLink = ''
foreach($f in $linkfolders){ $newLink = $newLink + $f + '/' }
$newLink = $newLink + $linkLineIn.Substring($linklineIn.LastIndexOf('(') + 1, $linklineIn.LastIndexOf('.md') - ($linklineIn.LastIndexOf('(') + 1))
$newLink = $newLink -replace '/', '_'
$linkLine = $linklineIn.Substring(0, $linklineIn.LastIndexOf('(') + 1) + $newLink + '.html)'
return $linkLine.toLower()
function processMDFile {param ($content, $database, $processFolderList, $fileName, $filler, $workingFolder)
$mdOutput = ''
$newLine = ''
$firstLine = $true
$prevBlank = $true
foreach($l in $content){
if ($l.IndexOf('####') -lt 0 -or $l.IndexOf('[Database]') -lt 0 -or $l.IndexOf('---') -ne 0){
if ($l.Trim().Length -eq 0 -and $prevBlank -eq $true){
# do nothing
elseif ($l.Trim().Length -eq 0){
$newLine = $l.Trim()
else {
$prevBlank = $false
$FirstLine = $false
$newLine = $l -replace $filler, ''
$newLine = processText $newLine
if ($newLine.IndexOf(".md") -ge 0) {
if ($newLine.Substring(0,9) -eq '[Project]'){
$newLine = projectNavigation $newLine $database $processFolderList
else {
if ($newLine.IndexOf("|") -eq 0){
$cells = $newLine.Split("|")
$newLine = ""
foreach($c in $cells){
$tempText = $c
if ($tempText.IndexOf(".md") -gt 0) {$tempText = insertLink $tempText $processFolderList}
if ($c.length -gt 0) {$newLine = $newLine + "| $temptext "}
$newLine = "$newLine |"
else {
$newLine = insertLink $newLine $processFolderList
if ($Firstline -eq $false -and $prevBlank -eq $false){
if ($newLine.Length -eq 0){$prevBlank = $true}
$mdOutput = $mdOutput + $newLine + "`r`n"
return $mdOutput
function projectNavigation{param ($projectLine, $projectDB, $sourceFolders)
$newPL = "> "
$p1 = $projectLine.IndexOf($projectDB)
do {
if ($p1 -gt -1){
$p2 = $projectLine.IndexOf("]", $p1)
# no link exists
if ($p1 -lt 0 ){
$p2 = $projectLine.LastIndexOf(')')
$newPL = $newPL + $projectLine.Substring($p2 + 3)
$p2 = -1
if ($p2 -gt -1){ $text = ($projectLine.Substring($p1 - 1, $p2 - ($p1 -2))).Trim() }
else { $text = ($projectLine.Substring($p1)).Trim() }
if ($text.Substring($text.Length - 1, 1) -eq ']'){
# potential link: parse and replace
$text = $text.substring(1) -replace ']', ''
$folderPosition = $sourceFolders.IndexOf(($text -replace " ", "_"))
$i = 0
$fPath = ''
do {
if ($i -gt 0){ $fPath = $fPath + '\' + $sourceFolders[$i] }
} while ($i -le $folderPosition)
$fileName = $projectLine.Substring($p2 + 2, $projectLine.IndexOf(")", $p2) - ($p2 + 2)) -replace '../', ''
$tPath = $startFolder.ToLower() + $fPath + '\' + $fileName.tolower()
if ((Test-Path $tPath) -eq $true){
$URL = $projectDB.toLower() + ($fPath.toLower() -replace '\\', '_') + '_' + ($fileName.ToLower() -replace ".md", ".html")
$newPL = $newPL + "[$text]($URL) > "
$tPath = $startFolder + $fPath + "\" + $sourceFolders[$i-1] + ".md"
if ((Test-Path $tPath) -eq $true){
$URL = $projectDB.toLower() + "_" + ($fPath.toLower() -replace '\\', '_') + '_' + ($sourceFolders[$i -1].toLower() -replace ".md", "") + ".html"
else { $newPL = $newPL + $text + " > " }
else{ $newPL = "$newPL $text" }
if ($p2 -gt -1){ $p1 = $projectLine.IndexOf("[", $p2) }
} while ($p2 -gt -1)
return $newPL
function processText {param($textInput)
if ($textInput.IndexOf("Images") -ge 0){ $textInput = removeImageFolder $textInput }
if ($textInput.IndexOf("<a name=""#") -ge 0){ $textInput = removeAnchor $textInput }
$textInput = $textInput -replace "#foreignkeys", "#foreign-keys" # fix foreign-key links
$textInput = $textInput -replace "_Dev", "" # remove _dev appended to db names
return $textInput
$databaseName = '' # the known name of the database
$sourceFolder = 'C:\' # path to the root of the SQL Doc export folder
$imageSource = $sourceFolder + 'Images'
$imageDestination = 'images\dbmodel'
$destination = 'pages\dbs\models\' + $databaseName
$fillString = 'JS_' # our development dbs are have our initals appended the front of the db name: we want to strip this out
# move image files
Get-ChildItem -Path $imageSource | Copy-Item -Destination $imageDestination -Recurse -Force
$files = Get-ChildItem $imageDestination
# jekyll is running in linux so case matters: lowercase all image file names
foreach($f in $files){
if ($f.Name -cne $f.Name.toLower()){
Rename-Item $f.FullName $f.Name.toLower()
#find root directory for just the database:
$startFolder = (Get-ChildItem -Path $sourceFolder -filter 'user_databases' -Recurse).FullName
$startFolder = (Get-ChildItem $startFolder | Where-Object {$_.Mode -eq 'd----'}).FullName
# get the files we will reformat
$files = Get-ChildItem -Path $startFolder -Recurse -Filter *.md
# clear out old files in case we are deleting pages from doc:
Get-ChildItem -Path $destination -Filter *.md | Remove-Item
# pull apart and rebuild files!
foreach ($f in $files){
$workingFile = $f.FullName
$file= $f.Name
$type = ($f.Directory).Name
$folderList = $workingFile.Substring($workingFile.ToUpper().LastIndexOf($databaseName.toUpper()), $workingFile.Length - $workingFile.toUpper().LastIndexOf($databaseName.toUpper()))
$folderList = $folderList -replace '.md', ''
$fileName = $folderList.Substring($folderlist.LastIndexOf('\') + 1).ToLower()
$folderList = $folderList.Substring(0, $folderList.LastIndexOf('\'))
$folderList = $folderList -replace '\\', ','
$folderList = $folderList -replace '_Dev', '' #account for possible _dev appended to db along with initials
$folderCollection = $folderList.Split(",")
$content = $null
$result = $null
# pull the content out of the working file:
$content = Get-Content -Path $workingFile
# process the content
$result = processMDFile $content $databaseName $folderCollection $fileName $fillString $startFolder
# ensure correct formatting of file names for linux
$link = ($folderList.ToLower() -replace ",", "_") + "_$fileName.html"
# set file 'topic'
if ($fileName -eq "index"){ $topic = $folderCollection[$folderCollection.Length - 1] }
else { $topic = $fileName }
# format the header for the new doc
$curHeader = createHeader $databaseName $link $topic
$result = $curHeader + $result
if ((Test-Path $destination) -eq $false){ New-Item -ItemType Directory -Force -Path $destination }
$outPath = $destination + "\" + ($link -replace ".html", ".md")
Set-Content -Path $outPath -Value $result -Force
#todo: update db_sidebar with link to documentation home
$sidebarOut = Get-Content _data/sidebars/db_sidebar.yml -Raw
$dbmodelList = Get-ChildItem pages/dbs/models/
$sidebarOut = $sidebarOut.Substring(0, $sidebarOut.LastIndexOf('folderitems:') + 12) + "`r`n`r`n"
foreach($i in $dbmodelList){
$dbName = $i.Name
$fileName = $i.Name.ToLower()
$newEntry = " - title: $dbName
url: /$fileName" + "_index.html
output: web, pdf`r`n"
$sidebarOut = $sidebarOut + $newEntry
$sidebarOut = $sidebarOut.Substring(0, $sidebarOut.Length - 1)
Set-Content -Path _data/sidebars/db_sidebar.yml -Value $sidebarOut -Force
view raw ExtractSQLDocs.ps1 hosted with ❤ by GitHub

Centralize Your Documentation to Learn Git

Source control is (has?) taken over and the recommendation is that pretty much anyone in IT should probably be using it. Beyond application development source control is use for anytime you have text documents that you need shared editing access to source control is something you might want to consider. Git is a widely used source control system and it is… difficult which is why it’s not too hard to find content like this:

And for some it will always be somewhat frustrating for some (and that’s ok). But even at it’s most frustrating for me it’s worth it because of how easy it is to collaborate with my team on code and files. This winter I found myself championing the use of git to help centralize our documentation. We’d been using a mix of SharePoint directories, pages and file shares to keep track of various SLAs, environment documentation and kbs about the applications we support and maintain. It was a mess. At the same time we’d also been sharing more PowerShell scripts around to get work done and the like. It was time to start using source control to keep everyone up to date. Adoption was not great though: folks would grab the files and then never pull to get updates, they’d paste the changes into email or chat to share with specific people when they found they needed those changes etc.

So we took a slightly different tack: we’d move all our team documentation into a git repository and use it to publish an internal website to make it accessible to those in the organization who just needed access to the information. Jekyll is a ruby library (? I’m still not a web developer thankfully I just can’t hack it) that will take a collection of (mostly) markdown files and generate a static website. Within 24 hours we had an initial repository that anyone on the team could pull and add their information to. For the information to be useful (i.e. published) it had to be pushed to the repository, reviewed and merged into the main branch. Adoption started off relatively slow but as new documentation was required or older docs needed updating it got used.

There were the inevitable learning pains, a few of us had to muddle our way through dumb merge conflicts. As git champion I borked the repo at least once but: it was getting used but the entire team and each bump in the road made the rest of the journey that much smoother. So now not only have we seen better engagement with our team code repositories for sharing PowerShell and the like but 95% of our documentation has been migrated from the various nooks and crannies where it was hidden to a central location where it is easily found by anyone in the org. If you looking at a scattered or disorganized documentation system and/or a team that needs to be introduced to source control you could do worse than to start by tackling both at the same time.

Just remember: commit and push often!