I’ll be presenting my introduction to SQL queries in just a few short days at SQL Saturday Redmond! If you know somebody that’s just getting started with SQL and is wants an introduction to how to query and join tables: have I got a session for you! (It’s this one.) This will be my first time presenting away from my home turf of Spokane so I’m only a little nervous about it: I hope I see you there!
SQL is an acronym for Structured Query Language so when getting started with it I think it’s best to think about how it was put together to be able to ask questions about data. With that in mind let’s a take a look at some beginning SQL to understand how to write queries.
Queries can be very simple but they won’t answer very many questions in their most simple forms:
Will not a table and filter are clearly not required to write a query, the most common use for writing a query will require 3 parts to be useful:
- a SELECT-ion of things (columns)
- FROM a a collection of things (a table)
- WHERE a criteria is met (filter(s))
I happen to have a database of movie data. Let’s explore some common filter operators by asking some questions about movies. Filters based on mathematical operators are the easiest to start with. We can ask for films with exactly 2 Oscar nominations or more than 4 or less than 6. We make these filters exclusive or inclusive by adding the “=” operator to the comparison. We can also exclude a specific number by using “<>” (Microsoft SQL Server will also support != but this is outside the ANSI standard so “<>” would be best practice as it will be usable in any version of SQL.)
We can have multiple conditions in our WHERE clause and these are combined with AND or OR statements. We can ask for films with 1, 2 or 3 Oscar nominations (but we’ll also see we have a new operator IN that is cleaner and accomplishes the same goal). We can check for movies released on or after Jan 1, 2017 and were nominated for at least 1 Oscar. We can ask about movies released before the year 1950 that made more money than they spent.
If you’re just getting started with writing SQL queries I hope this was helpful! Next up I’ll cover a few tips for working with dates as well as specific operators for filtering strings.
 It’s important to note that SQL does not guarantee any sort of order in your results and thus the last part of a query is generally an ORDER BY statement that tells SQL what order to put the results in.
I’ve been thinking a lot about GDPR and similar legislation for the last year or so. I think that codifying how companies treat our data (and what they can and can’t do it with) is a huge step in the right direction. Late in 2018 I was happy to find myself working with Redgate on their SQL Data Catalog early access program as we started looking at ways to attach and track metadata about our data in our environment. Last month Washington state (where I currently reside and work) proposed it’s own GDPR like legislation. Since we aren’t likely to get any federal level legislation anytime soon it’s most likely the US DBAs will have to learn about a variety of laws and investigate (along with a legal team) what they actually mean and if they impact their organization.
In the mean time now we can still a jump start on collecting the data we need to meet what is sure the bare minimum requirement of these laws: getting a handle on what you actually have in a way that is accessible and reportable. Also reviewing your organizations practices for cleaning data in dev and test environments. This means generating test data or masking/altering production data to remove/replace sensitive information. If digging into market trends to see what other organizations are doing is your thing Redgate is making the 2018 Data Masking Market Guide available for your perusal here. I’m really enjoying my foray into these questions! Have you started trying to solve these problems in your org? What have you run into that you weren’t expecting when you started?
So I tried a very new thing (for me anyway) and after a fair amount of dithering about it I told the little naysayer that lives in the back of my head to shut the hell up and applied for the Friend of Redgate program at the end of December.
I’m writing about it so of course you may easily guess the outcome: I was accepted into the program! I really am a huge fan of their tools ever since I first found SQL Compare 5 years ago (to the month almost now that I think about it) after a ridiculous manual deployment. In the time since I’ve deployed and learned to use more of their tools. They have created a wonderful community to help support the SQL community at large and I am looking forward to working hard to contribute as well! Currently my biggest engagement is with their new SQL Data Catalog. As more and more legislation is created to protect data in our control it’s more and more vital to be able to know exactly what types of data we have so we can take concrete steps to ensure its protection: a data catalog is an integral tool to get that understanding. I really love this problem space and I have lots of ideas and opinions about it!
This is one of those posts so I never have to google this again (one hopes). Here is the PS code to pull back a set of details about every SSRS instance installed on a server including the SSRS instance name, & the first URL port it is running on, the service name and the name of the report server database etc.
2018 was a challenging year for all sorts of reasons. I set some modest goals for myself and it was mixed bag. My work goals got kind of demolished: I failed to post much last year (although I am happy I now have a post to return to every time I need to write a pivot) and after multiple attempts I failed to pass the MS 70-463 exam on building a data warehouse (I just couldn’t seem to get the last 50-100 points to push me over the top). My personal goals I did okay on though: I passed my nidan test back in September and I am about 1/3 of the way through learning to finger pick Loch Lomond (definitive version is Runrig’s live version) on my tenor uke. I would think the year was failure however I did a lot (especially in the first 3 months):
- Coordinated and ran the first SQL Saturday here in Spokane.
- Presented at said SQL Saturday.
- Managed to keep the local PASS chapter meeting every month (even while I was out of the country with the assistance of my co-leader Elizabeth Hunt (t).
- Attended SQL Sat Oregon, rode the SQL Train to PASS Summit. I finally managed to make the Summit meet-up with other chapter leaders and SQL Saturday organizers: so much learning and networking.
- I started the process for creating a not for profit to handle the money for our SQL Saturday since it was successful enough to warrant another one. We officially incorporated right before Christmas.
- At work I finally got automated backup testing up and running. It’s probably a little over-engineered but it was a solid first effort.
- I also began attending the data governance meetings at work so we could start incorporating our SQL data into their consideration and I got us enrolled in the EAP for Redgate’s Data Catalog.
This last one is by far one of my favorite things this year. I’ve spent a lot of time thinking about how we treat our data but beyond being sure it was encrypted and backed up I haven’t been able to do much else concrete but in the last month of 2018 I was able to start cataloging data in our internally developed databases and get start on one of our bigger vendor databases. Adding this metadata will provide us with a much better view of what kinds of data we are storing and where it is. I’m happy with this initial run at the problem space and I can already see where it starts to answer some questions we should all be asking about the data in our care.
2019 is looking to be another challenging and busy year: I need to refocus and look at my path to an MCSA/MCSE certification. I’m hoping I can take the 70-764 and 765 exams this year. I was hoping to get my 2012 MSCA and upgrade that but I still don’t do enough SSIS work to get myself over the last point hurdle with the 70-463. And being certified on the upcoming versions is probably worth my time.
I’ll finish putting together the Inland Northwest Data Professionals Association in the next couple of months, just in time to run the second SQL Saturday Spokane. I also applied to the Friends of Redgate program and we’ll see what comes of that: that could lead to a whole host of community contributions that I can’t even foresee right now.
I don’t find myself pivoting data often but when I do I seem to always need to do the “complicated” pivot. You know, the one where you have a collection of unique values like email addresses where the value isn’t the column name but you want the 4 email addresses connected to one person in a single row rather than four rows. I can never remember how to do that so here is my personal internet bookmark for that code. Thanks to the late Robert Davis who gave the answer I needed back in 2015. Robert was a wonderful and generous contributor to the SQL Family his site is full of useful information and is still available here.
The solution is to add the categories you want as eventual columns to your result set before pivoting (it sounds simple enough but again then I can never remember it). So if you have say an account with multiple contacts:
This returns a list of results such as:
The one simple trick is to create a value for each row that will be repeated it can be used as a column name when you pivot:
Simply repeat the structure of creating a valid column name (‘String’ + CAST(counter as VARCHAR)) and and you can include a pivot across as many different columns as need to be included. Our short list above now becomes:
Hopefully now in the future I’ll remember I wrote this and I won’t have to go searching for the solution (again).
So this is a thing: I was nominated as an outstanding volunteer for PASS for April 2018. I really appreciate that someone thought I was doing something outstanding. I am bad at both accepting compliments and self promotion. I really believe in the PASS as a positive extension of the SQL community. I know that I wouldn’t be where I am today without the generous and helpful people who want to help myself and others do and be their best meeting the data needs of our respective employers/clients/organizations. It’s only natural that you turn around and help those around you that want to do better for themselves as well. I’m excited for the growth of our users group and the success of our SQL Saturday: I could not have done it alone.
My road to tools has been both short and long. Short because I started out (just a few years ago) and didn’t think I needed much more than a connection to the database and my own brain. Long because I also had not yet had the misfortune of discovering how much pain good tooling can eliminate. It started one crisp winter day as bright eyed junior DBA when I was informed I would be supporting an internal deployment that evening. The first that had occurred since I started about 5 months prior. Our internal databases are relatively small and mostly simple so none of this was particularly frightening.
“They’ll give you a deployment script to run on the database. Make sure you take a backup so you can roll back if you need to, but it shouldn’t take more than a few minutes of your evening. Good luck,” they told me. With an update like that what could go wrong? Pretty much most of the things.
The developer provided me with a script. I dutifully took the database backup, ran it and waited for him to confirm the update had applied successfully.
“Somethings wrong. Hang on a sec. I forgot to include some changes in the script.” Within a few minutes I had a second script to run in addition to the first, which I quickly run against the database. A pause.
“No, it’s still not working as expected. Um, let me check some things…” they sent me. This went on for 30 minutes, an hour. I couldn’t make the call to roll it back and cancel the upgrade and the developer was researching what he might have missed in his change script. I asked often if he had anyway to to tell me a total list of objects he expected to have changed for the deployment to work (“Who needs a staging environment?” past me complained). A “simple” deployment that I had expected to maybe spend 30 minutes (including logging in and waiting around for the start time) dragged on to close to 2 hours. Eventually it was completed and we went our separate, frustrated ways for the evening. The next morning I started researching tools for diffing databases and found Redgate’s SQL Compare. Within a week I had approval to purchase their toolbelt and the next time there was a deployment we scheduled in time for them to review the differences between their test environment and production to come up with a deployment script. It took a little longer but I was able to finally convince the development team they needed their own tooling to manage their database code but after about 18 months we had them using Ready Roll to manage database migrations and now a few years later we are finally getting to the point where we can use a build agent like Octopus Deploy so none of us DBAs have to get up at all.
Besides SQL Compare the other tool that I really notice when I don’t have it (absence makes my heart so much more sad) is SQL Prompt. Having to work on a machine without it now makes me feel like an idiot now that I’m used to have easy access to a library of code snippets (and automatic code formatting: tabs forever!).
It’s not essential (yet) but recently I’ve been trying to use PowerShell more when I can and quickly discovered dbatools which does a lot of the heavy lifting and provides immediate access to a lot of management tasks. My favorite database migration: we used to manually copy backups from one server to another and restore them or detach a database and copy the mdf/ldf to the new server and re-attach which was a pain: I can never remember the exact syntax, I end up having to RDP to the server to move files: ugh. But with a single line of powershell I can take a backup (or backups) to a network location, restore that(those) backup(s) without leaving my own desktop and I’ve barely scratched the surface of what I can do with it: highly recommended!
I’ve been taking notes hoping to share some lessons learned from our deployment of Red Gate’s SQL Clone but they kept it so deliciously simply and the documentation and worked examples for PowerShell are difficultto improve upon (also we aren’t doing anything too terrible fancy yet either). My only gotcha was learning that if you don’t use the redgate extension (.SQB) for backups that SQL Clone won’t recognize that it was compressed and encrypted with SQL Backup (we’ve wrapped an existing backup procedure around the SQL Backup proc that preserved our old naming conventions when we switched to the new backup tool). The work around was simple: replace the backup file extension prior to creating the image from the backup. If you think that you could benefit from cloning your databases to replicate (and mask) production data I encourage you to check it out. In the meantime I’ll see if I can find a use case or work around that needs some more documentation as we roll it out for our test group.