Is Your Database Under Version Control?

yes yes Yes YEs YES YES YES!!!

I came in late to a “fast track” project. The developer’s “source control” were copies of the code on their machine. I think they might have been using VSS.

I quickly migrated everything into subversion.

The DB was even worse. Higglety pigglety schema changes … deploying to production consisted of making a copy of the dev database. Because, of course “how else can we do it?!!”

I knew there had to be a way. A few quick googles turned up “Evolutionary Database Design” by Martin Fowler. (http://www.martinfowler.com/articles/evodb.html) Last significant update January of 2003…but the principles (and tools, really) are still applicable.

I clamped down, kicked and screamed (metaphorically), and insisted that EVERY FRICKIN CHANGE made to the schema had to be documented in a .sql file and checked into source control. We (QA) then tested each new build with a database (you guessed it) built and populated from scripts checked into source.

Build reliability skyrocketed. (amazing, that)

Of course, the attitude of “we just need to get it built for the client and sold, then we can fix it” was pervasive all the way to the top. Thus, the project still cratered. And it was still, really, a very straightforward, small scale web app.

[sigh]

I’ll get off my soapbox now… =^)

I think many of the posters are missing an important difference between traditional code and database schemas.

Code is just text. When we write a new version of a chunk of code and check it in, we are just recording the new state of the code, not how we got there. True, many source control systems will in fact compute and store the delta between versions, but a) this is automatic - hidden from the user and b) it is deterministic.

With database structures, we are not recording the version states themselves, but the transitions between them - arcs vs. nodes, if you like.

This implies a lot of extra work. After I’ve designed and made changes to the database, debugged and iterated, then I have to go back and manually compose a change script that gets me from the old state to the new one. This typically involves a few tries and sometime laborious experimentation all by itself - all overhead work and certainly error-prone.

Yes, there are SQL “diff” tools available that can help with this, but even those can’t necessarily handle all the cases.

A common one is that I’m adding a new column to a table, which should be non null. My tool has to be clever enough to add the column allowing nulls, then update the column with its new values, then change it to be non-null as desired. I haven’t seen a tool that can seamlessly integrate schema and data changes like this.

None of this diminishes the need for database source control, just trying to point out why its not trivial.

We have solved this by developing a database installer where we keep our database model (302 tables, 430 stored procedures) in DDL scripts. Our installer first checks if the model is already installed in the target server, and if it is then it tries to move all existing data to the new structure by comparing the existing and new versions dynamically, so no data is lost during the installation process.

As this installer is one of our software projects, it is under version control along with our other modules.

Since Subtext is more of a product that gets deployed by end-users, we have the the scripts in the source tree as embedded resources.

The application actually runs the installation scripts and any upgrade scripts before recreating all stored procs.

The one downside is maintaining these scripts can be is a pain. In another project, I use the standard VS.NET database project, and then have a build step that concatenates all the stored procs into a single embedded resource. I have yet to apply that to Subtext.

I have all my databases, dev, test, prod under version control already with some very good open source scm tools. It’s going to be a challenge convincing me to spend $$ on Team Edition for DB when what I have works great and costs nothing.

Maybe somebody familiar with Team Database can help me here, but I was disappointed that this is a separate app, and not integrated into existing VS Pro products. We recently did a large project where this was an issue, and after trying a database project, decided on a custom solution that executes DDL scripts in a certain order. When I saw MS was addressing this, I was excited, but I was disappointed that it was to be a separate app only available as part of Team System. I think a more common situation is the 2-5 member team where each member contributes to the database design. I’d rather see this as a project add-in to Visual Studio Pro.

My current workplace stores all the procedures in scripts and runs them against the database, keeping the scripts in subversion. I think it would be better to have the whole schema under source control, can anyone recommend me a product for Oracle that does this?

Druid III Database Manager is a very flexible free GUI database designer/manager. It doesn’t have all of the features of some of the commercial offerings, but it is extremely flexible (and open source)… and supports a wide variety of databases. The UI is a bit quirky, but once you’re used to it it all makes sense. It storage is neatly formatted XML, so easy to check into version control (subversion). I personally always check in the .druid XML file and the generated SQL file simultaneously… so others don’t have to have druid to be ready to set up a database.

My generated SQL schema is currently 600+k … druid keeps it all organised in a tree view. And it can create E/R diagrams to help illustrate joins…

http://druid.sourceforge.net/

For those who were wondering about how this product ties into Team System; I may have found your answer. I was pointed to a blog entry on the topic of how to upgrade a copy of Team Suite to include this functionality.

http://blogs.msdn.com/camerons/archive/2006/12/08/final-v1-binaries-on-msdn.aspx

The short version is that if you have Team Suite installed already, you can grab the 20mb “Trial Edition” of VSTS for DB Professionals and it will upgrade your Team Suite to include the new features. Seems a lot easier than downloading and installing both Team Suite and VSTSDB since they are each 4gb downloads from MSDN. Hope this helps.

Well i like the tool, but am stunned of it’s inability to be automated. When i create software in a team the need for a daily build is very legit. I’ve seen no way that this VS.Net for DB prof supports automation…therefore i cannot generate changescripts etc. Information about pricing is not available…so i’m putting my money on RedGate (SQLBundle) i know it’s working and doing stuff I want…

This is actually a fairly deep topic that you guys are only hitting the surface of. One of the posters above talked about versioning the arc not the node…which is true. You need db change scripts. Unfortunately, most production systems require actual data to function. So, if you want your version control system to be able to do more than simply upgrade and existing system (aka create a new environment), then you need to store the data or dce. So far, the best solution I can come up with is to actually stick and export of the db in version control on the releases and the have change scripts that modify them until you get the to next release or stored export.

If anybody has a better solution to this, I would appreciate it.

You should take a look at this : http://macournoyer.wordpress.com/2006/09/20/database-migration-for-net/
(Migration for .NET)

Do you mean keeping the Schema and Test Fixtures under Version Control or dumping the contents of each of the tables out?

I’d think the latter would be kinda… bad in a production app? Non?

Any recommendations for MySQL version control?

We use Subversion which keeps the database repo. It is done a daily basis using a cron job.

VSTS for Databases is a good tool but it costs $7000 per developer (as you either have to buy the full Team Suite licence ($7,000) or buy one of the “role specific” licences (Developer, Tester, Architect) at $3,500 and then you can add VSTS for a further $3,500).

We’ve been doing this for years at Innovartis but, as usual, Microsoft can come late into the market and make it seem like they’re innovating :wink:

Check out a href="http://www.innovartis.co.uk"www.innovartis.co.uk/a for a realistically priced tool ($350) that does the job better, especially as it combines the schema and data upgrade in one step. This means that it can handle the addition of a NOT NULL column to the source as it understands the schema AND data at the same time.

Malc

Can anyone help me? How about a CMS database? The data itself contains templating code (in some records), thus any development involves data changes. Likewise, and content management involves data changes.

Now here’s the clincher - we want to “fork” our development into a long-term-changes “main trunk” and short-term-fixes “interim branch”.

This means software developers will be updating data records for two purposes: 1) maintaining data which is also maintainable by content editors (for example, “adding a new menu”, or “adding a link to an existing page”), and 2) maintaining data which content editors do not have access to (for example, updating a template content item to include new content variables).

All the while content editors will be continuously updating their normal stuff like text and images and creating new pages.

AND there will be two branches of development occurring with the need to merge changes when the short-term-fixes are deployed.

Suggestions?

Why don’t other organisations have this problem? What is a more practical approach? Or do we just not hear about it when others face the same scenario?

TIA.

Chris.

Chris,

I work for a company that has that EXACT problem. We maintain CMS systems for our clients, and everything you described is spot-on. Add do it, development, UAT, and live production servers, and you get quite a mess.

Managing code is easy. But managing a database, is different. Especially when like you said, doing things like updating content, or adding a menu option, almost always make changes to the database.

I have yet to find an all-encompassing solution for this.

Chris, Vic,

Managing the code is a lot easier than managing databases, yes.

My one piece of advice is: Test the heck out of the problem.

This is going to sound like a plug and, to an extent it is, but I really think I can help you with this problem. My company makes a product that significantly reduces the pain involved in version controlling databases. With it, we write one script that is used in /all/ of our tests that touch the database: including acceptance tests, certain unit tests, and (of course) tests for the transition from one version to the next. We also use the same script for our test servers and for our production servers. The technology is set up to ensure that the structure, always grows in exactly the same way, even though the information (which is entered, ultimately, by customers) is necessarily uncontrolled.

Check it out:

http://www.hexsw.com/Products/Components/DataConstructor/Default.aspx

If you have any questions, you can email me directly at max [at] hexsw.com.

Good post and interesting tool that I wasn’t aware of. I can’t seem to find any pricing information on the VS tool though?

So far I’ve been grappling with the kludgy scripting solution. It works, but not surprisingly it isn’t usually as up-to-date as the application code.

Also, do you know how well that tool handles deployment/hotfix scenarios? I always cringe whenever a bug gets discovered in a production database because I end up having to “fix” it 3 times. Can it automate the process of committing a development change to a test/production server?