Darwinweb

Database Migration Strategies

November 27, 2005     

So I’m hard at work on Revision 3, and let me tell you, it’s hot! User accounts, comments and gravatars are on the agenda. All this is coming together in about 8 hours of work. I think I’m starting to get the hang of this Rails thing. But I have a confession to make.

Even though I started out using ActiveRecord::Migration, for Revision 3 I’ve just been making changes by hand in CocoaMySQL. This is bad. Bad, Bad, BAD. The whole point of this mess is agility, and manual database updates spell trouble. So what are the options?

Migrations

ActiveRecord migrations are a very handy tool but, unlike most things in Rails, they don’t seem fully optimized for true agile development. My first problem is that they don’t declare things NOT NULL (is there a way to change this?). NULL values in MySQL tables are typically bad juju because they can become indistinguishable from unmatching foreign keys in certain kinds of joins. Maybe this is a silly thing to obsess over, but I can’t shake the uneasy feeling in the pit of my stomach. By comparison, giving up SETs and ENUMs was but a moment’s hesitation.

NULLs are hardly a deal-breaker though. My biggest issue is that migrations aren’t really dynamic enough in rapid development situations. I’m constantly adding indexes and fields one at a time. Creating and running a migration takes less than a minute, but tweaking something in a GUI frontend takes only a couple seconds. Couple that with 20 extra files for every revision and migrations start to seem downright wasteful.

The Save & Restore Approach

For my purposes a simpler approach (which I’ll be using for Revision 3 at least), is to simply dump the production database content, install the new schema from scratch, and then re-insert the data. Of course this approach has some serious limitations. For one thing it doesn’t work if field names change, and application versions can’t be easily rolled back. In fact, I don’t have anything good to say about this except that it’s quick and easy.

Database Diff

Zach wrote a very cool command line utility called dbdiff. This utility dumps out two mysql schemas and compares them, presenting a list of differences. Of course dbdiff can’t distinguish field name modifications from creation and deletion. It also needs two database schemas to compare, which means that if you’ve been changing your database willy-nilly you better have a schema backed up to make the comparison. It’s definitely not an automatic solution, but I think it provides a critical piece of functionality to migrations users.

Hybrid Solution

Migrations would be ideal for me if I could create them at the end right before I deploy. By leveraging the power of switchtower to run remote commands, my goal is to be able to dbdiff the production and the development databases with a single command.

With that infrastructure, migrations could be created automatically and then hand-tweaked by the developer. This could be my first Rails patch.