Two Phase Data Migration
I ran into an interesting situation today, where I’m refactoring a
has_and_belongs_to_many target into two tables. Originally I had:
    class User < ActiveRecord::Base
      has_and_belongs_to_many :idxes
    end

    class Idx < ActiveRecord::Base
      has_and_belongs_to_many :users
    end
But now I want:
    class User < ActiveRecord::Base
      has_and_belongs_to_many :associations
    end

    class Association < ActiveRecord::Base
      has_and_belongs_to_many :users
      has_many :idxes
    end

    class Idx < ActiveRecord::Base
      belongs_to :association
    end
The interesting part of this is the data migration. Basically, the idxes_users table needs to be dropped, but not before it’s used to populate the associations_users table. The has_and_belongs_to_many :idxes association also needs to stick around for now, because the migration code is a lot cleaner if I can use the ActiveRecord methods instead of resorting to direct SQL.
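The heart of that data migration is a mapping from the old join rows to the new ones. Here is a rough plain-Ruby sketch of the mapping, not the actual migration, which would go through the ActiveRecord associations. It assumes every Idx has already been assigned an association_id; the method name and the array and hash arguments are hypothetical stand-ins for the tables.

```ruby
# Derive the rows for associations_users from the old idxes_users rows.
#
# idxes_users            - array of [user_id, idx_id] pairs (stand-in for the
#                          old join table).
# association_id_for_idx - hash of idx_id => association_id (stand-in for the
#                          idxes.association_id column, assumed populated).
#
# Returns a sorted array of unique [user_id, association_id] pairs.
def associations_users_rows(idxes_users, association_id_for_idx)
  idxes_users
    .map { |user_id, idx_id| [user_id, association_id_for_idx.fetch(idx_id)] }
    .uniq # several idxes under one association collapse to a single join row
    .sort
end

old_rows = [[1, 10], [1, 11], [2, 11], [3, 12]]
idx_to_association = { 10 => 100, 11 => 100, 12 => 200 }

p associations_users_rows(old_rows, idx_to_association)
```

Only once rows like these have been written to associations_users is it safe to drop idxes_users.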
This kind of bugged me because it means I can’t complete my code and data changes in one chunk. It needs to be broken down into two code updates and two migrations. But how to organize the split? I’d basically completed the code updates before thinking about this, so my choice was to update all the code and then remove the leftovers in a tiny subsequent update.
In the future, however, I might try a different approach: create the minimal data migration that adds the new tables and relationship data up front, before working on any actual code changes. The new database contents can then silently coexist with the old until the full code changes are done, and I can deploy the new version of the app in one shot without any leftovers.
Pratik says…
August 4, 2007 at 5:28AM
It’s quite common practice to redefine the model inside the migration class. I think that should solve your problem.
Richard Livsey says…
August 4, 2007 at 6:57PM
As Pratik says, define your models in the migration, as your migration shouldn’t care about how they are set up in your actual app.
This is also handy because you can define helper methods on the migration’s model to tidy up the migration without putting them in the actual model itself.
In cases where a model’s database structure changes during a migration, you can call Model.reset_column_information to reload it.
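One way to picture why a migration-local model stays separate from the application model is Ruby’s constant lookup. The sketch below uses plain Ruby classes only. The class name SplitIdxesIntoAssociations and the helper method are hypothetical, and in a real migration both the migration class and the nested model would inherit from the ActiveRecord base classes.

```ruby
# Stand-in for the application model in app/models/user.rb (hypothetical).
class User
end

# Hypothetical migration class, shown without its ActiveRecord superclass.
class SplitIdxesIntoAssociations
  # Migration-local model: a different constant from the top-level User.
  class User
    # Helper that only the migration needs; it never reaches the app model.
    def self.migration_helper
      :only_in_migration
    end
  end

  # Inside the migration, a bare `User` resolves to the nested class first.
  def self.model
    User
  end
end

p SplitIdxesIntoAssociations.model          # the nested class
p ::User.respond_to?(:migration_helper)     # the app model is untouched
```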
Gabe da Silveira says…
August 5, 2007 at 1:05AM
Hmm, thanks for the tips guys. That approach never crossed my mind.
Gabe da Silveira says…
August 7, 2007 at 5:35AM
To follow up on this, I found that the ideal solution in this case was to extend rather than redefine the User class. I did this by placing the following at the top of my migration file (outside the actual migration class):
This way I didn’t have to redefine the whole class, only the little bit I needed. The ‘require’ is necessary; otherwise, defining the class will prevent the lazy-loading of the application model.
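The idea of extending rather than redefining can be sketched in plain Ruby. This is an illustration only: ToyAssociations is a hypothetical stand-in for the ActiveRecord class macro, and the first User definition stands in for the application model that the ‘require’ would load in the real migration file.

```ruby
# Toy stand-in for ActiveRecord's class macro so the sketch runs on its own.
module ToyAssociations
  def has_and_belongs_to_many(name)
    declared_associations << name
  end

  def declared_associations
    @declared_associations ||= []
  end
end

# Stand-in for the refactored application model (hypothetical contents).
# In the migration file this definition would arrive via the require.
class User
  extend ToyAssociations
  has_and_belongs_to_many :associations
end

# Top of the migration file: reopen the already-loaded class and add back
# only the old association that the data migration still needs.
class User
  has_and_belongs_to_many :idxes
end

p User.declared_associations
```

Because the class is reopened after it has been loaded, everything from the application model is kept and the old association is simply added on top.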