Why You Should Avoid Models in Rails Migrations
10 Apr 2021
Humble Beginnings
A simple Rails application exists with two models, Books and Authors.

    class Book < ApplicationRecord
      belongs_to :author
    end

    class Author < ApplicationRecord
      has_many :books
    end

After some domestic success, this simple Rails app goes international. A new column is required on the books table: country to denote which country published the Book.
A Reasonable Migration
To preserve data integrity, the country column should not allow null since null isn't a country. Existing books also need to have a country set. This is all accomplished within a single migration in 3 steps:
- Add a column
- Update column for existing books with a reasonable value
- Add a constraint that the column can not be null

    class AddCountryToBooks < ActiveRecord::Migration[6.1]
      def up
        add_column :books, :country, :string, length: 2
        Book.update_all(country: 'US')
        change_column_null :books, :country, false
      end

      def down
        remove_column :books, :country
      end
    end

This migration does the job but has hidden implications. A developer working in isolation may never run into the trap lurking in this code, but a team could.
Working with Others
Two developers work on this application, divvying up the wild world of books and authors while maintaining a healthy work-life balance. One developer goes on vacation and has a surprise waiting for them when they return.
| | Developer 1 👩‍💻 | Developer 2 👨‍💻 |
|---|---|---|
| Day 1 | Write Code | Write Code |
| Day 2 | Time Off | Create Migration |
| Day 3 | Time Off | Rename Class |
| Day 4 | Time Off | Write Code |
| Day 5 | Run Migrations -> Error | Write Code |
While Developer 1 was away, Developer 2 was busy. They wrote the above migration, ran it to update existing data, and then completed a new feature request: a model rename.
Developer 1 returns, updates their local environment, and runs rake db:migrate:

    == 20210407191819 AddCountryToBooks: migrating ===================
    -- add_column(:books, :country, :string, {:length=>2})
       -> 0.0015s
    rake aborted!
    StandardError: An error has occurred, this and all later migrations canceled:
    uninitialized constant AddCountryToBooks::Book

Book was renamed to Novel, breaking the previous migration.
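The failure comes from ordinary Ruby constant lookup rather than anything specific to the database. The following is a minimal plain-Ruby sketch of that mechanism, assuming no Rails at all: `Novel` and `AddCountryToBooks` are bare stand-in classes, and `run_migration` is a hypothetical helper that exists only to capture the error message.

```ruby
# Stand-in for the renamed model. The constant `Book` is no longer defined.
class Novel
  def self.update_all(attrs)
    "updated novels with #{attrs}"
  end
end

# Stand-in for the migration, which still refers to the old constant name.
class AddCountryToBooks
  def up
    # Ruby looks for Book in AddCountryToBooks, then at the top level,
    # finds nothing, and raises NameError.
    Book.update_all(country: 'US')
  end
end

# Hypothetical helper: runs the stand-in migration and returns the error text.
def run_migration
  AddCountryToBooks.new.up
rescue NameError => e
  e.message
end

message = run_migration
puts message
# The message begins with: uninitialized constant AddCountryToBooks::Book
```

The error names `AddCountryToBooks::Book` because the lookup starts inside the migration class, which is the same shape as the message in the output above.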
Avoiding Models
Using SQL
When used correctly, Rails migrations can count on the database being in an expected state. The application code moves from commit to commit but database structure (should) only change with migrations. Therefore, even if the Book class is now Novel, the database table books still exists at the time of the AddCountryToBooks migration.
Instead of Book.update_all, writing an update query in SQL and using the base ActiveRecord connection ensures this migration survives model changes like a rename. Because we can count on the state of the database structure, referring to the books table is safe while referring to the Book model may not be.

    class AddCountryToBooks < ActiveRecord::Migration[6.1]
      def up
        add_column :books, :country, :string, length: 2
        ActiveRecord::Base.connection.execute(
          "UPDATE books SET country = 'US'"
        )
        change_column_null :books, :country, false
      end

      def down
        remove_column :books, :country
      end
    end

Using Temporary Models
If writing SQL isn't ideal, a temporary model can be used within the migration. Here, a new Book model is defined to manipulate the books table. Even if app/models/book.rb has been changed to app/models/novel.rb, this definition of Book will be valid within the scope of this migration.

    class AddCountryToBooks < ActiveRecord::Migration[6.1]
      class Book < ApplicationRecord; end

      def up
        add_column :books, :country, :string, length: 2
        Book.update_all(country: 'US')
        change_column_null :books, :country, false
      end

      def down
        remove_column :books, :country
      end
    end

Either of these migrations will run correctly for Developer 1, no matter how many model renames happened while they were on vacation.
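Why the temporary model keeps working can be seen without Rails. The sketch below is a plain-Ruby illustration under stated assumptions: the nested `Book` is a hand-written stand-in that only builds a SQL string (a real temporary model would inherit from an Active Record base class), and its `table_name` and `update_all` methods are simplified imitations for demonstration.

```ruby
# The application only knows about Novel after the rename.
class Novel
end

class AddCountryToBooks
  # Migration-local stand-in. Ruby resolves `Book` lexically, so inside this
  # class the name refers to AddCountryToBooks::Book.
  class Book
    def self.table_name
      'books'
    end

    # Simplified imitation of update_all: returns the SQL instead of running it.
    def self.update_all(attrs)
      assignments = attrs.map { |column, value| "#{column} = '#{value}'" }.join(', ')
      "UPDATE #{table_name} SET #{assignments}"
    end
  end

  def up
    Book.update_all(country: 'US')
  end
end

result = AddCountryToBooks.new.up
puts result                                  # UPDATE books SET country = 'US'
puts Object.const_defined?(:Book)            # false: no top-level Book exists
puts AddCountryToBooks.const_defined?(:Book) # true: the migration carries its own
```

Because the constant lives inside the migration class, renaming or deleting the application's model has no effect on what `Book` means here.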
Problem at Scale
It might not seem like that big of a deal to have a single broken migration. But, within a larger team or with more complex migrations, this can easily become a painful hurdle a team needs to navigate.
Allowing migrations to rely on only the database as a source of truth enables all developers on the team to run migrations without issue. That holds until the database or team inevitably grows too large and people restore database dumps instead of running migrations, at which point this issue becomes far less common.