Seeding Data In Ruby On Rails

Rails seeds are very powerful

What We'll Cover

  • What Are Seeds?
  • Horror Stories On Why Seeds Are Useful
  • Approaches To Seeding
  • Testing Seeds

Before I Start

What Are Seeds?

They're bits of data we use to pre-populate our database.

They could be anything from a User every developer will use for local development, to chunks of data used in all environments.

What Are Seeds?

In Ruby on Rails, we usually load them with the rails db:seed command. You should be able to run it multiple times without fear of it going wrong.

Personally I just run bin/setup & expect it to do everything I need.

What Are Seeds?

Ruby on Rails stores them in the db/seeds.rb file.

▸ app/
▸ bin/
▸ config/
▾ db/
    schema.rb
    seeds.rb

What Are Seeds?

Out of the box, the seeds file looks like:

# This file should contain all the record creation needed to seed the database with its default values.
# The data can then be loaded with the rails db:seed command (or created alongside the database with db:setup).
#
# Examples:
#
#   movies = Movie.create([{ name: 'Star Wars' }, { name: 'Lord of the Rings' }])
#   Character.create(name: 'Luke', movie: movies.first)

What Are Seeds?

If I'm lucky, I'll pick up a project where I can safely rerun the seeds:

# db/seeds.rb

User.find_or_create_by!(email: 'admin@example.com') do |user|
  user.password = "12345678"
  user.password_confirmation = "12345678"
end if Rails.env.development?
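
Why this is safe to rerun can be sketched in plain Ruby. FakeUser below is a hypothetical in-memory stand-in (not ActiveRecord) that only mimics the find-or-create contract, so treat it as an illustration of the idea rather than how Rails implements it:

```ruby
# Hypothetical in-memory stand-in for a model, so the sketch runs
# in plain Ruby without Rails or a database.
class FakeUser
  attr_accessor :email, :password

  @records = []

  class << self
    attr_reader :records

    # Return the matching record if one exists; otherwise build it,
    # yield it so extra attributes can be set, and store it.
    def find_or_create_by!(email:)
      existing = records.find { |user| user.email == email }
      return existing if existing

      user = new
      user.email = email
      yield user if block_given?
      records << user
      user
    end
  end
end

def run_seeds
  FakeUser.find_or_create_by!(email: 'admin@example.com') do |user|
    user.password = '12345678'
  end
end

# Running the seeds repeatedly still leaves a single admin user.
3.times { run_seeds }
puts FakeUser.records.size # prints 1
```

The block only runs when the record is created, so a rerun never overwrites an existing record and never adds a duplicate.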

What Are Seeds?

Sometimes they'd look a bit more like:

# db/seeds.rb

# Last updated 10 years ago - DON'T RUN IN PRODUCTION
TaxRate.find_or_create_by!(origin: 'NL', destination: 'GB') do |tax_rate|
  tax_rate.amount = "1.10"
end

🤔

Horror Stories On Why Seeds Are Useful

The stolen laptop

Horror Stories On Why Seeds Are Useful

Preview Environment started emailing real users

Horror Stories On Why Seeds Are Useful

The poor initial developer experience

Approaches To Seeding

  • Explicitly Defined Seeds
  • Faker Generated Seeds
  • Fixtures & Factories Generated Seeds
  • Anonymised Production Database
  • Plain Old Ruby Objects

Explicitly Defined Seeds

These are the ones you'd find in your db/seeds.rb file.

The files can become very big pretty fast. I've seen them broken up into smaller files before:

# db/seeds.rb
# Load all the files in db/seeds folder
Dir[File.join(Rails.root, 'db', 'seeds', '*.rb')].sort.each do |seed|
  load seed
end
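
Because the file list is sorted, the file names decide the load order. A minimal sketch of that, using a temporary directory in place of db/seeds; the numeric prefixes and the $loaded_seeds global are assumptions made for this illustration, not a Rails convention:

```ruby
require 'tmpdir'

# Records the order in which the hypothetical seed files are loaded.
$loaded_seeds = []

Dir.mktmpdir do |dir|
  # Written out of order on purpose; sorting by name restores the order.
  File.write(File.join(dir, '02_projects.rb'), "$loaded_seeds << 'projects'\n")
  File.write(File.join(dir, '01_companies.rb'), "$loaded_seeds << 'companies'\n")

  Dir[File.join(dir, '*.rb')].sort.each do |seed|
    load seed
  end
end

p $loaded_seeds # prints ["companies", "projects"]
```

Prefixing file names like this is one way to make sure records that others depend on are seeded first.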

Explicitly Defined Seeds

https://github.com/james2m/seedbank

Gives you finer-grained control over the order your seeds are run in:

# db/seeds/companies.seeds.rb
# find_or_create_by_name was removed in Rails 4.1; use the hash form instead
Company.find_or_create_by!(name: 'Hatch')

# db/seeds/projects.seeds.rb
after :companies do
  company = Company.find_by!(name: 'Hatch')
  company.projects.create(title: 'Seedbank')
end

Faker Generated Seeds

https://github.com/faker-ruby/faker

require 'faker'

Faker::Name.name      #=> "Christophe Bartell"

Faker::Internet.email #=> "kirsten.greenholt@corkeryfisher.info"

Faker Generated Seeds

# db/seeds.rb

require 'faker'

User.find_or_create_by!(email: 'admin@example.com') do |user|
  user.name = Faker::Name.name
  user.password = "12345678"
  user.password_confirmation = "12345678"

  user.posts << Post.new(title: Faker::Job.title)
  user.posts << Post.new(title: Faker::Job.title)
  user.posts << Post.new(title: Faker::Job.title)
end if ENV['DURING_RELEASE_SEED_USER'] || Rails.env.development?

Fixtures & Factories Generated Seeds

The command rails db:fixtures:load FIXTURES=users,posts will load fixtures into your current environment.

You could also loop through your factories, but thoughtbot doesn't recommend doing it (it has a lot of shortcomings).

I've not seen this approach used in the wild.

Anonymised Production Database

https://github.com/evilmartians/evil-seed

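require 'evil_seed'

EvilSeed.configure do |config|
  # Start the dump from users created since the beginning of today
  config.root('User', 'created_at > ?', Time.current.beginning_of_day)

  # Replace personal data with Faker-generated values
  config.anonymize('User') do
    name  { Faker::Name.name }
    email { Faker::Internet.email }
  end
end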

EvilSeed.dump('path/to/new_dump.sql')

Plain Old Ruby Objects

Plan = Struct.new(:id, :display_adverts, keyword_init: true) do
  # Falls back to the first plan when the id is unknown
  def self.find(id)
    all.find { |plan| plan.id == id } || all.first
  end

  def self.all
    @all ||= [
      Plan.new(id: 'free', display_adverts: true),
      Plan.new(id: 'premium', display_adverts: false)
    ]
  end

  alias_method :display_adverts?, :display_adverts
end

Plain Old Ruby Objects

class User < ApplicationRecord
  def plan
    @plan ||= Plan.find(plan_id)
  end
end

User.new(plan_id: 'premium').plan.display_adverts?
# Outputs: false
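
The same lookup can be tried without Rails. In this sketch StubUser is a hypothetical stand-in for the ActiveRecord model, and the unknown plan id is made up to show the fallback to the first plan:

```ruby
Plan = Struct.new(:id, :display_adverts, keyword_init: true) do
  def self.all
    @all ||= [
      new(id: 'free', display_adverts: true),
      new(id: 'premium', display_adverts: false)
    ].freeze
  end

  # Unknown ids fall back to the first plan in the list.
  def self.find(id)
    all.find { |plan| plan.id == id } || all.first
  end

  alias_method :display_adverts?, :display_adverts
end

# Hypothetical stand-in for the User model; only holds a plan_id.
StubUser = Struct.new(:plan_id, keyword_init: true) do
  def plan
    Plan.find(plan_id)
  end
end

puts StubUser.new(plan_id: 'premium').plan.display_adverts? # prints false
puts StubUser.new(plan_id: 'unknown').plan.id               # prints free
```

Since the plans live in code, every environment gets the same values on deploy, with nothing to seed.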

Testing Seeds

Yes you can!

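# spec/db/seeds_spec.rb
require 'rails_helper'

RSpec.describe 'Rails.application' do
  describe '#load_seed' do
    subject { Rails.application.load_seed }

    # Running the seeds on an empty database creates one User and one Project
    it 'creates the seed records' do
      expect { subject }.to change(User, :count).by(1)
        .and change(Project, :count).by(1)
    end
  end
end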

What is the best way?

  • You should be able to run rails db:seed multiple times without fear.
  • Use them in a preview environment! You'll be more incentivised to keep them up to date.
  • Plain Old Ruby Objects for data that needs to be consistent across all environments.
  • Use partial dumps of production to investigate exceptions more closely.

Questions?

MikeRogers.io
@MikeRogers0 on Twitter

Please remember to Like/Comment/Subscribe!

Could be a user, could be Tax Rates or chunks of fixed data.

The default sample sucks so hard. It can't comfortably be run multiple times & doesn't account for data that has changed in the DB since seeding.

If I'm lucky they look like this

Maybe you'll also get this, or just a blank file. They were made years ago & were too much of a hassle to keep up to date.

First story: Dev was given a copy of the production DB. Laptop was stolen.

Second story: Preview Environment used real data. It then emailed all the users.

Last Story: Don't make devs jump through hoops to have a decent environment.

Though it hasn't been updated in a while.

I like this, especially in development environments

Don't use this approach. You might have problems with foreign keys.

What if you don't put it in the database at all?

This is fine & my favourite thing to do.

This is one of my favourite bits of code.