Database Seeding

When spawning an environment, you may want to populate its databases with data before using it. Raftt supports this through database seeding, which is defined as part of the .raftt file.

Seeding Methods

Raftt currently supports three seeding methods -

  • A built-in method to load a .sql dump file into a PostgreSQL database.
  • A built-in method to load a .archive dump file into MongoDB.
  • A user-created script that seeds the databases.

PostgreSQL

Load a PostgreSQL database dump from a .sql file. To allow all team members to share the same initial database state, we recommend committing the dump file to the repo.
This file can be created with pg_dumpall or by running raftt data dump.

To configure seeding of a PostgreSQL database, edit the .raftt configuration file -

db_storage_vol = volume("db_storage")

resources = ... # Load from docker-compose, helm, or k8s manifests
db_pod = resources.pods["db"]

# Use a native PostgreSQL initializer
db_storage_vol.initializer = postgres_volume_initializer(workload=db_pod, dump_file_path="path/to/dump.sql", user="postgres")
db_pod.mount(db_storage_vol, dst="/data")

Link to API Reference.

MongoDB

Load a MongoDB database dump from a .archive file. To allow all team members to share the same initial database state, we recommend committing the dump file to the repo.
This file can be created with mongodump or by running raftt data dump.

To configure seeding of a MongoDB database, edit the .raftt configuration file -

db_storage_vol = volume("db_storage")

resources = ... # Load from docker-compose, helm, or k8s manifests
db_pod = resources.pods["db"]

# Use a native MongoDB initializer
db_storage_vol.initializer = mongodb_volume_initializer(workload=db_pod, dump_file_path="dev_container/dump.archive")
db_pod.mount(db_storage_vol, dst="/data")

Link to API Reference.

Custom Script

If you don't use PostgreSQL or MongoDB, or want a more customized database seeding experience, you can create a custom script.
This gives you complete control over how your databases are populated and works with any database type.
With a custom script, Raftt supports advanced use cases such as:

  • Loading a large dump from S3.
  • Auto-generating DB data.
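
As a rough illustration of the auto-generation use case, a custom seeding script could emit SQL on the fly instead of loading a dump. The sketch below is not part of Raftt: the users table, its columns, and the generate_users_sql function name are all hypothetical and only show the general shape of such a script.

```shell
#!/usr/bin/env bash
set -eu

# Print one INSERT statement per generated row.
# The "users" table and its columns are hypothetical.
generate_users_sql() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        printf "INSERT INTO users (id, email) VALUES (%d, 'user%d@example.com');\n" "$i" "$i"
        i=$(( i + 1 ))
    done
}

# Emit SQL only when a row count is passed,
# e.g.: bash generate_seed.sh 100 > seed.sql
if [ "$#" -gt 0 ]; then
    generate_users_sql "$1"
fi
```

The generated statements can then be piped into the database client from the seeding script, once the database accepts connections.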
Important

The DB process is not guaranteed to be ready to accept connections when the script is executed.
The script should poll the DB and wait until it is ready before seeding it.

To configure custom seeding of your database, edit the .raftt configuration file -

db_storage_vol = volume("db_storage")

resources = ... # Load from docker-compose, helm, or k8s manifests
db_pod = resources.pods["db"]

# Use a custom initializer
db_storage_vol.initializer = script_volume_initializer(workload=db_pod, script="bash seed_db.sh")
db_pod.mount(db_storage_vol, dst="/data")

Link to API Reference.

key_provider

Seeding the database using a seeding script can take a while, especially if it downloads large amounts of data. To save time, Raftt can check if the dump was already used.
To do so, Raftt compares the dump's unique key, which is created by running the script supplied as key_provider.
A simple example is a script that prints the hash of the dump file -

shasum dump.sql
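
A slightly more defensive variant of the one-liner above might print only the digest, so that the key depends on the dump's contents and not on its path. This is a sketch under assumptions, not Raftt's own implementation: the dump_key function name is hypothetical, and it falls back to shasum when sha256sum is not installed.

```shell
#!/usr/bin/env bash
set -eu

# Print only the SHA-256 digest of a file (no file name),
# so the key changes only when the file contents change.
dump_key() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# Print the key only when a dump path is passed,
# e.g.: bash dump_key.sh dump.sql
if [ "$#" -gt 0 ]; then
    dump_key "$1"
fi
```

Any script works as long as it prints the same key for the same dump and a different key when the dump changes.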

Sample Code

The following code sample demonstrates how to poll a MySQL database to ensure it is up and running before seeding it.

#!/usr/bin/env bash
set -e

DB_HOST=db
DB_USER=root
DB_PASSWORD=password
DB_NAME=app_development
DUMP_PATH=dump.sql

# Run a command until it succeeds: retry <max_attempts> <seconds_between_attempts> <command...>
retry() {
    max_attempts="${1}"; shift
    seconds="${1}"; shift
    attempt_num=1

    until "${@}"; do
        if [ "$attempt_num" -eq "$max_attempts" ]
        then
            echo "Attempt $attempt_num failed and there are no more attempts left!"
            exit 1
        else
            echo "Attempt $attempt_num failed! Trying again in $seconds seconds..."
            attempt_num=$(( 1 + attempt_num ))
            sleep "$seconds"
        fi
    done
}

# Wait until MySQL answers a ping (up to 5 attempts, 1 second apart); output goes to stderr
retry 5 1 mysqladmin -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASSWORD" ping 1>&2

# Recreate the database and load the dump
mysql -u "$DB_USER" -p"$DB_PASSWORD" -h "$DB_HOST" -e "DROP DATABASE IF EXISTS $DB_NAME;"
mysql -u "$DB_USER" -p"$DB_PASSWORD" -h "$DB_HOST" -e "CREATE DATABASE $DB_NAME;"
mysql -u "$DB_USER" -p"$DB_PASSWORD" -h "$DB_HOST" "$DB_NAME" < "$DUMP_PATH"
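
The same polling pattern could be adapted to other databases. The sketch below assumes a PostgreSQL server and uses pg_isready to detect readiness before loading the dump with psql. The connection settings, the wait_for and seed_postgres function names, and the RUN_SEED guard variable are hypothetical. It also assumes the target database already exists and that credentials are supplied through the usual PostgreSQL mechanisms (for example the PGPASSWORD environment variable).

```shell
#!/usr/bin/env bash
set -eu

# Run a command until it succeeds, up to max_attempts times,
# sleeping the given number of seconds between tries.
wait_for() {
    max_attempts="$1"; shift
    seconds="$1"; shift
    attempt_num=1
    until "$@"; do
        if [ "$attempt_num" -ge "$max_attempts" ]; then
            echo "Gave up after $attempt_num attempts" >&2
            return 1
        fi
        attempt_num=$(( attempt_num + 1 ))
        sleep "$seconds"
    done
}

# Hypothetical connection settings; adjust to your environment.
DB_HOST="${DB_HOST:-db}"
DB_USER="${DB_USER:-postgres}"
DB_NAME="${DB_NAME:-app_development}"
DUMP_PATH="${DUMP_PATH:-dump.sql}"

seed_postgres() {
    # pg_isready exits with 0 once the server accepts connections.
    wait_for 30 1 pg_isready -q -h "$DB_HOST" -U "$DB_USER"
    # Stop at the first SQL error instead of continuing with a partial seed.
    psql -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 -f "$DUMP_PATH"
}

# RUN_SEED is an assumed guard variable: seed only when it is set to 1.
if [ "${RUN_SEED:-0}" = "1" ]; then
    seed_postgres
fi
```

Unlike the MySQL sample, wait_for returns a non-zero status instead of calling exit, which leaves the decision about how to handle a database that never becomes ready to the caller.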

raftt data command

To explicitly initiate actions on the databases - dump, save, load, and seed - you can use the raftt data command.

Rebuilding a Workload

When rebuilding a workload using raftt rebuild, the default behavior is to keep the state of the databases defined in raftt.yml as-is, without reseeding them. To reseed the databases when rebuilding a workload, use raftt rebuild -r.