
I would like to run a script (to populate my MySQL Docker container) only when my Docker containers are built. I'm running the following docker-compose.yml file, which contains a Django container.

version: '3'

services:
  mysql:
    restart: always
    image: mysql:5.7
    environment:
      MYSQL_DATABASE: 'maps_data'
      # So you don't have to use root, but you can if you like
      MYSQL_USER: 'chicommons'
      # You can use whatever password you like
      MYSQL_PASSWORD: 'password'
      # Password for root access
      MYSQL_ROOT_PASSWORD: 'password'
    ports:
      - "3406:3406"
    volumes:
      - my-db:/var/lib/mysql

  web:
    restart: always
    build: ./web
    ports:
      # to access the container from outside
      - "8000:8000"
    env_file: .env
    environment:
      DEBUG: 'true'
    command: /usr/local/bin/gunicorn maps.wsgi:application -w 2 -b :8000
    depends_on:
      - mysql

  apache:
    restart: always
    build: ./apache/
    ports:
      - "80:80"
    #volumes:
    #  - web-static:/www/static
    links:
      - web:web

volumes:
  my-db:

I have this web/Dockerfile

FROM python:3.7-slim
RUN apt-get update && apt-get install
RUN apt-get install -y libmariadb-dev-compat libmariadb-dev
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc \
    && rm -rf /var/lib/apt/lists/*
RUN python -m pip install --upgrade pip
RUN mkdir -p /app/
WORKDIR /app/
COPY requirements.txt requirements.txt
RUN python -m pip install -r requirements.txt
COPY entrypoint.sh /app/
COPY . /app/
RUN ["chmod", "+x", "/app/entrypoint.sh"]
ENTRYPOINT ["/app/entrypoint.sh"]

and these are the contents of my entrypoint.sh file

#!/bin/bash
set -e
python manage.py migrate maps
python manage.py loaddata maps/fixtures/country_data.yaml
python manage.py loaddata maps/fixtures/seed_data.yaml
exec "$@"

The issue is, every time I run "docker-compose up", the entrypoint.sh script runs its commands again. I would prefer the commands only run when the container is first created, but they seem to run every time the container is restarted. Is there any way to adjust what I have to achieve this?

  • replace ENTRYPOINT ["/app/entrypoint.sh"] with RUN /app/entrypoint.sh. Your script will be run on build. Commented Feb 19, 2020 at 17:10
  • @Jean-JacquesMOIROUX That won’t work: you can’t access databases or other resources managed by Docker Compose during the build phase. Commented Feb 19, 2020 at 17:49
  • @Jean-JacquesMOIROUX, unfortunately, that fails with a Django error, "KeyError: 'DB_NAME'" followed by "ERROR: Service 'web' failed to build: The command '/app/entrypoint.sh' returned a non-zero code: 1." "DB_NAME" is an environment variable defined in an ".env" file (at the same level as docker-compose.yml) Commented Feb 19, 2020 at 17:49
  • You can, however, put arbitrary code or logic into your entrypoint, so you "just" need to detect if the seed data is already there, and wrap the relevant loading steps into a Bourne shell if ... fi conditional (see the sketch after these comments). Commented Feb 19, 2020 at 17:50
  • Hi @DavidMaze, thx for this idea, since I didn't have any. Is there a more standard way of doing something like this? It doesn't seem like that odd of a request to me, but then again, I'm quite unfamiliar with Docker. Commented Feb 19, 2020 at 20:02
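
For illustration, here is a minimal sketch of the conditional entrypoint suggested in the comments above. It assumes the maps app has a Country model that the fixtures populate — adjust the check to whatever marker fits your data:

#!/bin/bash
set -e
python manage.py migrate maps
# Hypothetical guard: load fixtures only while the Country table is still empty.
SEEDED=$(python manage.py shell -c "from maps.models import Country; print(Country.objects.exists())")
if [ "$SEEDED" != "True" ]; then
    python manage.py loaddata maps/fixtures/country_data.yaml
    python manage.py loaddata maps/fixtures/seed_data.yaml
fi
exec "$@"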

5 Answers


An approach that I've used before is to wrap your loaddata calls in your own management command, which first checks if there's any data in the database, and if there is, doesn't do anything. Something like this:

# your_app/management/commands/maybe_init_data.py
from django.core.management import call_command
from django.core.management.base import BaseCommand

from address.models import Country


class Command(BaseCommand):
    def handle(self, *args, **options):
        if not Country.objects.exists():
            self.stdout.write('Seeding initial data')
            call_command('loaddata', 'maps/fixtures/country_data.yaml')
            call_command('loaddata', 'maps/fixtures/seed_data.yaml')

And then change your entrypoint script to:

python manage.py migrate
python manage.py maybe_init_data
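
For completeness, the full entrypoint.sh with this change, keeping the set -e and exec "$@" lines from the original, would be:

#!/bin/bash
set -e
python manage.py migrate
python manage.py maybe_init_data
exec "$@"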

(Assumption here that you have a Country model - replace with a model that you do actually have in your fixtures.)
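
(One more detail worth noting: Django only discovers custom management commands when the package structure is in place, so each directory needs an __init__.py:)

your_app/
    management/
        __init__.py
        commands/
            __init__.py
            maybe_init_data.py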




Seeding your database on its first run is a very common case. As others have suggested, you can change your entrypoint.sh script, apply some conditional logic to it, and make it work the way you want.

But I think it is better practice to separate the logic for seeding the database from the logic for running the services, rather than keeping them tangled together. That tangling might cause unwanted behavior in the future.

I was going to suggest a workaround using docker-compose, and started searching for syntax to exclude some services from docker-compose up, but found out this is still an open issue. However, I found this Stack Overflow answer, which suggests a very nice approach.

version: '3'

services:
  all-services:
    image: docker4w/nsenter-dockerd  # you want to put some small image here
    command: sh -c "echo start"
    depends_on:
      - mysql
      - web
      - apache

  mysql:
    restart: always
    image: mysql:5.7
    environment:
      MYSQL_DATABASE: 'maps_data'
      # So you don't have to use root, but you can if you like
      MYSQL_USER: 'chicommons'
      # You can use whatever password you like
      MYSQL_PASSWORD: 'password'
      # Password for root access
      MYSQL_ROOT_PASSWORD: 'password'
    ports:
      - "3406:3406"
    volumes:
      - my-db:/var/lib/mysql

  web:
    restart: always
    build: ./web
    ports:
      # to access the container from outside
      - "8000:8000"
    env_file: .env
    environment:
      DEBUG: 'true'
    command: /usr/local/bin/gunicorn maps.wsgi:application -w 2 -b :8000
    depends_on:
      - mysql

  apache:
    restart: always
    build: ./apache/
    ports:
      - "80:80"
    #volumes:
    #  - web-static:/www/static
    links:
      - web:web

  seed:
    build: ./web
    env_file: .env
    environment:
      DEBUG: 'true'
    entrypoint: /bin/bash -c "/bin/bash -c \"$${@}\""
    command: |
      /bin/bash -c "
      set -e
      python manage.py loaddata maps/fixtures/country_data.yaml
      python manage.py loaddata maps/fixtures/seed_data.yaml
      /bin/bash || exit 0
      "
    depends_on:
      - mysql

volumes:
  my-db:

If you use something like the above, you will be able to run the seeding stage before running docker-compose up.

For seeding your database, run:

docker-compose up seed 

For running all your stack, use:

docker-compose up -d all-services 

I think it is a clean approach, and it can be extended to many different scenarios and use cases.

UPDATE

If you really want to be able to run the whole stack all together and also prevent unexpected behavior caused by running the loaddata command multiple times, I would suggest you define a new Django management command that checks for existing data. Look at this:

checkseed.py

import sys

from django.core.management.base import BaseCommand

from project.models import Country  # or whatever model you have seeded


class Command(BaseCommand):
    help = 'Check if seed data already exists'

    def handle(self, *args, **options):
        if Country.objects.exists():
            self.stdout.write(self.style.WARNING('Data already exists .. skipping'))
            # Exit non-zero so that a shell chain like `checkseed && loaddata ...`
            # stops here; returning False from handle() would not set the exit code.
            sys.exit(1)
        # do all the checks for your data integrity
        self.stdout.write(self.style.SUCCESS('Nothing exists'))

And after this, you can change your seed part of docker-compose as below:

  seed:
    build: ./web
    env_file: .env
    environment:
      DEBUG: 'true'
    entrypoint: /bin/bash -c "/bin/bash -c \"$${@}\""
    command: |
      /bin/bash -c "
      set -e
      python manage.py checkseed &&
      python manage.py loaddata maps/fixtures/country_data.yaml
      python manage.py loaddata maps/fixtures/seed_data.yaml
      /bin/bash || exit 0
      "
    depends_on:
      - mysql

This way, you can be sure that if anyone runs docker-compose up -d by mistake, it will not cause integrity errors and problems like that.

10 Comments

I really like this idea of separating the seed stuff into a separate portion of the docker-compose file, but I have a couple of questions. Before anything exists at all, you're saying I have to run "docker-compose up -d all-services" to build everything, then "docker-compose down" to stop stuff, then "docker-compose up seed" to seed what I just built, then "docker-compose up" to bring everything up again?
I think you just need to first run docker-compose up seed and after that, run docker-compose up -d all-services. You see, in the first step, a container will be created, and after running the commands related to seeding, the container will stop. So you don't need to worry about any leftovers.
If I were to just run "docker-compose up" without "-d all-services", what would get run?
I think you want to keep -d part of the command. If you were to run docker-compose up -d, a scenario very similar to what you have right now (without my suggested changes) will be run. Technically, if you just run docker-compose up -d, your whole stack + a dummy container (with docker4w/nsenter-dockerd image) will get deployed. Plus, one container will start and exit right after doing its job.
Unfortunately I fear people don't read documentation and some folks are going to run "docker-compose up" and that is why I'm asking what will happen if they do that.

Instead of using the entrypoint.sh file, why not just run the commands in the web/Dockerfile?

RUN python manage.py migrate maps
RUN python manage.py loaddata maps/fixtures/country_data.yaml
RUN python manage.py loaddata maps/fixtures/seed_data.yaml

That way, these changes will be baked into the image, and when you start a container from the image, these changes will already have been executed.

1 Comment

Thanks but this results in an error, "KeyError: 'DB_NAME', ERROR: Service 'web' failed to build: The command '/bin/sh -c python manage.py migrate maps' returned a non-zero code: 1"

I had a similar case recently. Since the "ENTRYPOINT" contains the command that is executed every time the container starts, a solution would be to include some logic in the entrypoint.sh script to avoid applying the updates (in your case, the migration and the load of the data) if the effects of these operations are already present on the database.

For example:

#!/bin/bash
set -e

# Function that verifies whether the effects of the migration and the data
# load are already present on the database
function checkEffects() {
    IS_UPDATED=0
    # Check effects and set IS_UPDATED to 1 if the effects are not present
}

checkEffects

if [[ $IS_UPDATED == 0 ]]
then
    echo "Database already initialized. Nothing to do"
else
    echo "Database is clean. Initializing it"
    python manage.py migrate maps
    python manage.py loaddata maps/fixtures/country_data.yaml
    python manage.py loaddata maps/fixtures/seed_data.yaml
fi

exec "$@"
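
As an illustration only, here is one possible way to fill in checkEffects. This is a sketch that assumes the mysql client is installed in the web image, that the MYSQL_USER and MYSQL_PASSWORD variables are available to the container, and that the maps_data.maps_country table name matches what the fixtures create:

function checkEffects() {
    IS_UPDATED=0
    # Count rows in a table the seed data populates; if the query fails
    # or returns zero, treat the database as not yet initialized.
    ROWS=$(mysql -h mysql -u "$MYSQL_USER" -p"$MYSQL_PASSWORD" -N \
        -e "SELECT COUNT(*) FROM maps_data.maps_country" 2>/dev/null || echo 0)
    if [ "$ROWS" -eq 0 ]; then
        IS_UPDATED=1
    fi
}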

However, the scenario is more complex in practice, because verifying the effects that let you decide whether or not to proceed with the updates can be quite difficult if they involve multiple tables and datasets. Moreover, it becomes very complex if you think about container upgrades over time.

Example: today you're working with a local Dockerfile for your web service, but in production you'll probably start versioning this service and uploading it to a Docker registry. So when you upload your first release (for example the 1.0.0 version), you'll specify the following in your docker-compose.yml:

web:
  restart: always
  image: <DOCKER_REGISTRY_HOST>:<DOCKER_REGISTRY_PORT>/web:1.0.0
  ports:
    # to access the container from outside
    - "8000:8000"

Then you'll release the "1.2.0" version of the web service container, which includes other changes to the schema, for example loading other data in entrypoint.sh:

# 1.0.0 updates
python manage.py migrate maps
python manage.py loaddata maps/fixtures/country_data.yaml
python manage.py loaddata maps/fixtures/seed_data.yaml

# 1.2.0 updates
python manage.py loaddata maps/fixtures/other_seed_data.yaml

Here you'll have 2 scenarios (let's ignore for now the need to check for effects in the script):

1 - You deploy your services for the first time with web:1.2.0: as you start from a clean database, you need to be sure that all updates are executed (both 1.0.0 and 1.2.0).

This case is easy to solve, because you can simply execute all the updates.

2 - You upgrade the web container to 1.2.0 in an existing environment where 1.0.0 was running: as your database has already been initialized with the 1.0.0 updates, you need to be sure that only the 1.2.0 updates are executed.

This case is harder, because you need to know which version has already been applied to the database in order to skip the 1.0.0 updates. That means you would have to store the applied web version somewhere in the database, for example.

Given all this, I think the best solution is to work directly on the scripts that create the schema and populate the data, making those instructions idempotent, with particular attention to the upgrade ones.

Some examples:

1- Create a table

Instead of creating the table as follows:

CREATE TABLE country 

use IF NOT EXISTS to avoid a "table already exists" error:

CREATE TABLE IF NOT EXISTS country 

2- Insert default data

Instead of inserting data without a primary key specified:

INSERT INTO maps.country (name) VALUES ("USA"); 

include the primary key in order to avoid duplicates:

INSERT INTO maps.country (id,name) VALUES (1,"USA"); 
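
The Django counterpart of this primary-key trick is to give every object in your fixtures an explicit pk: loaddata then re-saves the same rows on repeated runs instead of creating duplicates. A sketch, assuming a maps.Country model with a name field:

# maps/fixtures/country_data.yaml — an explicit pk makes repeated
# loaddata runs update-in-place rather than insert new rows
- model: maps.country
  pk: 1
  fields:
    name: USA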

2 Comments

The CREATE SQL is generated using Django's migration process, and the INSERTs are run through "python manage.py loaddata maps/fixtures/country_data.yaml". How do I do what you say while keeping the Django and Python conventions?
You've assigned the bounty and confirmed another answer as correct. Didn't it solve the issue completely? If not, I can look into the details of this for Django.

Usually build and deploy steps are separated.

Your ENTRYPOINT is part of the deploy. If you want to control manually which deploy runs should run the migrate commands and which should just replace the containers with new ones (maybe from a fresh image), then you can split it into separate commands:

Start the database (if not running):

docker-compose -p production -f docker-compose.yml up -d mysql

migrate

docker run \
    --rm \
    --network production_default \
    --env-file docker.env \
    --entrypoint python \
    my-backend-image-name:prod \
    manage.py migrate maps

and then deploy the fresh image:

docker-compose -p production -f docker-compose.yml up -d 

Then, each time, you decide manually whether to run the migrate step or not.
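
If you'd rather not type these by hand, a small wrapper script could make that decision explicit. The deploy.sh below is hypothetical; it just sequences the commands above behind a RUN_MIGRATE flag:

#!/bin/bash
set -e

docker-compose -p production -f docker-compose.yml up -d mysql

# Run the migrate step only when explicitly requested:
#   RUN_MIGRATE=1 ./deploy.sh
if [ "${RUN_MIGRATE:-0}" -eq 1 ]; then
    docker run \
        --rm \
        --network production_default \
        --env-file docker.env \
        --entrypoint python \
        my-backend-image-name:prod \
        manage.py migrate maps
fi

docker-compose -p production -f docker-compose.yml up -d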

4 Comments

I'm happy to move the seeding of the database into a build phase (as opposed to a deploy phase), but then does that mean "docker-compose up" is doing both a build and deploy each time it is run?
no, for build you should call docker-compose -p production -f docker-compose.yml build
But if "docker-compose up" doesn't build anything, why do I see images downloaded in the docker console and then I'm able to access my services on the various ports?
It uses images that are available on the host (if the service has image: in the yml); if there is no image on the host, it pulls the image from the repo, and if there is no image: in the yml, it builds according to build:.
