
I have the following Dockerfile

FROM postgres:9.6.18
ENV POSTGRES_PASSWORD postgres
ENV POSTGRES_DB import
COPY docker/admin-db/*.sql /docker-entrypoint-initdb.d/

This works fine, but every time I build and start up my cluster (docker-compose, combined with an api container) it takes about 2 minutes to load all the SQL files (they contain test data). This is not very agile, so I would like to load the data at image build time, not when the container starts. As the database image will not change frequently, the layers that load the data would usually come from the build cache.

How can I start the container during image creation, so that the data does not need to be loaded every time the container starts?

3 Answers


I have read quite extensively on this topic and found that the vast majority of recommendations involve Docker volumes. But I was convinced that there must be another way, and I've come up with a solution that works great.

  • Make a directory called init-scripts and put initialization SQL in there.
  • Paste the bash script below into build-bootstrapped-postgres-docker-image.sh alongside said directory.
  • Change POSTGRES_DB and POSTGRES_PASSWORD in the script if you wish.
  • Make the script executable.
  • Execute: ./build-bootstrapped-postgres-docker-image.sh postgres:12.9-alpine your-db 1.0

This will build a pre-populated Postgres image your-db:1.0 on top of postgres:12.9-alpine.


#!/bin/bash
set -e

# set -o xtrace

PG_IMAGE_NAME=$1   # base image, e.g. postgres:12.9-alpine
IMG_NAME=$2        # name of the image to build
IMG_TAG=$3         # tag of the image to build
IMG_FQN="$IMG_NAME:$IMG_TAG"

CONTAINER_NAME="$IMG_NAME-$IMG_TAG-container"

echo 'killing any existing container running with the same name'
docker kill "$CONTAINER_NAME" 2>/dev/null || true

echo 'running postgres container and bootstrapping schema/data... please wait.'
# Start the container with an idle bash entrypoint so the bootstrap can be
# driven step by step. PGDATA=data resolves to /data (the working directory
# is /), deliberately *outside* the volume declared by the base image,
# because docker commit does not capture volume contents.
docker container run \
  --rm \
  --interactive \
  --tty \
  --detach \
  --volume "${PWD}/init-scripts":/docker-entrypoint-initdb.d \
  --name "$CONTAINER_NAME" \
  --entrypoint /bin/bash \
  --env POSTGRES_DB=database \
  --env POSTGRES_PASSWORD=password \
  --env PGDATA=data \
  "$PG_IMAGE_NAME"

# Run the stock entrypoint in the background: it initializes the cluster,
# executes the scripts in /docker-entrypoint-initdb.d, then starts postgres.
docker container exec -d "$CONTAINER_NAME" sh -c 'docker-entrypoint.sh postgres >> bootstrap.log 2>&1'

echo 'waiting for container... this may take a while'
# Block until the server logs that it is listening, i.e. the bootstrap is done.
grep -q 'IPv4' <(docker exec "$CONTAINER_NAME" tail -f /bootstrap.log)

echo 'removing the initialization SQL files'
docker container exec "$CONTAINER_NAME" rm -rf /docker-entrypoint-initdb.d/*

echo 'stopping pg'
docker container exec -u postgres "$CONTAINER_NAME" pg_ctl stop -D /data

echo 'committing the container to a new image'
# Restore the stock entrypoint and command so the committed image behaves
# like the original postgres image, just with the data already in place.
docker container commit \
  --change='CMD ["postgres"]' \
  --change='ENTRYPOINT ["docker-entrypoint.sh"]' \
  --change='USER postgres' \
  "$CONTAINER_NAME" "$IMG_FQN"

# cleanup!
docker kill "$CONTAINER_NAME"

echo "successfully built $IMG_FQN"

Now you can just run the container:

docker run your-db:1.0
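
Since the question's setup uses docker-compose, the pre-built image can also be referenced there directly. A minimal sketch, following the compose style used elsewhere on this page (the service names and api build context are placeholders, not part of the script above):

  db:
    image: your-db:1.0      # the pre-populated image built by the script
  api:
    build: .                # placeholder for the question's api container
    depends_on:
      - db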



If you want a simple solution that's a bit manual, then you would:

  1. Start the postgres container the normal way to apply the test data.
  2. Exit it, and run docker commit to save the container as a new image.
  3. Use that image as the basis for your testing; a sketch of this flow follows.
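
A minimal sketch of that manual flow, assuming the question's Dockerfile layout (the seed-db and my-test-db names are examples; PGDATA is pointed outside the volume declared by the base image, because docker commit does not capture volume contents):

# 1. Run the container once so the init scripts load the test data.
docker run -d --name seed-db \
  -e POSTGRES_PASSWORD=postgres -e POSTGRES_DB=import -e PGDATA=/data \
  -v "$PWD/docker/admin-db":/docker-entrypoint-initdb.d \
  postgres:9.6.18

# 2. Watch the logs until initialization completes
#    ("database system is ready to accept connections"), then:
docker logs -f seed-db
docker stop seed-db
docker commit seed-db my-test-db:latest

# 3. Reference my-test-db:latest in docker-compose instead of postgres:9.6.18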

A completely automated solution that applies the test scripts at docker build time is going to have to understand how to start postgres. You can investigate the problem like this:

# Create a postgres container.
docker create --name postgres postgres:9.6
# Copy the entrypoint script out.
docker cp postgres:/usr/local/bin/docker-entrypoint.sh docker-entrypoint.sh

Now, a modified version of the docker-entrypoint.sh script can be used as a

RUN apply-sql-scripts.sh

in your Dockerfile. It looks potentially complicated.
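
For the record, a rough sketch of what that could look like, untested and assuming the stock entrypoint of that era ends with exec "$@": a patched copy of the entrypoint exits after running the init scripts instead of exec'ing the long-running server, so the RUN step terminates and the loaded data ends up in an image layer.

FROM postgres:9.6.18
# /data lies outside the VOLUME declared by the base image, so files
# written during the RUN step below survive into the image layer.
ENV POSTGRES_PASSWORD=postgres POSTGRES_DB=import PGDATA=/data
COPY docker/admin-db/*.sql /docker-entrypoint-initdb.d/
# Patch a copy of the entrypoint so it stops after the init scripts
# instead of exec'ing postgres, then run it once at build time.
RUN cp /usr/local/bin/docker-entrypoint.sh /tmp/init-entrypoint.sh \
 && sed -i 's/exec "$@"/echo "init done"/' /tmp/init-entrypoint.sh \
 && bash /tmp/init-entrypoint.sh postgres \
 && rm -rf /tmp/init-entrypoint.sh /docker-entrypoint-initdb.d/*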

1 Comment

Thanks for the answer. I really would like it automated, as the solution needs to be portable. I guess I have to dig into the script. I was hoping that someone had already figured this out here. :)

Use volumes instead of copying the data to /docker-entrypoint-initdb.d/ at build time. With volumes, the first time you bring up the container it will load all the data; after that it will just reuse the data that is already loaded (which is what you seem to need). As long as you do not delete the volume, your data will always be there when you restart.

Here is a sample:

  pgdb:
    image: postgres
    restart: always
    container_name: pgdb
    env_file: ./postgres/docker-compose.env
    volumes:
      - ./postgres/postgresDB:/var/lib/postgresql/data
      - ./postgres/postgresInit:/docker-entrypoint-initdb.d
    ports:
      - "5432:5432"

3 Comments

Correct me if I am wrong, but wouldn't that mean that all changes in the data will be persisted? I am looking for a solution where the database container starts off with the same data every time I run it.
I see. Yes, data changes would be persisted, but you could copy the folder that contains the data (./postgres/postgresDB in the example above) the first time you start the container. Then a simple script can reset that folder right before container start-up.
I could do that, but that would not be a very portable solution. I am still convinced that it should be possible to do it during image creation. In fact, I have been working on a solution that loads the data during creation. It's partially working for now, but I hope to dive into it a bit more over the next few days.
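
A minimal sketch of the reset script mentioned in the comments above, following the paths from the compose sample (postgresDB.pristine is a hypothetical snapshot copied from ./postgres/postgresDB after the first clean initialization):

#!/bin/bash
# Restore the bind-mounted data directory from its pristine snapshot,
# so the database starts with the same data every time.
docker-compose stop pgdb
rm -rf ./postgres/postgresDB
cp -a ./postgres/postgresDB.pristine ./postgres/postgresDB
docker-compose up -d pgdb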
