
I have been deploying containers on GCP Compute Engine VMs using Google's Container-Optimized OS, and I am struggling to understand the shutdown behavior of the deployed containers when the host VM is stopped in GCP.

When my containers receive a SIGTERM or SIGINT signal, they perform some cleanup and write files into mounted volumes. I have tested this extensively with docker stop and docker kill -s SIGINT. However, this behavior doesn't occur when I stop the host machine in GCP.
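For context, the save-on-shutdown behavior looks roughly like this (a minimal sketch, not the real lloesche/valheim-server entrypoint; the paths and messages are illustrative assumptions):

```shell
#!/bin/sh
# Sketch of a container entrypoint that traps SIGTERM/SIGINT and writes
# state before exiting -- the behavior `docker stop` relies on.
SAVE_FILE=$(mktemp)   # stands in for a file in the mounted volume
ENTRY=$(mktemp)

# A tiny "entrypoint" that saves state when it receives SIGTERM or SIGINT.
cat > "$ENTRY" <<'EOF'
#!/bin/sh
cleanup() {
    echo "world saved" > "$1"   # stand-in for writing the world file
    exit 0
}
trap 'cleanup "$1"' TERM INT
while :; do sleep 1; done       # stand-in for the long-running server
EOF

# Simulate `docker stop`: start the "server", send SIGTERM, wait for exit.
sh "$ENTRY" "$SAVE_FILE" &
pid=$!
sleep 1
kill -TERM "$pid"
wait "$pid"
cat "$SAVE_FILE"   # -> world saved
```

This is why `docker stop` (which sends SIGTERM, then SIGKILL after a timeout) triggers the cleanup, while an abrupt kill of the daemon would not.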

I'm not entirely sure how to debug this process. I tried attaching to the VM's serial console, but it doesn't seem to have any info pertaining to the container shutdown logic.

Any guidance would be very appreciated! For reference, the image I am deploying is lloesche/valheim-server.


Full reproduction steps:

Create a new "Compute Engine" VM with "Deploy a container image to this VM." I have been using an e2-medium instance with a 20 GB boot disk.

Use the "lloesche/valheim-server" image.

Set the following env variables:

SERVER_NAME: Test
WORLD_NAME: Test
SERVER_PASS: Password # must be at least 5 characters

Add a Directory mount of type "Directory" with "/config" as the mount path and "/home/YOUR_GCP_USERNAME/valheim-server-config" as the host path in "Read/write" mode.

After the container starts up, you should have the image (lloesche/valheim-server) running on the host machine. You should also have a file created at ~/valheim-server-config/worlds/ called Test.fwl.

Now, stopping this container (docker stop) should cause a write to that file. You can verify this by stopping the container and then observing that file's modified date.
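One way to script that check is to compare the file's modification time before and after the stop. This is a self-contained sketch: `touch` stands in for the container's save-on-shutdown write, and in the real test you would run `docker stop <container>` instead, with the actual world file path:

```shell
#!/bin/sh
# Sketch: verify the stop handler rewrote the file by comparing mtimes.
WORLD_FILE=$(mktemp)   # stands in for ~/valheim-server-config/worlds/Test.fwl

before=$(stat -c %Y "$WORLD_FILE")
sleep 1                # mtime has one-second resolution
touch "$WORLD_FILE"    # real test: docker stop <container>
after=$(stat -c %Y "$WORLD_FILE")

[ "$after" -gt "$before" ] && echo "file was rewritten on stop"
```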

However, this write doesn't happen when the host instance is stopped. If you restart the host so the container is running again, then issue a "stop" to the host, the file isn't saved before the container is killed.

  • Can you provide more details on how you tested the shutdown behavior, or any logs that you have? How did you try to test the (let's call it) "GCP shutdown"? Can you provide steps for reproduction? What's your goal here? Commented Feb 17, 2021 at 8:47
  • @Wojtek_B - part of my issue is that I'm not seeing any logs that pertain to the docker on the host during shutdown. I have provided full reproduction steps. Commented Feb 17, 2021 at 15:14
  • I think the issue is that the log level for shutdown is "INFO". The Container OS log level is set lower. I never figured out how to change that persistently, meaning a level that stays the same after rebooting the instance. My comments on this question show how to change the logging level (2nd comment from the end): stackoverflow.com/q/65721133/8016720 Commented Feb 17, 2021 at 19:40
  • Details from the referenced question: Edit the file /etc/stackdriver/logging.config.d/fluentd-lakitu.conf and look for the section "Collects all journal logs with priority >= warning". The PRIORITY values are 0 -> 4. If you add "5" and "6" to the list, the startup scripts are logged in Operations Logging. However, this change is not persistent across reboots. The question now is how to make this change persistent. Commented Feb 17, 2021 at 19:40
  • @JohnHanley - My docker service's logs are typically written fine (I see them in the GCE Logs Explorer). However, the first log I see after shutting down the instance is a "Daemon shutdown complete" log, about a second after the container receives the "stop" command. Commented Feb 17, 2021 at 20:11

3 Answers


I had the same problem and found a workaround (not perfect, but it works for me). Add this as part of your startup script:

mkdir -p /etc/systemd/system/docker.service.d
printf "[Service]\nExecStop=/bin/sh -c 'docker stop \$(docker ps -q)'" > /etc/systemd/system/docker.service.d/override.conf

Usually (and also in this case, for testing) you can edit the override file (which adds your config to the existing config) with sudo systemctl edit docker.service. Unfortunately, the override file is apparently deleted every time the system starts, which is why I persist it via the startup script.
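For reference, the printf line above produces a systemd drop-in equivalent to the following, so that stopping the docker unit at shutdown first issues `docker stop` to every running container:

```ini
[Service]
ExecStop=/bin/sh -c 'docker stop $(docker ps -q)'
```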

Before this approach I tried what Wojtek_B suggested (sorry, my reputation is too low to comment directly), but that did not work. The reason is that the docker daemon gets the termination signal before the shutdown script is processed. Since involving docker in the shutdown script of the Container-Optimized OS fails (or is at least risky), this could be regarded as a bug.


1 Comment

It seems that no logs are output to Logs Explorer indicating that this actually worked; I assume the instance shuts down its logging before the Docker service is stopped. However, I managed to confirm it works correctly by SSHing in and running sudo journalctl -n 100 -u docker.service -f before stopping the instance. Great job!

Expanding on @Michael Korn's answer, which did work for me.

I'd suggest the following full startup script:

#!/bin/bash

# ensure SIGTERM is sent to ALL docker containers if the instance is killed
mkdir -p /etc/systemd/system/docker.service.d
cat <<EOF >/etc/systemd/system/docker.service.d/override.conf
[Service]
ExecStop=/bin/sh -c 'docker ps -q | xargs docker stop --signal TERM --time 60'
EOF
systemctl daemon-reload
systemctl restart docker

The docker systemd unit has already started by the time the startup script is written, so systemd first needs to re-read the configuration for the docker unit (daemon-reload), and then the docker unit needs to be restarted.

Example command if using "Containers on Compute Engine" via create-with-container (untested in this exact minimal form, sorry):

gcloud compute instances create-with-container test \
  --container-image=gcr.io/your-image:latest \
  --create-disk=auto-delete=yes,device-name=test,image-project=cos-cloud,image-family=cos-101-lts,mode=rw,size=10GB,type=pd-balanced \
  --metadata-from-file=startup-script=path/to/startup-script.sh



I went through the logs and found nothing that would point me to a solution.

However, there may be a workaround.

You can use a shutdown script to stop your containers more gracefully before the VM shuts down.

You can provide the script using a gcloud command:

gcloud compute instances create example-instance \
    --metadata-from-file shutdown-script=examples/scripts/install.sh

or using console UI:

In the Cloud Console, specify a shutdown script directly using the shutdown-script metadata key:

  • In the Cloud Console, go to the VM instances page.
  • Click Create instance.
  • On the Create a new instance page, fill in the properties for your instance. For advanced configuration options, expand the Management, security, disks, networking, sole tenancy section.
  • In the Metadata section, fill in shutdown-script as the metadata key.
  • In the Value box, supply the contents of your shutdown script.
  • Click Create to create the instance.

Ultimately, you can create a new issue at Google Issuetracker and explain what behavior you expect.

2 Comments

Thanks for taking a look! I did try this, and it didn't seem as though the host OS was executing the shutdown script (supplied via the metadata key). I was attempting to have it issue a shutdown command to the docker container. I may give it another shot and open an issue on the issue tracker. Thanks again!
Thanks for the feedback. If the shutdown script doesn't work for some reason in your case, report a bug at IssueTracker, since this may affect more users.
