This is a more general case of #860 that is worse, but trickier to reproduce.
After adding a new feature, tailscale, my SSH Agent stopped working. The environment variable SSH_AUTH_SOCK was set, but it pointed to a socket in /tmp that no longer existed. It turns out to be a bad interaction between some features and docker-in-docker.
The interaction is order-dependent. It can be avoided by explicitly ensuring docker-in-docker is installed first (or at least early) by specifying overrideFeatureInstallOrder.
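As a rough sketch of the workaround above, the fragment below prints the devcontainer.json property that forces docker-in-docker to the front of the install order. The function name `print_override_fragment` is made up for illustration, and the feature ID is an assumption: it has to match whatever ID your own devcontainer.json uses under "features".

```shell
#!/bin/sh
# Hypothetical helper: prints a devcontainer.json fragment that lists
# docker-in-docker first in overrideFeatureInstallOrder.
# Assumption: the feature is referenced by the ID shown here; adjust it
# to match the ID under "features" in your own devcontainer.json.
print_override_fragment() {
    cat <<'EOF'
{
  "overrideFeatureInstallOrder": [
    "ghcr.io/devcontainers/features/docker-in-docker"
  ]
}
EOF
}

print_override_fragment
```

Only docker-in-docker is listed here, on the assumption that the remaining features can keep their default order once it goes first.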
This seems to be triggered by how docker-in-docker creates a tmpfs /tmp as part of its entrypoint (features/src/docker-in-docker/install.sh, lines 534 to 537 in 31f99a0):

    # Mount /tmp (conditionally)
    if ! mountpoint -q /tmp; then
        mount -t tmpfs none /tmp
    fi

When docker-in-docker replaces /tmp, it predictably drops anything that was already in /tmp. This is always the case. With the tailscale feature, if tailscale installs first, our sockets are created in the /tmp that the later docker-in-docker entrypoint then replaces. This appears to be a race!
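To confirm the symptom described above from inside the container, a small check along these lines may help. It is only a sketch: the function name `check_auth_sock` and the three status words it prints are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reports whether the agent socket path is usable.
# Takes an optional path argument; falls back to $SSH_AUTH_SOCK.
# Prints "unset" if no path is known, "ok" if the path is an existing
# socket, and "stale" if the path is set but no socket exists there.
check_auth_sock() {
    sock="${1:-${SSH_AUTH_SOCK:-}}"
    if [ -z "$sock" ]; then
        echo "unset"
        return 1
    fi
    if [ -S "$sock" ]; then
        echo "ok"
        return 0
    fi
    echo "stale"
    return 1
}

# Do not abort the surrounding script on a non-zero status.
check_auth_sock || true
```

A "stale" result together with `mountpoint -q /tmp` succeeding would be consistent with the socket having been created in a /tmp that was later hidden by the tmpfs mount.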
I've been able to minimally reproduce this by creating a noop feature that simply sleeps for 30 seconds during its entrypoint. When it runs first, it sleeps for 30 seconds, and then the docker-in-docker entrypoint runs and removes our sockets.
https://github.com/rhettg/dind-feature-bug