
We set up GCE instances using terraform and then use Ansible playbooks to provision them and get our services onto the machines.

I'm running a project in our organisation which needs to pull a docker image from a different project. The images are hosted in a container registry in that other project.

My ideal sequence of events would be:

  1. Create a GCE instance in my project using terraform with a properly configured service account.
  2. Use ansible to install docker on the GCE.
  3. Use the ansible module docker_container to pull the necessary image from the container registry.

This seemingly simple workflow turns out not to be trivial. At first I discovered that just running `docker_container` fails, since docker needs to be authenticated first. Given that I don't want to log in to the machine and set it up with the credential helper etc., the only way I can see is to run the command `docker login -u _json_key -p <jsonkeyfile> https://gcr.io`.
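For concreteness, this is roughly what I mean; the key path and registry host below are placeholders for my setup, and the script defaults to a dry run that only prints the command instead of executing it:

```shell
#!/bin/sh
# Sketch of a non-interactive GCR login. KEY_FILE is a hypothetical
# path to the service-account JSON key; adjust for your machine.
KEY_FILE="${KEY_FILE:-/etc/gcr/key.json}"
REGISTRY="https://eu.gcr.io"

# --password-stdin keeps the key out of the process list,
# unlike passing it with -p.
LOGIN_CMD="docker login -u _json_key --password-stdin $REGISTRY"

# DRY_RUN=1 (the default here) just prints the command so it can be
# inspected before running it on a real instance.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$LOGIN_CMD < $KEY_FILE"
else
  $LOGIN_CMD < "$KEY_FILE"
fi
```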

I can get this to run directly on the command line if I log in to the machine in question, but trying to get it to run using Ansible's `docker_login` is giving me nightmares (see my separate question), so I want to avoid it altogether. The GCE instance is created with a dedicated service account pre-configured during creation (with terraform). All the necessary roles have been granted to the account, since I can log in and pull images if I use the service account key from the command line.
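For context, the kind of playbook I've been attempting looks roughly like this; the module options and file paths are my guesses from the `docker_login` and `docker_container` docs, not a working example:

```yaml
# Rough sketch: log docker in to GCR from Ansible, assuming the
# service-account key is available to the controller. Paths, host
# group and image names are placeholders.
- hosts: gce_instances
  become: true
  tasks:
    - name: Log in to eu.gcr.io with the JSON key
      docker_login:
        registry_url: https://eu.gcr.io
        username: _json_key
        password: "{{ lookup('file', 'files/gcr-key.json') }}"

    - name: Pull and start the image from the other project's registry
      docker_container:
        name: my-service
        image: eu.gcr.io/other-project/my-image:latest
        state: started
```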

What I really expect is that in step 1 above, if I use a Google service account with all the proper credentials, the GCE instance should already be set up to talk to the container registry. Is there a way to make this work purely as part of the startup configuration? I looked into https://cloud.google.com/container-optimized-os/docs/ but I don't want to move to Container-Optimized OS yet (it's Chromium OS based), and besides, I don't even know whether that would be set up out of the box, although the documentation makes it feel that way.
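One idea I've been toying with for "purely startup configuration" is a terraform startup script that logs docker in using an access token fetched from the metadata server, so no key file ever touches the machine. A rough sketch (resource names, zone and image are placeholders, and it assumes the service account is defined elsewhere with the right roles):

```hcl
resource "google_compute_instance" "docker_vm" {
  name         = "docker-vm"
  machine_type = "e2-medium"
  zone         = "europe-west1-b"

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-lts"
    }
  }

  network_interface {
    network = "default"
  }

  service_account {
    email  = google_service_account.puller.email
    scopes = ["cloud-platform"]
  }

  metadata_startup_script = <<-EOT
    apt-get update && apt-get install -y docker.io
    # The attached service account's token comes from the metadata
    # server, so no JSON key is needed on the instance.
    curl -s -H "Metadata-Flavor: Google" \
      "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token" \
      | python3 -c 'import sys, json; print(json.load(sys.stdin)["access_token"])' \
      | docker login -u oauth2accesstoken --password-stdin https://eu.gcr.io
  EOT
}
```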

Is there a way to pre-setup a docker ready GCE instance? If not, has anyone tried out an ansible based workflow using docker login and got it to work?

Devu
  • FYI for anyone interested, there is a google terraform module to prepare the instance for running the container: https://github.com/terraform-google-modules/terraform-google-container-vm – Teghan Nightengale Jun 20 '22 at 02:10

1 Answer


You should take advantage of GCE's native Docker capability (with Container-Optimized OS) instead of manually installing and configuring Docker. This lets the image be pulled and started as part of the boot process. For authentication, you should be able to authorize your Compute Engine default service account to access the image in GCR even from another project. Alternatively, create and use a dedicated service account; see this doc on how to activate it on an instance, and see here about granting IAM roles to service accounts.

To create a Compute Engine instance with a Docker image you can use this gcloud command (I'm not sure whether Ansible has an equivalent):

```
gcloud beta compute instances create-with-container
```
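A fuller invocation might look like the following; the instance name, zone, image and service account are placeholders you would adapt to your projects:

```
gcloud beta compute instances create-with-container my-vm \
  --zone=europe-west1-b \
  --container-image=eu.gcr.io/other-project/my-image:latest \
  --service-account=puller@my-project.iam.gserviceaccount.com \
  --scopes=cloud-platform
```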

NB: Consider posting this type of question on Server Fault instead, as it targets an audience of network and system administrators. Stack Overflow is better suited to the developer community.

Notauser
  • Yes, I've read the resources you mention almost backwards. Like I wrote in the question, I don't want to use Container-Optimised OS at this point because it's a Chromium based image. We're in the habit of using Ubuntu based stuff, and that would lead me into uncharted territory with other things within the company. – Devu Jul 29 '19 at 20:15
  • The GCE instance is created with a dedicated service account pre-configured during creation (with terraform). All the roles have been granted to the account, as I can log in and pull images if I use the service account key from the command line. Added this to the question now. – Devu Jul 29 '19 at 20:18
  • I understand that you are getting authentication errors using the dedicated service account but not when you login to the instance and use command line. I'm thinking about an IAM issue (insufficient access rights). Maybe you could share the errors you're getting at that stage. Also have a look at this doc if not already read, it might be useful for your use case. https://cloud.google.com/compute/docs/containers/deploying-containers – Notauser Jul 29 '19 at 20:40
  • When I use the command line, I use the naked docker login command `docker login -u _json_key -p "$JSON_KEY" https://eu.gcr.io`. That's why it works. This is exactly what I want to avoid doing. I would like a machine, once created, to already be capable of pulling docker images. I've seen that you can do this if you use Kubernetes or opt for a Container-Optimized OS image, but I don't want to be forced into doing either. – Devu Jul 30 '19 at 09:09
  • Have you explored all the options here ? https://cloud.google.com/container-registry/docs/advanced-authentication – Notauser Jul 31 '19 at 18:18
  • Option number 4 is the one I'm trying in the [separate question](https://stackoverflow.com/questions/57260374/docker-login-to-gce-using-ansible-docker-login-and-json-key) I mentioned. I got that to work, so my problem is solved for now. But the point is I don't want to have to do any of this. I expect to get a docker-ready GCE machine on creation itself that can talk seamlessly with the container registry. They let me do this if I'm using Kubernetes (which has a container-ready Ubuntu image) or if I use Container-Optimized OS, but not for a vanilla GCE machine. – Devu Aug 02 '19 at 07:24