
Sidecar containers for docker agents #84

Open
arvindsv opened this issue Dec 13, 2018 · 6 comments

@arvindsv
Contributor

arvindsv commented Dec 13, 2018

This might not be the right place for it (should it be part of docker-swarm elastic agents or k8s maybe?). However, it will do for now.

The idea I want to discuss is the same as the title: have a Docker image which consists of just the GoCD agent, use --volumes-from to bring it into a user-specified Docker image which doesn't have the GoCD agent at all, and run the agent from there.

The obvious benefit is that images don't need to be built/rebuilt with the GoCD agent in them. So, users can use whatever they already have, which might be based on scratch, or compliant with their security protocols, etc.

All of the elastic agent plugins which use docker containers will need to change to allow this. They'll need to be aware of the sidecar container too.

@ketan says: "We'd likely need to set up some bootstrapper to change dir to the actual container before running the agent (which sits on the sidecar container). That way the --volumes-from can remain mounted read-only and, more importantly, pristine. It should be possible to publish just a sidecar image with the JRE + golang bootstrapper for musl and glibc."
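A rough sketch of how that could look (the image name and paths here are hypothetical, not a published image):

# The sidecar image is assumed to declare VOLUME /gocd-agent for its files.
docker create --name gocd-agent-sidecar gocd/agent-sidecar-hypothetical

# The user's own image; the agent files come in read-only and stay pristine.
docker run --rm --volumes-from gocd-agent-sidecar:ro \
  --entrypoint /gocd-agent/bootstrapper <user-image>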

@arvindsv
Contributor Author

There is no way to avoid involving the GoCD agent entirely, since it is what knows how to send logs back securely, etc. It will be needed for that communication, for sure.

@EugenMayer

EugenMayer commented Dec 14, 2018

Over the last 2 years (since I first looked at GoCD and was generally very impressed), I have looked at a lot of CI/CD solutions and their technical implementations regarding agents/Docker (I'm currently using ConcourseCI).

Those CI/CD solutions split into 4 groups (in terms of Docker build integration):
a) No integration: builds can only produce Docker images (no other Docker integration), and run on multi-purpose (non-Docker) agents
b) Third-class citizen: a build can run in a Docker image of your kind, if you make it very special, plus a)
c) First-class citizen: builds can run in any Docker image, plus on multi-purpose (non-Docker) agents
d) Pure/extreme: all builds run in Docker images

I specifically did not pick up any features related to docker-compose or "run this docker stack"; here, "docker integration" means "running builds in a Docker container". I also do not mean Docker Swarm/k8s-based agent scaling (what you call elastic). We are talking about the "docker builder" concept as the Docker integration.

For example, TeamCity/Jenkins would be c), Concourse/CircleCI would be d), and GoCD/Bamboo is b), rarely a).

Back to the topic: all implementations in c) and d) are actually the same; they just differ in the underlying "host" running the docker builder.

In c) we often see a non-Docker host running a Docker engine, plus a special build task along the lines of "run these CLI commands in a Docker image, aka builder". The host is also a build agent itself and can build "locally" too, which is usually the default.

In d) we often see the host being either very slim (Docker engine only) or a Docker container itself. All tasks run in builders; there are no other options.


I call the Docker image we build in the "builder", and the host creating/destroying the builds the "host".

The technical implementations are always the same:

i) the "agent" (host) starts a docker-image and uses volume mounts for input-artifacts (materials), and outputs like logs / artfifacts. So like

docker -v /tmp/<buildID_projectname>/materials: /ci/input -v /tmp/<buildID_projectname>/output:/ci/ouput

II) The "Build" itself is run as a "script", which the user defines either inline in the build or a command ( pre-deployed script in the builder). For the former, the script is just saved on the host and volume-mounted into the docker-container-builder

iii) The build script (often called a task) is run, and all logs are piped back to the host.

busyscript.sh is just a blocking while loop to keep the container up:
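A minimal sketch of such a keep-alive script (contents assumed, not from this thread):

#!/bin/sh
# Block forever so the container stays up and can receive docker exec calls.
while true; do sleep 60; done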

docker run --rm -v .. -v .. -v .. --entrypoint /busyscript.sh <builder-image>
docker exec <builder-container> /tmp/buildscript > /tmp/<buildID_projectname>/logs/build.logs

or using

docker run --rm -v .. -v .. -v .. --entrypoint /tmp/buildscript <builder-image>  > /tmp/<buildID_projectname>/logs/build.logs

iv) If Docker is "a requirement" inside the builder, one is required to use Docker in Docker.
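For illustration, a minimal Docker-in-Docker setup might look like this (container name is just an example):

# The build gets a sibling Docker engine; --privileged is required
# for the nested engine to work.
docker run -d --privileged --name dind-for-build docker:dind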

v) Some also add one or more configurable cache mounts which are not flushed between builds. Usually we are talking about .m2 or .npm and so on, to speed things up.
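A sketch of such a cache mount using named volumes (volume names are illustrative):

# Named volumes persist across builds, so dependency caches are reused.
docker run --rm -v ci-cache-m2:/root/.m2 -v ci-cache-npm:/root/.npm <builder-image>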


And that is about it. If an "agent" is used, it is used on the host, and the agent can then transport whatever is in /tmp/<buildID_projectname> on the host to the "central brain": artifacts, logs, or whatever is needed.

"Users" are often required to design there build-scripts that way, that the copy there artifacts into the /ci/output folder in the end, so that is the folder the agent scans using the "artifacts regexp" and uploads or whatever. Some , often the d) group, parse the artifacts on the builder directly and either use docker cp or docker exec cp <regexp> /ci/output


These concepts are robust and intuitive, and do not require the user to use special Docker images. The d) group removes the entire "requirements" section from the host, including the cleanup job. There is no agent pollution, and a lot more.

Group c) is, in nearly all cases, CI/CD solutions which have been around for a while and are somehow transitioning to Docker-only builds, or which consider the mix of agent-local and Docker-based builds a good solution (why not?) while trying to make Docker-based builds a first-class citizen too. Some of them are trying to transition to d) now (Jenkins, I would say), and so on.


Sorry for the huge post, but I have watched the GoCD people trying to onboard to the c) group for quite a while now (with a crying eye), yet you stick to your build-agent idea so much that you are not able to deliver the "docker builder" experience without making the build image "your own".

In the c) and d) groups, people do not even create their own containers; they pick the official node, ruby, go, php, or jdk images from Docker Hub and run their builds inside. That's the whole point: you start what you need, and sometimes you build your own image based on the above.


It would be so great if you could just adopt the docker run based concept, which is, technically speaking, not complicated at all, and just finish the job.

You could throw away all your docs about how to "create a docker image" (builders) and all those things. It just works.


To go back to the initial issue topic: why use it as a sidecar at all? Use the agent on the host, run the task in the builder, and get all the assets back using volumes.

You will be flogging a dead horse otherwise, since docker swarm does not support --volumes-from (in the v3 compose format), and you will overcomplicate the setup more than needed.

To whoever happened to read all the way to the bottom: sorry for the long post. Just add a sad face and punish me :)

@arvindsv
Contributor Author

@EugenMayer Thank you for taking the time to write a long, informative comment! Obviously, I can't respond point-by-point. :) But, I'll try and answer some questions / comments:

1. Why use a sidecar at all? Why not just docker run -v ...?

GoCD's concepts include: one agent per job (and one job per agent) at a time. The agent is tasked with checking out materials including plugin materials, continuous communication with the server, sending it logs periodically, telling it that the agent is alive, uploading artifacts securely once the job is finished, handling agent-side plugins, etc.

The server uses this communication to show updated logs to the users, to make sure jobs are rescheduled if an agent dies or times out, etc.

As far as I know, just doing a docker run with no agent/worker in sight is not what most CI/CD solutions do, because they also have all of these concerns, such as logs which update periodically, etc. In Concourse, mostly because it was designed more recently, I appreciate the decision to keep things simple, with all plugin-like things (resources, etc.) being similar to everything else (scripts, rather than bespoke plugins). This certainly gives more flexibility.

I'm not a fan of breaking backwards compatibility and I'm looking for a solution which allows plugins, etc. as a part of the build, rather than just scripts. We have had this discussion before, where your opinion is that I shouldn't care about backwards compatibility because it is holding us back. I recognize that opinion but disagree with it. Mostly because I have no choice :/, because causing more work for everyone who uses it seems wrong to me. I think the @EugenMayer's of the world can easily adapt (and kudos to you! I really mean that very respectfully), but different people have different timelines, capabilities and constraints, and in my opinion it is unfair to break compatibility and force everyone to re-do work because I think a way is simpler for me.

2. docker-image and builder pattern scenario in 5 steps:

If, by builder image, you mean: node, ruby, go, php, ... - then the idea is to do that. I don't like the idea of having to add a GoCD agent to an image which already exists, either. However, my question about sidecars is as simple as whether this approach is meaningful:

Your suggestion:
docker run --rm -v .. -v .. -v .. --entrypoint /tmp/buildscript <builder-image>

My suggestion:
docker run --rm -v .. -v .. -v .. -v /tmp/gocd-agent:/gocd-agent --entrypoint /gocd-agent/start <builder-image>

In GoCD's case, the entrypoint is not just a shell script, but a small GoCD agent which connects to the server and gets a job to execute, which it then executes inside the new container, based off a standard image.

Instead of -v, I suggested --volumes-from because it might be easier, but it's not a big deal. In the end, it's a simple volume-mounted directory.
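For illustration, a hypothetical /gocd-agent/start could be little more than this (every path and variable below is an assumption, not an existing file):

#!/bin/sh
# Change into a work dir that lives in the user's image, keeping the
# mounted /gocd-agent volume read-only and pristine, then hand off
# to the agent bootstrapper, pointing it at the GoCD server.
cd /workspace
exec java -jar /gocd-agent/agent-bootstrapper.jar -serverUrl "$GO_SERVER_URL"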

You could throw away all your docs about how to "create a docker image" (builders) and all those things. It just works.

:) Yes. Agree.

3. cache-mount directory:

Yes, I think that should be a part of it too.

Finally:

I understand that in your use case the build script is a simple shell script, as it should be. That allows for ultimate flexibility, can be version-controlled, and is all good! Some of GoCD's modeling ability around a job provides flexibility which makes this a little hard. An example is a job which looks like this:

  1. Task 1: echo something
  2. Task 2: Fetch an artifact
  3. Task 3: Do something with it
  4. Task 4: Plugin task which needs to be called

I'll see if there is a special case where the task is simple, that we can do something about, to make the setup even easier. I think we both have the same ideal of making it easy.

@EugenMayer

Well, I expected the agent to be a tunnel for materials and all kinds of inputs, including the way back for the artifacts. I cannot see a reason why the agent could not operate "locally", as if the task were running locally, while the task actually runs in the Docker builder and streams all the assets back, including the logs, live onto the agent's file system using a host mount.

This seems absolutely transparent and does not break anything. Besides that, I do not intend to break the agent; it should stay where it is for all the people who want to use it the way it has been used in the past.

I want to give the agent a new mode of execution, with pipes back, in a Docker container.

Every single CI/CD solution came from the very same spot GoCD is in right now, and they either had a "transition" time with this kind of "agent with a special run mode", or they still use it as a permanent solution to offer both worlds for every kind of user taste.

So I did not intend to remove anything from the current GoCD capabilities.


But hands down

docker run --rm -v .. -v .. -v .. -v /tmp/gocd-agent:/gocd-agent --entrypoint /gocd-agent/start <builder-image>

that is not a sidecar, though, but a really good idea as well! I would say this could be even less "intrusive" / less work, so I agree, that sounds like a good idea by all means. Be sure to compile the gocd-agent (Go version) with CGO_ENABLED=0 and -tags netgo so it works on every Linux distro (and no networking code is dynamically linked).
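A sketch of that build invocation (the output name and package path are illustrative):

# Statically linked: no cgo, pure-Go network stack, so the binary
# runs unchanged on musl- and glibc-based distros.
CGO_ENABLED=0 go build -tags netgo -o gocd-agent-bootstrapper .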

Other than that, I would really love to see that!


GoCD is just the best solution out there, but "builders" are no longer anything "optional" or "nice to have" for me. We build so many different languages with so many different versions of them, from node 8-11, ruby 1.9-2.4, java 7-10, php 5.6-7.2, golang 1.9-1.11... there is no way I can configure and maintain this matrix for all the projects in some "self-made GoCD agents" or anything else I would need to build myself. The developers prepare the build environment, and that is what they test in locally and build in, both locally and in CI/prod.

I would say there are a lot of pros/cons/concerns about Docker in production, and a lot of opinions one can have.

But I am not sure there is a single con for building CI tasks in a Docker container; that is why the whole sector is moving there (IMHO).

And GoCD's material/pipeline concept combined with that... I am dreaming again, sorry :)

@varshavaradarajan
Contributor

So, this would mean allowing agents to upload artifacts from outside the agent working directory? Currently, there is no validation while trying to configure a job with a build artifact which is outside of the agent working directory. But the job fails with "The rule [/tmp/go-agent.zip] cannot match any resource under [pipelines/up42]".

I thought of mentioning the above because of "We'd likely need to set up some bootstrapper to change dir to the actual container before running the agent". I don't know what that would involve, because all the agent-related files should be in the shared volume, right? Does this involve changing the AGENT_WORK_DIR to be something on the primary container instead of the volume?

What happens when the user that should run the primary container is not the go user that's been created in the gocd agent docker images? Access-wise I mean, in case something needs to be on a shared volume? And if something owned by some other user needs to be uploaded as an artifact to the gocd server?

cc @ketan , @arvindsv

@arvindsv
Contributor Author

arvindsv commented Dec 17, 2018

So, this would mean allowing agents to upload artifacts from outside the agent working directory?

Not necessarily. The agent is running inside the container and can enforce whatever rules it wants to, right?

I thought of mentioning the above because of "We'd likely need to set up some bootstrapper to change dir to the actual container before running the agent". I don't know what that would involve, because all the agent-related files should be in the shared volume, right? Does this involve changing the AGENT_WORK_DIR to be something on the primary container instead of the volume?

Yes, that changing of AGENT_WORK_DIR would be the main change, I feel. Won't know till we try it.
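Something like this, perhaps (the variable value and paths are assumptions):

# Agent files stay on the read-only shared volume; the work dir points
# into the primary container's own filesystem.
docker run --rm --volumes-from gocd-agent-sidecar:ro \
  -e AGENT_WORK_DIR=/home/builder/work \
  --entrypoint /gocd-agent/start <builder-image>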

What happens when the user that should run the primary container is not the go user that's been created in the gocd agent docker images? Access-wise I mean, in case something needs to be on a shared volume? And if something owned by some other user needs to be uploaded as an artifact to the gocd server?

If it is going to be just a docker run -v, then the permissions might not matter much. Whatever user it is running as will do.

Also: @EugenMayer is right. I probably misused the word "sidecar".
