
Getting Docker internal errors when running many concurrent requests (mostly containers that cannot stop or be removed) #9

Open
StKyr opened this issue Jul 6, 2017 · 1 comment

StKyr commented Jul 6, 2017

When executing many concurrent requests (the number varies and depends on the total workload of the submitted code), I get a lot of exceptions thrown by the Docker API.

I have classified these errors into three categories:

  1. Inactive containers that cannot be removed (because they are not stopped)
    The error I am getting is `Unhandled rejection Error: (HTTP code 409) unexpected - You cannot remove a running container XXX. Stop the container before attempting removal or force remove.`, and `docker stats` outputs:
CONTAINER           CPU %               MEM USAGE / LIMIT   MEM %               NET I/O             BLOCK I/O           PIDS
534ad169c325        --                  -- / --             --                  --                  --                  --
6abb6d2812ad        --                  -- / --             --                  --                  --                  --
72612cf971c5        --                  -- / --             --                  --                  --                  --

So it seems some containers are never stopped and therefore cannot be removed, although they do not occupy any resources. (JavaBox, by the way, is not affected, meaning that it still replies with the expected result.) I am not 100% sure whether this bug has performance implications, though.
Note: I am calling `container.stop()` and `container.remove({f: true})` for each container. The `{f: true}` option made things better, but did not eliminate the problem (see the cleanup sketch after this list).

  2. Containers that cannot be stopped and continue to run even after JavaBox replies back
    The error in this case is `Unhandled rejection Error: (HTTP code 500) server error - Cannot stop XXX: Cannot kill container XXX: rpc error: code = 14 desc = grpc: the connection is unavailable`, and the output of `docker stats` is something like:
CONTAINER           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
7d9d77451f8e        9.60%               47.36MiB / 7.682GiB   0.60%               2.59kB / 0B         0B / 32.8kB         16

meaning that those containers are active and continue to execute (this mainly happens when the submitted code contains infinite loops), sometimes consuming extreme amounts of CPU. Of course these slow down the whole system, but JavaBox still replies with the appropriate results.

  3. Connection error exceptions
    The errors that have appeared so far are:
    `Unhandled rejection Error: (HTTP code 500) server error - endpoint with name YYYY already exists in network bridge` (where YYYY is an auto-generated name like wonderful_jones),
    `Unhandled rejection Error: (HTTP code 500) server error - transport is closing`, and
    `Unhandled rejection Error: (HTTP code 500) server error - grpc: the connection is unavailable`.
    These start to appear when there are far more concurrent submissions than in the previous cases, and when they happen JavaBox does not reply back to the client at all (leaving the connection open).
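
For reference, below is a minimal sketch of the stop-then-remove sequence from category 1, assuming a dockerode-style client where `stop()` and `remove()` return promises and errors expose the HTTP status as `statusCode` (JavaBox's actual cleanup code may differ). The `Unhandled rejection` prefix on the errors above suggests the rejections are never caught, so the sketch chains the two calls and handles the expected failure modes; `{force: true}` mirrors the Engine API's `force` parameter.

```javascript
// Hypothetical cleanup helper (not JavaBox's actual code), assuming a
// dockerode Container object whose methods return promises.
function cleanupContainer(container) {
    return container.stop()
        .catch(err => {
            // 304 means the container was already stopped; treat that as
            // success and re-throw anything else (e.g. the grpc 500s above).
            if (err.statusCode !== 304) throw err;
        })
        .then(() => container.remove({ force: true }))
        .catch(err => {
            // Catching here avoids the "Unhandled rejection" noise and lets
            // the caller see which container leaked instead of crashing.
            console.error('cleanup failed for ' + container.id + ':', err.message);
        });
}
```

If the 409s persist even with the calls chained like this, that would point to the daemon-side problems in categories 2 and 3 rather than to the call ordering.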
StKyr commented Jul 6, 2017

Also note: when running many simultaneous submissions and/or a heavy workload, my system's CPU usage tops out at 100%, so maybe my OS is killing the containers (?) or interfering with them in other ways, causing the errors above.
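
One way to test that hypothesis would be to cap each sandbox container's resources at creation time, so that a submission with an infinite loop cannot drive the host to 100% CPU and starve the Docker daemon. This is a hedged diagnostic sketch, again assuming dockerode; the image name and command are placeholders, and the `HostConfig` fields (`Memory`, `NanoCpus`, `PidsLimit`) are standard Docker Engine API options (`NanoCpus` needs Docker 1.13+; `CpuPeriod`/`CpuQuota` do the same job on older daemons), not anything JavaBox currently sets.

```javascript
// Diagnostic sketch only: create the sandbox container with hard resource
// limits so runaway submissions cannot saturate the host.
const Docker = require('dockerode');
const docker = new Docker(); // defaults to /var/run/docker.sock

docker.createContainer({
    Image: 'openjdk:8-jdk-alpine',   // placeholder image
    Cmd: ['java', 'Main'],           // placeholder command
    HostConfig: {
        Memory: 256 * 1024 * 1024,   // 256 MiB hard memory cap
        NanoCpus: 500000000,         // at most half of one CPU core
        PidsLimit: 64                // bounds fork bombs / thread explosions
    }
})
    .then(container => container.start())
    .catch(err => console.error('create/start failed:', err.message));
```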
