
sumologic-collector-docker creates duplicate collectors upon every restart #56

Open
arunderwood opened this issue Nov 4, 2017 · 6 comments


@arunderwood

Currently, every time I docker rm sumologic and docker run sumologic/collector:latest-no-source, the collector re-sends the full contents of all my LocalFile sources to Sumo Logic, creating a bunch of duplicate logs.

Is there a place in the container that tracks which messages have already been synced, and that I could mount as a volume to persist sync state between container instances?

@maimaisie
Collaborator

Hi. The collector keeps internal state to track what has already been collected. That state persists across container restarts but not redeploys, so we recommend not redeploying the collector container unless you need a newer collector version (we don’t release new versions very frequently, though).

Another option is to install and run the collector as a service on your Docker host. Unless you uninstall the service, the state persists through shutdown, restart, and upgrade.
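
To illustrate the restart/redeploy distinction, using the container name from the original report (a sketch, not official guidance):

# State survives a restart of the same container:
docker restart sumologic

# A redeploy creates a fresh container with fresh internal state,
# so previously collected files get re-sent:
docker rm -f sumologic
docker run -d --name sumologic sumologic/collector:latest-no-source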

@arunderwood
Copy link
Author

My goal is to make the collector state persist through redeploys. Is there a specific directory in the container that holds the state, so I can put it on a Docker volume?

@maimaisie
Collaborator

The collector directory is /opt/SumoCollector/ and contains the collector/source configuration and state. However, it has not been officially verified that the collector state persists through redeploys when that directory is placed on a volume.
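
For anyone who wants to experiment with this (unverified, as noted above; the named volume sumo-state is illustrative):

docker run -d --name sumologic \
  -v sumo-state:/opt/SumoCollector \
  sumologic/collector:latest-no-source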

@nhoughto

nhoughto commented May 22, 2018

@arunderwood Did you attempt this, or succeed with it? We are seeing the same behaviour; it's very annoying and should probably be mentioned in the README.

It is especially a problem near Sumo usage limits: the Sumo service itself causes the collector to shut down, which triggers a restart, which re-ships all the local files, causing more usage... around and around.

@arunderwood
Author

No, sorry, I never dug into this. I just stopped using the docker collector.

@arunderwood arunderwood changed the title Where is state kept in the container? sumologic-collector-docker creates duplicate collectors upon every restart May 22, 2018
@ragnarkurm

ragnarkurm commented Oct 1, 2020

Success, got it working (with a couple of downsides).

How it works:

  • I moved the whole /opt/SumoCollector/config directory to persistent storage and symlink it (ln -s) back from the original location.
  • I make sure no parallel execution happens, for example during a deploy; flock works over NFS.

Downsides

  • The hostname. I cannot seem to control it. In my case the collector name is xxx-a149d5664cc7, and the ID (a149d5664cc7) seems to be constant across deployments. I cannot figure out where it comes from or how to control it. I tried changing the blade JSON files and /etc/hostname before the collector starts, but to no avail.
  • Source updates. If you update your sumo-sources.json file, the changes may not be picked up. At a minimum you need to change the setting in the web UI: Manage Data > Collection > the collector > Edit > Advanced > Local Configuration File.
  • flock does not work in every image: it fails on Alpine but works in Debian-based images. One alternative is a database lock, e.g. LOCK TABLES table_name WRITE; a shell-only alternative is sketched right after this list.
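
As a shell-only alternative to flock (a different technique from the database lock mentioned above; it assumes the same /storage/logs persistent mount, and mkdir's guarantees over NFS are weaker than flock's):

# Portable mkdir-based mutex; works on Alpine where flock is unavailable.
lockdir=/storage/logs/sumologic.lock.d
until mkdir "$lockdir" 2>/dev/null; do
  sleep 5   # another instance holds the lock; wait and retry
done
trap 'rmdir "$lockdir"' EXIT   # release the lock when this shell exits
/run.sh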

The Dockerfile:

...
COPY my-entrypoint.sh /
ENTRYPOINT ["/my-entrypoint.sh"]

The my-entrypoint.sh:

#!/bin/bash

# This is a wrapper script around the collector run script.
# 1. It persists state.
# 2. Prevents parallelism by locking.

set -xeuo pipefail
export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin

# Config
sumostate_shared=/storage/logs/sumologic.state
sumostate_conf=/opt/SumoCollector/config
lock=/storage/logs/sumologic.lock

# Initialize the state on persistent storage.
# This is performed only on first execution of the script.
if [[ ! -d "$sumostate_shared" ]]; then
  mv -v "$sumostate_conf" "$sumostate_shared"
fi

# Move the container's own fresh (unused) state out of the way.
# Skip this when config is already the symlink from a previous run,
# otherwise the symlink itself would get moved into config.orig.
if [[ -d "$sumostate_conf" && ! -L "$sumostate_conf" ]]; then
  mv -v "$sumostate_conf" "$sumostate_conf.orig"
fi

# Make the persisted state available to the current container.
# -n stops ln from dereferencing an existing symlink, which would
# otherwise create the new link inside the target directory.
ln -svfn "$sumostate_shared" "$sumostate_conf"

# Make sure we don't run the collector in parallel.
# The collector is not designed to handle a shared state.
# This locking works nicely over NFS or Docker volume.
flock --verbose "$lock" /run.sh
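
A usage sketch (the image tag my-sumo-collector and named volume sumo-storage are illustrative; the script expects persistent storage mounted at /storage/logs, and the base image's /run.sh still needs whatever collector credentials or arguments you normally pass):

docker build -t my-sumo-collector .
docker run -d --name sumologic \
  -v sumo-storage:/storage/logs \
  my-sumo-collector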
