adminaction consumer: failed to write some logs #17

Open
jsandova opened this issue Dec 23, 2020 · 4 comments

@jsandova

I am getting the following error about adminaction unable to write to logs. Any ideas?

2020-12-22 23:19:50 INFO Starting DuoLogSync
2020-12-22 23:19:50 INFO DuoLogSync: Opening connection to ls01-dev-qa.aofk.net:2514
2020-12-22 23:19:50 INFO duo_client Admin initialized for ikey: *******, host: api-**.duosecurity.com
2020-12-22 23:19:50 ERROR Could not read checkpoint file for adminaction logs, consuming logs from {log_offset} timestamp
2020-12-22 23:19:50 ERROR Could not read checkpoint file for auth logs, consuming logs from {log_offset} timestamp
2020-12-22 23:19:50 INFO adminaction producer: fetching next logs after 120 seconds
2020-12-22 23:19:50 INFO adminaction consumer: waiting for logs
2020-12-22 23:19:50 INFO auth producer: fetching next logs after 120 seconds
2020-12-22 23:19:50 INFO auth consumer: waiting for logs
2020-12-22 23:21:50 INFO adminaction producer: fetching logs
2020-12-22 23:21:50 INFO auth producer: fetching logs
Traceback (most recent call last):
2020-12-22 23:21:50 INFO adminaction producer: adding 57 logs to the queue
2020-12-22 23:21:50 INFO adminaction producer: added 57 logs to the queue
2020-12-22 23:21:50 INFO adminaction producer: fetching next logs after 120 seconds
2020-12-22 23:21:50 INFO adminaction consumer: received 57 logs from producer
2020-12-22 23:21:50 INFO adminaction consumer: writing logs
2020-12-22 23:21:50 WARNING adminaction consumer: failed to write some logs
File "/usr/local/lib/python3.6/dist-packages/duologsync-2.0.0-py3.6.egg/duologsync/consumer/consumer.py", line 66, in consume
File "/usr/local/lib/python3.6/dist-packages/duologsync-2.0.0-py3.6.egg/duologsync/writer.py", line 97, in write
File "/usr/lib/python3.6/asyncio/streams.py", line 329, in drain
raise exc
File "/usr/lib/python3.6/asyncio/selector_events.py", line 714, in _read_ready
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/bin/duologsync", line 11, in <module>
load_entry_point('duologsync==2.0.0', 'console_scripts', 'duologsync')()
File "/usr/local/lib/python3.6/dist-packages/duologsync-2.0.0-py3.6.egg/duologsync/app.py", line 78, in main
File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/local/lib/python3.6/dist-packages/duologsync-2.0.0-py3.6.egg/duologsync/consumer/consumer.py", line 88, in consume
File "/usr/local/lib/python3.6/dist-packages/duologsync-2.0.0-py3.6.egg/duologsync/producer/producer.py", line 205, in get_log_offset
TypeError: 'NoneType' object is not subscriptable

@jsandova
Author

jsandova commented Dec 23, 2020

Here is my config.yml file.

version: '1.0.0'
dls_settings:
  log_format: 'JSON'
  api:
    offset: 1
  checkpointing:
    enabled: True
    directory: '/var/log/duo-logs'
servers:
  - id: 'duo-logging'
    hostname: '10.176.18.45'
    port: 2514
    protocol: 'TCP'
account:
  ikey: ''
  skey: ''
  hostname: 'api-***.duosecurity.com'
  endpoint_server_mappings:
    - endpoints: ['adminaction', 'auth']
      server: 'duo-logging'
  is_msp: False

@jsandova
Author

I was able to get it working by switching to UDP and using Fluentd to forward the logs to our Datadog logging console.

@rka

rka commented Jan 4, 2021

I had a similar issue and also solved it by using UDP. Really annoying because we would like to use TCP :/

@skikd636

I get the same failure. The code does not handle a TCP connection reset at all, and it doesn't even shut down cleanly. The second part of the traceback, the TypeError about the None object, is a consequence of the unhandled ConnectionResetError. The finally statement in the code (line 92 of consumer.py) tries to access the last_log_written variable, which was set to None and never gets assigned because the writer.write function throws the exception first. Hence the TypeError for the None object.
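To make that failure chain concrete, here is a minimal sketch of the pattern described above. It is not the actual duologsync code: FlakyWriter, GoodWriter, consume_unguarded, consume_guarded and the "timestamp" key are all hypothetical names made up for illustration. Only last_log_written, get_log_offset and the two exception types come from the traceback.

```python
import asyncio


class FlakyWriter:
    """Hypothetical stand-in for the real writer: fails like a reset TCP peer."""

    async def write(self, logs):
        raise ConnectionResetError(104, "Connection reset by peer")


class GoodWriter:
    """Hypothetical writer that succeeds and returns the last log written."""

    async def write(self, logs):
        return logs[-1]


def get_log_offset(last_log_written):
    # Subscripting None here raises the TypeError seen in the traceback.
    return last_log_written["timestamp"]


async def consume_unguarded(writer, logs):
    last_log_written = None
    try:
        last_log_written = await writer.write(logs)
    finally:
        # Runs even when write() raised, so last_log_written is still None
        # and the TypeError is raised while handling the ConnectionResetError.
        offset = get_log_offset(last_log_written)
    return offset


async def consume_guarded(writer, logs):
    last_log_written = None
    offset = None
    try:
        last_log_written = await writer.write(logs)
    except ConnectionResetError:
        pass  # assumption: a real fix would reconnect and retry here
    finally:
        # Only compute an offset when something was actually written.
        if last_log_written is not None:
            offset = get_log_offset(last_log_written)
    return offset
```

In the unguarded version the ConnectionResetError is hidden behind a TypeError, which matches the "During handling of the above exception, another exception occurred" output. The guarded version simply returns None on a reset, leaving the reconnect logic as an open question.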

Anyway, because I MUST use a TCP connection into our logging infrastructure, I got around this by starting duologsync from a bash script with an infinite loop. I'll share the code below. It's a workaround, not a fix, but it works since the code runs fine other than not handling the connection resets. I didn't want to dig into the code to try and fix the actual issue.

Here's my simple workaround. The only issue with it is that if duologsync is crashing for another reason, it will keep getting restarted, so you will need to monitor for that some other way. This worked for me once I got duologsync running successfully by itself; just thought I'd share.

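#!/bin/bash

set -o nounset

# PID of the running duologsync process; empty until the first start,
# so the handler is safe under nounset if a signal arrives early.
child=""

# Forward termination to the child, then leave the restart loop.
_term() {
  echo "Caught termination signal!"
  if [ -n "$child" ]; then
    kill -TERM "$child" 2>/dev/null
  fi
  exit
}

# SIGKILL cannot be trapped, so only SIGTERM and SIGINT are handled.
trap _term SIGTERM SIGINT

while true
do
    echo "duologsync process starting..."
    duologsync <path to>/config.yml &
    child=$!
    wait "$child"
    exitcode=$?
    echo "Process ended with exit code ${exitcode}"
    echo "Restarting..."
    sleep 120
done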
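
For anyone worried about the "keeps getting restarted" caveat, the same loop can be sketched in Python with a simple give-up rule. This is only a rough sketch under my own assumptions: the supervise function, the min_uptime and max_fast_failures parameters and their default values are all made up for illustration, and unlike the bash script it does not forward SIGTERM to the child.

```python
import subprocess
import time


def supervise(cmd, restart_delay=120.0, min_uptime=60.0, max_fast_failures=5):
    """Restart cmd whenever it exits; stop after too many quick exits in a row.

    Returns the exit code of the last run.
    """
    fast_failures = 0
    while True:
        started = time.monotonic()
        exit_code = subprocess.call(cmd)  # blocks until the process exits
        uptime = time.monotonic() - started
        print(f"Process ended with exit code {exit_code} after {uptime:.0f}s")

        # An exit shortly after start suggests a problem that a restart
        # will not cure (bad config, bad credentials), not a one-off reset.
        if uptime < min_uptime:
            fast_failures += 1
        else:
            fast_failures = 0

        if fast_failures >= max_fast_failures:
            print("Too many quick exits in a row, giving up")
            return exit_code

        print("Restarting...")
        time.sleep(restart_delay)
```

It would be called as supervise(["duologsync", "<path to>/config.yml"]), with the same placeholder path as in the bash script. The thresholds are guesses and would need tuning against how often the TCP peer actually resets the connection.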


3 participants