option to exit on error instead of retrying #14

Open
disconn3ct opened this issue Jul 23, 2022 · 4 comments
Comments

@disconn3ct

I'm running a containerized version under k8s and having a problem after device resets. It can lose connection to the underlying serial device and hang on a repeated device-not-found error. (It still answers TCP connections, which prevents the health checks from detecting the problem.)

I think an option to simply exit on errors, instead of retrying, would add a great deal of operational flexibility without affecting existing users. (This would also make it easier to work around broken hardware; for example, I've seen cheap tty adapters that require a USB reset between attempts. If ser2sock exits on error, a simple wrapper could detect the wedged hardware and reset or even power-cycle it before restarting ser2sock.)

Unfortunately everything is working this morning so I don't have log examples, but I'll keep an eye out and add them if needed.

@f34rdotcom
Contributor

The eventual lockup while waiting for the serial device to return is not good; that needs to just work. Outside of that, an option to die on specific errors seems like a simple switch to add: just inject a test for the flag and exit at specific points. It sounds like you are not using the -c switch, which should keep connections out until the serial device returns. The purpose of this switch was to avoid the situation you describe, except for the lockup problem.

@disconn3ct
Author

I thought -c specifically allowed connections even if the serial is missing.
-c keep incoming connections when a serial device is disconnected

Looking at the code it looks like it does what it says: if the serial is connected OR that flag is set, continue as if the serial is connected. That is the behavior I am already seeing. It also looks like that flag is not parsed until after the accept call on #1030. That accept is all that the health check is looking for, so it passes before the flag is used.
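
To make the point above concrete, here is a minimal sketch of the decision as described in this comment. It is not copied from ser2sock; `decide_client`, `client_action`, and the parameter names are hypothetical. The important detail is that this check can only run after accept() has returned, so a health check that only opens a TCP connection has already succeeded by the time the client is refused.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the ser2sock source. */
typedef enum { CLIENT_KEEP, CLIENT_REFUSE } client_action;

/* Decide what to do with a client that accept() has already returned.
 * keep_connections_flag stands in for the -c option. */
static client_action decide_client(bool serial_connected,
                                   bool keep_connections_flag)
{
    if (serial_connected || keep_connections_flag)
        return CLIENT_KEEP;

    /* The socket gets closed, but the TCP handshake already completed,
     * so a connect-only probe still sees a healthy service. */
    return CLIENT_REFUSE;
}
```

With the flag unset and the serial device missing this yields `CLIENT_REFUSE`, which would correspond to the "Socket refused because serial is not connected" state.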

The dumb-hardware lockup isn't specifically a ser2sock problem; it is related to the containerized environment. On a normal host you might expect the system to try to recover (e.g., a bus reset), so waiting makes sense. In the container, once it gets wedged it won't get fixed without restarting the container. If the health check works, that should be sufficient to cause a reset.

@disconn3ct
Author

disconn3ct commented Jul 28, 2022

Edit to add initial error. The first few lines are the k8s health check simply connecting and then dropping. (That happens fairly constantly without causing issues.) After that, the adapter resets (or crashes? or..?) and ser2sock transitions to error mode (accept then close) which still qualifies as healthy due to the accept():

[✔] Socket connected slot 4
[‼] Closing socket fd slot 3 errno: 0 'No error information'
[‼] Closing socket fd slot 4 errno: 0 'No error information'
[✘] Serial disconnected on write. errno: 5 'I/O error'
[✘] Error can not open com port at /dev/ttyUSB0 errno: 6 'No such device or address'
[‼] Socket refused because serial is not connected
[‼] Socket refused because serial is not connected
[✘] Error can not open com port at /dev/ttyUSB0 errno: 6 'No such device or address'

(repeating)
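
Since the accept-then-close state still passes a connect-only check, one possible workaround is a probe that also watches for an immediate close after connecting. The sketch below is only an assumption about how such a probe could be written; `peer_closed_within` is a hypothetical helper, and it assumes a healthy ser2sock leaves the connection open for at least the timeout.

```c
#include <poll.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: returns true if the peer closes (or resets) the
 * already-connected socket fd within timeout_ms milliseconds. */
static bool peer_closed_within(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return false;   /* timeout or poll error: no close observed */

    /* Peek so that any real serial data is left in the socket buffer. */
    char byte;
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK);

    return n <= 0;      /* 0 means EOF; negative means an error such as a reset */
}
```

A probe built on this would connect, call `peer_closed_within(fd, some_timeout)`, and report unhealthy when it returns true. The right timeout depends on how quickly the refusal happens, which I have not measured.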

@disconn3ct
Author

For anyone else who hits this, turning line 572 from log_message() to error() and exiting instead of returning solves it. (It looks like that is a fatal error on startup, but not while running.)
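
For reference, the opt-in version suggested earlier in the thread (test a flag and exit at specific points) could look roughly like the sketch below. This is not the actual ser2sock code: `exit_on_serial_error` and `on_serial_open_failure` are hypothetical names, and the message only mimics the log format shown above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option; off by default so existing users keep the
 * current retry behavior. */
static bool exit_on_serial_error = false;

/* Called when opening the serial device fails while running.
 * Returns -1 so the caller can retry later, unless the option is set,
 * in which case the process exits and a supervisor (k8s, a wrapper
 * script) can reset the hardware and restart it. */
static int on_serial_open_failure(const char *device, int err)
{
    fprintf(stderr, "Error can not open com port at %s errno: %d '%s'\n",
            device, err, strerror(err));

    if (exit_on_serial_error)
        exit(EXIT_FAILURE);

    return -1;
}
```

Keeping the default at "retry" means nothing changes unless the option is set, while a container can opt in and let its restart policy handle recovery.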
