Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

configurable abort on fatal errors #4943

Merged
merged 5 commits into from
Aug 19, 2024
Merged

Conversation

elliefm
Copy link
Contributor

@elliefm elliefm commented Jun 24, 2024

When a Cyrus process detects a fatal error, it logs an error message and shuts down with a non-zero exit code. This PR adds a new imapd.conf option fatals_abort, which when enabled, will cause Cyrus to call abort() instead of exiting non-zero. Depending on system configuration, this will usually result in a core dump.

Fastmail:
This will conflict with the "Fastmail ONLY - make assertion failures and fatal errors into coredumps" commit on our capstone branch. To deploy this change at Fastmail, I think we need to do something like:

  1. Remove that commit from the capstone branch. New builds will lack the core dump on fatal error behaviour.
  2. Label this PR include-in-fastmail. New builds will have the configurable behaviour, but it won't be enabled.
  3. Deploy new builds across our environment.
  4. Deploy an imapd.conf change to add fatals_abort: yes. We start getting core dumps on fatal errors again.

If we deploy the imapd.conf update before our Cyrus builds include this PR, we might get errors about the unrecognised option. So I think it's better to get the code deployed first, and then turn it on afterwards, though this leaves a window where we won't get core dumps.

We should not land this PR on the master branch until our own deployment process is finished, so that we can use include-in-fastmail to control the timing.

@elliefm elliefm force-pushed the v311/fatals-abort branch 2 times, most recently from 2b4ae28 to 1c9ad1c Compare June 26, 2024 01:44
@elliefm elliefm changed the title V311/fatals abort configurable abort on fatal errors Jul 22, 2024
@elliefm elliefm marked this pull request as ready for review July 22, 2024 01:00
Copy link
Member

@rsto rsto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to produce a core dump for failed assertions. Is that on purpose?

@elliefm
Copy link
Contributor Author

elliefm commented Jul 22, 2024

This does not seem to produce a core dump for failed assertions. Is that on purpose?

It ought to, assertionfailed() in lib/assert.h has the update to call abort(). Though even if it didn't have that update, it calls fatal() and all the fatals now have the abort anyway.

Where are you seeing it not dump core? Perhaps I should add a test for assertion failures too.

@rsto
Copy link
Member

rsto commented Jul 22, 2024

Sorry, my error. I must have overlooked the changes in lib/assert.h during code review.

Copy link
Member

@brong brong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks for moving it from a local patch to a useful configurable option.

@elliefm elliefm merged commit 352f9a0 into cyrusimap:master Aug 19, 2024
1 check passed
@elliefm elliefm deleted the v311/fatals-abort branch August 19, 2024 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants