Clean up transaction management for file_complete handler #930
Conversation
Force-pushed from 70a6702 to c2fb610
Review threads on servicex_app/servicex_app/resources/internal/transformer_file_complete.py (two threads, both resolved; one marked outdated)
Force-pushed from c2fb610 to 4a4ca9a
The Celery transaction model settings that guarantee we never lose a file could result in duplicate file reports. Handle this by adding a unique key to the `transformation_result` table: attempt to insert the record and just report a warning if that fails, but don't increment the file counter.
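A minimal sketch of how such a guard could look with Flask-SQLAlchemy. The model, column names, and helper below are hypothetical illustrations of the technique, not the actual servicex_app schema:

```python
import logging

from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.exc import IntegrityError

db = SQLAlchemy()
logger = logging.getLogger(__name__)


class TransformationResult(db.Model):
    __tablename__ = "transformation_result"

    id = db.Column(db.Integer, primary_key=True)
    request_id = db.Column(db.String(48), nullable=False)
    file_path = db.Column(db.String(512), nullable=False)
    status = db.Column(db.String(16), nullable=False)

    # One row per (request, file): a redelivered Celery message cannot
    # create a second report for the same file.
    __table_args__ = (
        db.UniqueConstraint("request_id", "file_path", name="uq_request_file"),
    )


def record_result(request_id: str, file_path: str, status: str) -> bool:
    """Insert a per-file result; return False if this file was already reported."""
    try:
        db.session.add(
            TransformationResult(request_id=request_id, file_path=file_path, status=status)
        )
        db.session.commit()
        return True  # caller increments the file counter only in this case
    except IntegrityError:
        db.session.rollback()
        logger.warning("Duplicate report for %s / %s ignored", request_id, file_path)
        return False
```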
Force-pushed from 8ca7037 to 9ad4732
```diff
@@ -456,7 +463,7 @@ def init(args: Union[Namespace, SimpleNamespace], app: Celery) -> None:
     "--without-mingle",
     "--without-gossip",
     "--without-heartbeat",
-    "--loglevel=warning",
+    "--loglevel=info",
```
Won't this bring back a lot of uninteresting verbosity?
The better to debug with until we are certain this is fixed. I suppose it could be a Helm chart setting, but that's a bit of a chore to work through.
Problem
The `TransformerFileComplete` resource handler is the most critical code in the entire stack. It responds to each file in the dataset being transformed and is responsible for updating the total number of files processed (either successfully or with failure); these two counters are how we determine whether the transform is complete. The endpoint is hit repeatedly by all of the running transformers, so database transaction handling is very important to avoid missing files.
The current implementation uses implicit transactions and doesn't manage locks or flushing to the DB. It's possible that this allows files to be lost during big transform requests.
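To illustrate the risk, here is a hedged sketch of the read-modify-write pattern that can lose an update when two handlers race; `TransformRequest` and its counters are hypothetical stand-ins, not the actual servicex_app model:

```python
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


class TransformRequest(Base):  # hypothetical stand-in for the real request record
    __tablename__ = "requests"

    request_id: Mapped[str] = mapped_column(primary_key=True)
    files_completed: Mapped[int] = mapped_column(default=0)
    files_failed: Mapped[int] = mapped_column(default=0)


def file_complete_unlocked(session: Session, request_id: str, succeeded: bool) -> None:
    """The risky pattern: plain read, in-Python increment, commit."""
    req = session.get(TransformRequest, request_id)  # no lock taken on the row
    # Two handlers can read the same counter value here; whichever commits last
    # silently overwrites the other's increment, and a completed file is "lost".
    if succeeded:
        req.files_completed += 1
    else:
        req.files_failed += 1
    session.commit()
```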
Approach
- Changed `record_file_complete` to read the request with the `with_for_update` flag set, which will lock the record in the db (see the first sketch after this list).
- The retry call arguments are captured in a single, new decorator, `file_complete_ops_retry`.
- With this decorator, I ran into a problem with unit tests: importing the module caused the `current_app.logger` expression to be evaluated, which would throw `RuntimeError: Working outside of application context` in the unit tests. Worked around this in the decorator by only accessing that logger if we are inside the Flask app context (see the second sketch after this list).