To handle interrupted spot instances and other system-level exceptions, we need a version of @retry that lets non-retriable user errors go through.
The example below does the trick for locally scheduled runs but not on production runs on Argo/SFN/Airflow:
```python
import sys
import time
import traceback
from functools import wraps

from metaflow import FlowSpec, step, retry
from metaflow.exception import METAFLOW_EXIT_DISALLOW_RETRY


def platform_retry(f):
    @wraps(f)
    def wrapper(self):
        try:
            f(self)
        except:
            traceback.print_exc()
            sys.exit(METAFLOW_EXIT_DISALLOW_RETRY)

    return retry(wrapper)


class PlatformRetryFlow(FlowSpec):

    @platform_retry
    @step
    def start(self):
        time.sleep(10)
        print('fail', 1 / 0)
        self.next(self.end)

    @platform_retry
    @step
    def end(self):
        print("done!")


if __name__ == '__main__':
    PlatformRetryFlow()
```
We could implement this pattern as an option in @retry, e.g. @retry(only_system=True).
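To make the intended semantics concrete, here is a minimal sketch in plain Python. It is not Metaflow code: `retry_only_system` and `SystemLevelError` are hypothetical names invented for illustration, and only the `only_system` idea comes from the proposal above. In a real deployment a reclaimed spot instance usually kills the process outright, so the distinction would likely have to be made by the scheduler rather than in-process; the sketch only shows the behavior we are after.

```python
import functools


class SystemLevelError(Exception):
    """Hypothetical stand-in for a platform failure (e.g. a reclaimed spot instance)."""


def retry_only_system(times=3, system_errors=(SystemLevelError,)):
    """Hypothetical decorator: retry on system-level errors, fail fast on user errors."""

    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            # One initial attempt plus up to `times` retries.
            for attempt in range(times + 1):
                try:
                    return f(*args, **kwargs)
                except system_errors:
                    # System-level failure: retry unless the budget is exhausted.
                    if attempt == times:
                        raise
                # Any other exception (a user error) is not caught here,
                # so it propagates immediately without a retry.

        return wrapper

    return decorator


calls = {"flaky": 0, "buggy": 0}


@retry_only_system(times=3)
def flaky():
    # Simulates two platform failures followed by a success.
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise SystemLevelError("node preempted")
    return "ok"


@retry_only_system(times=3)
def buggy():
    # Simulates a user error that should never be retried.
    calls["buggy"] += 1
    return 1 / 0
```

With these definitions, `flaky()` succeeds on its third attempt, while `buggy()` raises `ZeroDivisionError` after a single call, which is the behavior the `only_system=True` option would need to provide on Argo/SFN/Airflow as well.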