ValueError: backend_options should be an instance of coiled.BackendOptions #756
Comments
@crusaderky we can improve the error message here, but to immediately tell you what's wrong I'd need some logging or a minimal reproducer what's |
I was seeing this over in #752 yesterday, but now no longer am (at least the latest CI run passed). Was this fixed on the |
I'm not aware of any relevant coiled-side changes. |
These are the kwargs:
package_sync: true
wait_for_workers: true
scheduler_vm_types: [m6i.large]
backend_options:
  send_prometheus_metrics: true
  spot: true
  spot_on_demand_fallback: true
  multizone: true
n_workers: 10
worker_vm_types: [m6i.large] # 2CPU, 8GiB |
Thanks. We'll try to get to this either way (especially if it recurs). But a minimal reproducer (small Python code I can run) would have us on it quicker. |
@crusaderky have you seen this error recently? If not, thoughts on closing for now? We can always re-open if needed |
Reproduced again today. It's 100% deterministic. |
Can you print the |
|
This is YAML; I specifically need some Python type information.
Here it is: Works: ubuntu-latest-AB_baseline-1 In each artifact zip you will find
>>> pickle.load(open("ubuntu-latest-AB_py310-1/cluster_kwargs.small_cluster.array.pickle", "rb"))
{'name': 'array-8feeff47',
 'environ': {'DASK_COILED__TOKEN': "edit: apologies for leaking this"},
 'tags': {'GITHUB_JOB': 'tests',
          'GITHUB_REF': 'refs/heads/guido/AB_crash',
          'GITHUB_RUN_ATTEMPT': '1',
          'GITHUB_RUN_ID': '4700058695',
          'GITHUB_RUN_NUMBER': '881',
          'GITHUB_SHA': 'b898ebb29464ff45a770f2c6b7f821558e7f1ca6'},
 'package_sync': True,
 'wait_for_workers': True,
 'scheduler_vm_types': ['m6i.large'],
 'backend_options': {'send_prometheus_metrics': True,
                     'spot': True,
                     'spot_on_demand_fallback': True,
                     'multizone': True},
 'n_workers': 10,
 'worker_vm_types': ['m6i.large']}
[EDIT] apologies for leaking the token. It's still present in the artifacts so I'm afraid it will need to be regenerated. |
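For anyone hitting the same request for Python type information: a YAML dump drops the concrete Python types, so one option is to walk the unpickled kwargs and print the fully qualified type of every nested value. The sketch below is only an illustration. `describe_types` is a hypothetical helper (not part of coiled or the test suite), and the `kwargs` dict is a trimmed stand-in for the real artifact, with the token and tags left out.

```python
import pickle
from typing import Any


def describe_types(obj: Any, path: str = "kwargs") -> list[str]:
    """Return one 'path: module.qualname' line per nested value.

    Hypothetical helper for illustration; not part of coiled.
    """
    cls = type(obj)
    lines = [f"{path}: {cls.__module__}.{cls.__qualname__}"]
    if isinstance(obj, dict):
        for key, value in obj.items():
            lines.extend(describe_types(value, f"{path}[{key!r}]"))
    elif isinstance(obj, (list, tuple)):
        for i, value in enumerate(obj):
            lines.extend(describe_types(value, f"{path}[{i}]"))
    return lines


# Stand-in for the pickled kwargs shown above (token and tags omitted).
kwargs = {
    "package_sync": True,
    "backend_options": {"spot": True, "multizone": True},
    "n_workers": 10,
    "worker_vm_types": ["m6i.large"],
}

# Round-trip through pickle to mimic loading the CI artifact.
restored = pickle.loads(pickle.dumps(kwargs))
report = describe_types(restored)
print("\n".join(report))
```

With a real artifact, `restored` would instead come from `pickle.load` on the `cluster_kwargs.*.pickle` file, and the line for `kwargs['backend_options']` would show whether it is a plain `dict` or some other mapping type.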
@crusaderky just a heads up, with @ntabris we just regenerated and changed the token in the secrets. If this is needed again due to the artifacts let us know. |
I tried those kwargs and it worked fine. Does the error happen consistently or sporadically? |
It's reproducible 100% of the time. |
Closed by #793 |
I'm failing to run any kind of A/B tests on Python 3.10 or 3.11.
coiled.Cluster fails with ValueError: backend_options should be an instance of coiled.BackendOptions,
e.g. https://github.com/coiled/coiled-runtime/actions/runs/4574323397
This only happens on A/B tests, and only on Python 3.10 and 3.11.
Regular PR/overnight tests on 3.10, jupyter notebooks on 3.10, and 3.8/3.9 A/B tests are fine.
Everything uses coiled-0.5.9.
CC @fjetter @shughes-uk @ntabris