Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_nwb2asset_remote_asset timed out in test (ubuntu-latest, 3.8, dandi-api) #1450

Open
yarikoptic opened this issue Jun 6, 2024 · 7 comments
Labels
tests Add or improve existing tests

Comments

@yarikoptic
Copy link
Member

https://github.com/dandi/dandi-cli/actions/runs/9395898413/job/25876096821

_________________________ test_nwb2asset_remote_asset __________________________
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fscacher/cache.py:133: in fingerprinter
    path = op.realpath(path_orig)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/posixpath.py:390: in realpath
    filename = os.fspath(filename)
E   TypeError: expected str, bytes or os.PathLike object, not RemoteReadableAsset

During handling of the above exception, another exception occurred:
dandi/tests/test_metadata.py:889: in test_nwb2asset_remote_asset
    assert nwb2asset(r, digest=digest) == BareAsset.model_construct(
dandi/metadata/nwb.py:149: in nwb2asset
    metadata = get_metadata(nwb_path)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fscacher/cache.py:138: in fingerprinter
    return f(*args, **kwargs)
dandi/metadata/nwb.py:106: in get_metadata
    meta.update(_get_pynwb_metadata(r))
dandi/pynwb_utils.py:206: in _get_pynwb_metadata
    with open_readable(path) as fp, h5py.File(fp, "r") as h5, NWBHDF5IO(
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/h5py/_hl/files.py:562: in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/h5py/_hl/files.py:235: in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
h5py/_objects.pyx:54: in h5py._objects.with_phil.wrapper
    ???
h5py/_objects.pyx:55: in h5py._objects.with_phil.wrapper
    ???
h5py/h5f.pyx:102: in h5py.h5f.open
    ???
h5py/h5fd.pyx:163: in h5py.h5fd.H5FD_fileobj_read
    ???
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fsspec/spec.py:1915: in readinto
    data = self.read(out.nbytes)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fsspec/implementations/http.py:598: in read
    return super().read(length)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fsspec/spec.py:1897: in read
    out = self.cache._fetch(self.loc, self.loc + length)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fsspec/caching.py:481: in _fetch
    self.cache = self.fetcher(start, bend)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fsspec/asyn.py:118: in wrapper
    return sync(self.loop, func, *args, **kwargs)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/fsspec/asyn.py:91: in sync
    if event.wait(1):
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/threading.py:558: in wait
    signaled = self._cond.wait(timeout)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/threading.py:306: in wait
    gotit = waiter.acquire(True, timeout)
E   Failed: Timeout >300.0s
------------------------------ Captured log setup ------------------------------
DEBUG    dandi:dandiapi.py:197 POST http://localhost:8000/api/dandisets/
DEBUG    dandi:dandiapi.py:261 Response: 200
DEBUG    dandi:dandiapi.py:197 GET http://localhost:8000/api/info/
DEBUG    dandi:dandiapi.py:261 Response: 200
DEBUG    dandi:dandiapi.py:493 Using api key from DANDI_API_KEY environment variable
DEBUG    dandi:dandiapi.py:197 GET http://localhost:8000/api/auth/token
DEBUG    dandi:dandiapi.py:261 Response: 200
DEBUG    dandi:dandiset.py:128 Found identifier 000112 in top level 'identifier'
INFO     dandi:upload.py:214 Found 2 files to consider
DEBUG    dandi:dandiapi.py:197 GET http://localhost:8000/api/dandisets/000112/versions/draft/assets/
DEBUG    dandi:dandiapi.py:261 Response: 200
DEBUG    dandi:bases.py:342 Calculated dandi-etag of 2dfea6d6ebb6018f3a5c59707bc69ed3-1 for /tmp/pytest-of-runner/pytest-0/dandiset107/sub-mouse001/sub-mouse001.nwb
DEBUG    dandi:bases.py:351 sub-mouse001/sub-mouse001.nwb: Beginning upload
DEBUG    dandi:dandiapi.py:197 POST http://localhost:8000/api/uploads/initialize/
DEBUG    dandi:dandiapi.py:261 Response: 409
DEBUG    dandi:dandiapi.py:268 Error 409 while sending POST request to http://localhost:8000/api/uploads/initialize/: "Blob already exists."
DEBUG    dandi:bases.py:367 sub-mouse001/sub-mouse001.nwb: Blob already exists on server
DEBUG    dandi:bases.py:446 sub-mouse001/sub-mouse001.nwb: Assigning asset blob to dandiset & version
DEBUG    dandi:dandiapi.py:197 POST http://localhost:8000/api/dandisets/000112/versions/draft/assets/
DEBUG    dandi:dandiapi.py:261 Response: 200
INFO     dandi:bases.py:460 sub-mouse001/sub-mouse001.nwb: Asset successfully uploaded
------------------------------ Captured log call -------------------------------
DEBUG    dandi:dandiapi.py:197 GET http://localhost:8000/api/dandisets/000112/versions/draft/assets/
DEBUG    dandi:dandiapi.py:261 Response: 200
DEBUG    dandi:dandiapi.py:197 GET http://localhost:8000/api/dandisets/000112/versions/draft/assets/b770f47f-b67f-4a58-8883-09a2466eb577/
DEBUG    dandi:dandiapi.py:261 Response: 200
DEBUG    dandi:dandiapi.py:197 GET http://localhost:8000/api/dandisets/000112/versions/draft/assets/b770f47f-b67f-4a58-8883-09a2466eb577/
DEBUG    dandi:dandiapi.py:261 Response: 200
DEBUG    dandi:dandiapi.py:197 GET http://localhost:8000/api/dandisets/000112/versions/draft/assets/b770f47f-b67f-4a58-8883-09a2466eb577/
DEBUG    dandi:dandiapi.py:261 Response: 200

---------- coverage: platform linux, python 3.8.18-final-0 -----------

@jwodder please remind what those trailing requests are for -- waiting for validation or smth? I wonder if it is result of some recent change...

@jwodder
Copy link
Member

jwodder commented Jun 6, 2024

@yarikoptic That's normal. The test performs three methods calls on the remote asset object that each ultimately end up having to request the asset's metadata from the server.

@yarikoptic
Copy link
Member Author

ah, if that is not a sign of repeated rerequests -- can you tell from above what is timing out?

@jwodder
Copy link
Member

jwodder commented Jun 6, 2024

@yarikoptic fsspec is timing out while trying to read the asset over S3.

@yarikoptic
Copy link
Member Author

ok, let's shelf it for now but if we keep coming to it, might need smaller asset I guess.

@yarikoptic
Copy link
Member Author

it hit us in daily tests again https://github.com/dandi/dandi-cli/actions/runs/10026093264/job/27710002745 and apparently it is quite common

dandi@drogon:/mnt/backup/dandi/tinuous-logs/dandi-cli/2024/07$ git grep 'dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout' | grep '/test '
05/github/cron/20240705T060344/e2d963d/Tests/5704/test (ubuntu-latest, 3.8, dandi-api)/10_Run Dandi API tests only.txt:2024-07-05T06:19:22.3952144Z FAILED dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout >300.0s
07/github/cron/20240707T060412/e2d963d/Tests/5706/test (ubuntu-latest, 3.10, normal)/8_Run all tests.txt:2024-07-07T06:22:14.4078182Z FAILED dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout >300.0s
15/github/cron/20240715T060342/547d7b8/Tests/5716/test (ubuntu-latest, 3.10, normal)/8_Run all tests.txt:2024-07-15T06:22:43.5684157Z FAILED dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout >300.0s
15/github/cron/20240715T060342/547d7b8/Tests/5716/test (ubuntu-latest, 3.8, dandi-api)/10_Run Dandi API tests only.txt:2024-07-15T06:23:20.7772926Z FAILED dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout >300.0s
16/github/cron/20240716T060409/e975d9a/Tests/5719/test (ubuntu-latest, 3.9, normal)/8_Run all tests.txt:2024-07-16T06:27:03.5490446Z FAILED dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout >300.0s
19/github/cron/20240719T060339/e975d9a/Tests/5722/test (ubuntu-latest, 3.9, normal)/8_Run all tests.txt:2024-07-19T06:23:03.5577525Z FAILED dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout >300.0s
21/github/cron/20240721T060314/e975d9a/Tests/5724/test (ubuntu-latest, 3.12, normal)/8_Run all tests.txt:2024-07-21T06:25:01.5039466Z FAILED dandi/tests/test_metadata.py::test_nwb2asset_remote_asset - Failed: Timeout >300.0s

@yarikoptic yarikoptic reopened this Jul 21, 2024
@jwodder
Copy link
Member

jwodder commented Jul 22, 2024

@yarikoptic I don't believe there's anything that can be done about this. It's almost certainly a network and/or S3 problem (or conceivably some inefficiency in fsspec, but I'm not about to debug that).

@yarikoptic
Copy link
Member Author

I would expect some layer to time out more promptly and so that we have a record that some attempts are repeated. If it just stalls -- it might be something more severe/worth fixing. Let's see how often it would come about (there was a fresh cases in dandi-schema)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Add or improve existing tests
Projects
None yet
Development

No branches or pull requests

2 participants