test_hpoa is failing, but only on Python 3.9 #95

Open
caufieldjh opened this issue Oct 9, 2024 · 0 comments
test_hpoa fails, but not consistently (at least once I've been able to clear it by re-running the test), and (usually?) only on Python 3.9.

Full log below:

================================== FAILURES ===================================
_______________________________ test_hpoa[True] ________________________________

group_by_publication = True

    @pytest.mark.parametrize("group_by_publication", [True, False])
    def test_hpoa(group_by_publication):
        wrapper = HPOAWrapper(group_by_publication=group_by_publication)
        with open(INPUT_DIR / "example-phenotype-hpoa.tsv") as file:
>           vars = list(wrapper.objects_from_file(file))

tests/wrappers/test_hpoa.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/curategpt/wrappers/clinical/hpoa_wrapper.py:119: in objects_from_file
    yield from self.objects_from_rows(rows)
src/curategpt/wrappers/clinical/hpoa_wrapper.py:100: in objects_from_rows
    pubs = self.pubmed_wrapper.objects_by_ids(pmids)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = PubmedWrapper(source_locator=None, local_store=None, extractor=None, eutils_client=None, session=<CachedSession(cache=...ngs=CacheSettings(expire_after=-1))>, where=None, email=None, ncbi_key=None, is_fetch_full_text=None, _uses_cache=True)
object_ids = ['PMID:33743206']

    def objects_by_ids(self, object_ids: List[str]) -> List[Dict]:
        pubmed_ids = sorted([x.replace("PMID:", "") for x in object_ids])
        session = self.session
        logger.debug(f"Using session: {session} [cached: {self._uses_cache} for {pubmed_ids}")
    
        # Parameters for the efetch request
        efetch_params = {
            "db": "pubmed",
            "id": ",".join(pubmed_ids),  # Combine PubMed IDs into a comma-separated string
            "rettype": "medline",
            "retmode": "text",
        }
        efetch_response = session.get(EFETCH_URL, params=efetch_params)
        if not self._uses_cache or not efetch_response.from_cache:
            # throttle if not using cache or if not cached
            logger.debug(f"Sleeping for {RATE_LIMIT_DELAY} seconds")
            time.sleep(RATE_LIMIT_DELAY)
        if not efetch_response.ok:
            logger.error(f"Failed to fetch data for {pubmed_ids}")
>           raise ValueError(
                f"Failed to fetch data for {pubmed_ids} using {session} and {efetch_params}"
            )
E           ValueError: Failed to fetch data for ['33743206'] using <CachedSession(cache=<SQLiteCache(name=hpoa_pubmed_cache)>, settings=CacheSettings(expire_after=-1))> and {'db': 'pubmed', 'id': '33743206', 'rettype': 'medline', 'retmode': 'text'}

src/curategpt/wrappers/literature/pubmed_wrapper.py:168: ValueError
----------------------------- Captured stderr call -----------------------------

Downloading hp.db.gz: 0.00B [00:00, ?B/s]
Downloading hp.db.gz:   0%|          | 8.00k/87.2M [00:00<47:10, 32.3kB/s]
Downloading hp.db.gz:   1%|          | 1.12M/87.2M [00:00<00:21, 4.24MB/s]
Downloading hp.db.gz:   9%|▉         | 7.99M/87.2M [00:00<00:03, 25.1MB/s]
Downloading hp.db.gz:  18%|█▊        | 16.0M/87.2M [00:00<00:01, 42.5MB/s]
Downloading hp.db.gz:  21%|██▏       | 18.6M/87.2M [00:00<00:02, 30.9MB/s]
Downloading hp.db.gz:  26%|██▌       | 22.3M/87.2M [00:00<00:02, 30.1MB/s]
Downloading hp.db.gz:  28%|██▊       | 24.0M/87.2M [00:01<00:02, 24.3MB/s]
Downloading hp.db.gz:  37%|███▋      | 32.0M/87.2M [00:01<00:01, 34.6MB/s]
Downloading hp.db.gz:  44%|████▍     | 38.3M/87.2M [00:01<00:01, 40.1MB/s]
Downloading hp.db.gz:  46%|████▌     | 40.0M/87.2M [00:01<00:01, 31.8MB/s]
Downloading hp.db.gz:  53%|█████▎    | 46.3M/87.2M [00:01<00:01, 31.7MB/s]
Downloading hp.db.gz:  55%|█████▌    | 48.0M/87.2M [00:01<00:01, 24.2MB/s]
Downloading hp.db.gz:  63%|██████▎   | 54.8M/87.2M [00:01<00:01, 33.9MB/s]
Downloading hp.db.gz:  64%|██████▍   | 56.0M/87.2M [00:02<00:01, 28.8MB/s]
Downloading hp.db.gz:  71%|███████▏  | 62.3M/87.2M [00:02<00:00, 36.8MB/s]
Downloading hp.db.gz:  73%|███████▎  | 64.0M/87.2M [00:02<00:00, 31.7MB/s]
Downloading hp.db.gz:  81%|████████  | 70.3M/87.2M [00:02<00:00, 35.8MB/s]
Downloading hp.db.gz:  83%|████████▎ | 72.0M/87.2M [00:02<00:00, 30.6MB/s]
Downloading hp.db.gz:  90%|████████▉ | 78.3M/87.2M [00:02<00:00, 33.6MB/s]
Downloading hp.db.gz:  92%|█████████▏| 80.0M/87.2M [00:02<00:00, 24.4MB/s]
Downloading hp.db.gz:  99%|█████████▉| 86.3M/87.2M [00:03<00:00, 27.9MB/s]
                                                                          
------------------------------ Captured log call -------------------------------
ERROR    curategpt.wrappers.literature.pubmed_wrapper:pubmed_wrapper.py:167 Failed to fetch data for ['33743206']
_______________________________ test_hpoa[False] _______________________________

group_by_publication = False

    @pytest.mark.parametrize("group_by_publication", [True, False])
    def test_hpoa(group_by_publication):
        wrapper = HPOAWrapper(group_by_publication=group_by_publication)
        with open(INPUT_DIR / "example-phenotype-hpoa.tsv") as file:
>           vars = list(wrapper.objects_from_file(file))

tests/wrappers/test_hpoa.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/curategpt/wrappers/clinical/hpoa_wrapper.py:119: in objects_from_file
    yield from self.objects_from_rows(rows)
src/curategpt/wrappers/clinical/hpoa_wrapper.py:100: in objects_from_rows
    pubs = self.pubmed_wrapper.objects_by_ids(pmids)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = PubmedWrapper(source_locator=None, local_store=None, extractor=None, eutils_client=None, session=<CachedSession(cache=...ngs=CacheSettings(expire_after=-1))>, where=None, email=None, ncbi_key=None, is_fetch_full_text=None, _uses_cache=True)
object_ids = ['PMID:33743206']

    def objects_by_ids(self, object_ids: List[str]) -> List[Dict]:
        pubmed_ids = sorted([x.replace("PMID:", "") for x in object_ids])
        session = self.session
        logger.debug(f"Using session: {session} [cached: {self._uses_cache} for {pubmed_ids}")
    
        # Parameters for the efetch request
        efetch_params = {
            "db": "pubmed",
            "id": ",".join(pubmed_ids),  # Combine PubMed IDs into a comma-separated string
            "rettype": "medline",
            "retmode": "text",
        }
        efetch_response = session.get(EFETCH_URL, params=efetch_params)
        if not self._uses_cache or not efetch_response.from_cache:
            # throttle if not using cache or if not cached
            logger.debug(f"Sleeping for {RATE_LIMIT_DELAY} seconds")
            time.sleep(RATE_LIMIT_DELAY)
        if not efetch_response.ok:
            logger.error(f"Failed to fetch data for {pubmed_ids}")
>           raise ValueError(
                f"Failed to fetch data for {pubmed_ids} using {session} and {efetch_params}"
            )
E           ValueError: Failed to fetch data for ['33743206'] using <CachedSession(cache=<SQLiteCache(name=hpoa_pubmed_cache)>, settings=CacheSettings(expire_after=-1))> and {'db': 'pubmed', 'id': '33743206', 'rettype': 'medline', 'retmode': 'text'}

src/curategpt/wrappers/literature/pubmed_wrapper.py:168: ValueError
------------------------------ Captured log call -------------------------------
ERROR    curategpt.wrappers.literature.pubmed_wrapper:pubmed_wrapper.py:167 Failed to fetch data for ['33743206']
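For reference, not a confirmed fix: since `objects_by_ids` raises `ValueError` on the first non-OK efetch response, and NCBI intermittently rejects requests under rate limiting, one mitigation would be to retry the request with exponential backoff before giving up. A minimal sketch of that idea (the `fetch_with_retries` helper is hypothetical, not part of curategpt; it would wrap the existing `session.get(EFETCH_URL, params=efetch_params)` call):

```python
# Hypothetical helper: retry a GET-style call with exponential backoff
# before raising, instead of failing on the first non-OK response.
import time
from typing import Any, Callable


def fetch_with_retries(
    do_get: Callable[[], Any],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> Any:
    """Call do_get until a response with .ok is returned.

    Sleeps base_delay * 2**attempt between attempts; raises ValueError
    once max_attempts responses have all been non-OK.
    """
    for attempt in range(max_attempts):
        response = do_get()
        if response.ok:
            return response
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise ValueError(f"Failed after {max_attempts} attempts")
```

In `objects_by_ids`, the call site would then become something like `efetch_response = fetch_with_retries(lambda: session.get(EFETCH_URL, params=efetch_params))`. Whether this actually resolves the Python 3.9 CI flake is unverified; it may only mask an underlying rate-limit or cache issue.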