Running pycistarget error #30

Open
juliasalas01 opened this issue Feb 13, 2024 · 3 comments

juliasalas01 commented Feb 13, 2024

When running pycistarget I get the error:

2024-02-13 11:53:34,200 cisTarget    INFO     Getting cistromes for NC_000022.11
/home/juliasalas/miniconda3/envs/new_conda/lib/python3.8/site-packages/pycistarget/motif_enrichment_cistarget.py:301: FutureWarning: Passing a set as an indexer is deprecated and will raise in a future version. Use a list instead.
  self.regions_to_db = ctx_db.regions_to_db[self.name] if type(ctx_db.regions_to_db) == dict else ctx_db.regions_to_db.loc[set(coord_to_region_names(self.region_set)) & set(ctx_db.regions_to_db['Target'])]
2024-02-13 11:53:34,309 cisTarget    INFO     Running cisTarget for NC_000023.11 which has 10 regions
2024-02-13 11:53:34,479 cisTarget    INFO     Annotating motifs for NC_000023.11
2024-02-13 11:53:36,078 cisTarget    INFO     Getting cistromes for NC_000023.11
2024-02-13 11:53:36,270 cisTarget    INFO     Done!
2024-02-13 11:53:36,271 pycisTarget_wrapper INFO     /home/juliasalas/piRNA/Workspaces/julia/motifs/CTX_increased_clusters_hum1_All folder already exists.
2024-02-13 11:53:36,736 pycisTarget_wrapper INFO     Running cisTarget without promoters for increased_clusters_hum1
Traceback (most recent call last):
  File "/home/juliasalas/miniconda3/envs/new_conda/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3800, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 203, in pandas._libs.index.IndexEngine._get_loc_duplicates
  File "pandas/_libs/index.pyx", line 211, in pandas._libs.index.IndexEngine._maybe_get_bool_indexer
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index._unpack_bool_indexer
KeyError: 'Query'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "regions.py", line 9, in <module>
    run_pycistarget(region_sets,
  File "/faststorage/project/piRNA/Workspaces/julia/motifs/scenicplus/src/scenicplus/wrappers/run_pycistarget.py", line 224, in run_pycistarget
    db_regions = set(pd.concat([ctx_db.regions_to_db[x] for x in ctx_db.regions_to_db.keys()])['Query'])
  File "/home/juliasalas/miniconda3/envs/new_conda/lib/python3.8/site-packages/pandas/core/series.py", line 982, in __getitem__
    return self._get_value(key)
  File "/home/juliasalas/miniconda3/envs/new_conda/lib/python3.8/site-packages/pandas/core/series.py", line 1092, in _get_value
    loc = self.index.get_loc(label)
  File "/home/juliasalas/miniconda3/envs/new_conda/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    raise KeyError(key) from err
KeyError: 'Query'

Python version: 3.8 (also tried with 3.10.13)
Pycistarget: 1.0.3.dev2+g81eb875
Pandas: 1.5.0

I am using the motifs from here: https://resources.aertslab.org/cistarget/motif_collections/v10nr_clust_public/snapshots/#:~:text=motifs%2Dv10%2Dnr.hgnc%2Dm0.00001%2Do0.0.tbl
https://resources.aertslab.org/cistarget/motif_collections/v10nr_clust_public/singletons/

@SeppeDeWinter
Collaborator

Hi @juliasalas01

Can you show the command that you are running?

I suspect that your region sets are not formatted correctly: the log shows the function running once per chromosome instead of once per region set.

Your region sets dictionary should be a dictionary of dictionaries, see the example below:

for key in region_sets.keys():
    print(f'{key}: {region_sets[key].keys()}')

topics_otsu: dict_keys(['Topic1', 'Topic2', 'Topic3', 'Topic4', 'Topic5', 'Topic6', 'Topic7', 'Topic8', 'Topic9', 'Topic10', 'Topic11', 'Topic12', 'Topic13', 'Topic14', 'Topic15', 'Topic16'])
topics_top_3: dict_keys(['Topic1', 'Topic2', 'Topic3', 'Topic4', 'Topic5', 'Topic6', 'Topic7', 'Topic8', 'Topic9', 'Topic10', 'Topic11', 'Topic12', 'Topic13', 'Topic14', 'Topic15', 'Topic16'])
DARs: dict_keys(['B_cells_1', 'B_cells_2', 'CD14+_Monocytes', 'CD4_T_cells', 'CD8_T_cells', 'Dendritic_cells', 'FCGR3A+_Monocytes', 'NK_cells'])

Can you run the same code for your region sets and provide the output?
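A quick structural check can complement the printout above. This is only a sketch: `check_region_sets` is a hypothetical helper, not part of pycistarget or scenicplus, and it only tests the two-level layout (outer keys naming a group, inner keys naming a region set as a string). It does not verify that the inner values are PyRanges objects.

```python
from collections.abc import Mapping


def check_region_sets(region_sets):
    """Return a list of layout problems; an empty list means the shape looks fine.

    Hypothetical helper for illustration only. It checks the
    dictionary-of-dictionaries layout, not the type of the inner values.
    """
    if not isinstance(region_sets, Mapping):
        return ["region_sets is not a dictionary"]

    problems = []
    for group_name, members in region_sets.items():
        # Each outer value must itself be a dictionary of named region sets.
        if not isinstance(members, Mapping):
            problems.append(
                f"{group_name!r}: expected a dictionary, got {type(members).__name__}"
            )
            continue
        # Inner keys are expected to be names such as 'Topic1' or 'B_cells_1'.
        bad_keys = [key for key in members if not isinstance(key, str)]
        if bad_keys:
            problems.append(
                f"{group_name!r}: {len(bad_keys)} inner key(s) are not strings, "
                f"e.g. {bad_keys[0]!r}"
            )
    return problems


if __name__ == "__main__":
    # Placeholder values stand in for PyRanges objects here.
    example = {"DARs": {"B_cells_1": None, "NK_cells": None}}
    print(check_region_sets(example))
```

A dictionary keyed by chromosome, with coordinate tuples as inner keys, would be flagged by this check because the inner keys are not strings.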

All the best,

Seppe

@juliasalas01
Author

Hi!
Thank you for your response. My regions input is a BED file, which I transformed into a dictionary like this:
{'NC_000001.11': {('6608557', '6636255'): None, ('6623250', '6647368'): None, ('9140669', '9161757'): None, ('9252085', '9280779'): None, ('10364269', '10391240'): None}
However, I am not sure this format is compatible.

Thanks

@SeppeDeWinter
Collaborator

Hi @juliasalas01

No, that does not look right.

You can read your BED file like this:

import pyranges as pr
regions = pr.read_bed("<PATH_TO_BED_FILE>")

Then produce a region sets dictionary like this (in case you only have a single BED file):

region_sets = {
    "set1": {"bed_file1": regions}
}
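For more than one BED file, the same layout can be built in a loop. The following is a minimal sketch, assuming each file should become one named region set inside a single group. `build_region_sets` and its `read_bed` parameter are hypothetical names introduced here; by default the function falls back to `pyranges.read_bed`, as in the snippet above.

```python
from pathlib import Path


def build_region_sets(group_name, bed_paths, read_bed=None):
    """Build {group_name: {file_stem: regions}} from a list of BED file paths.

    Hypothetical helper for illustration. `read_bed` is any callable that
    takes a path string and returns the regions; it defaults to
    pyranges.read_bed, imported lazily so the function can be tried
    without pyranges by passing a different reader.
    """
    if read_bed is None:
        import pyranges as pr

        read_bed = pr.read_bed

    # Name each inner entry after its file, e.g. 'peaks.bed' -> 'peaks'.
    return {group_name: {Path(path).stem: read_bed(str(path)) for path in bed_paths}}
```

Calling `build_region_sets("set1", ["<PATH_TO_BED_FILE>"])` would give the same shape as the single-file dictionary above, with the inner key taken from the file name.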

I hope this helps.

All the best,

Seppe
