ctf won't work if metadata file has extra samples #61

Open
mestaki opened this issue Oct 7, 2022 · 0 comments

mestaki commented Oct 7, 2022

Hey @cameronmartino,

A little feature-request:

QIIME 2 Plugin 'gemelli' version 0.0.8 (from package 'gemelli' version 0.0.8)
q2cli version 2021.8.0

In the example below, filtering on min-sample-count leaves my table with 5 fewer samples than are listed in the sample-metadata file:

!qiime gemelli ctf \
  --i-table table.qza \
  --m-sample-metadata-file ../clean_metadata.tsv \
  --p-state-column stage_char \
  --p-min-sample-count 5000 \
  --p-individual-id-column host_subject_id \
  --output-dir gemelli/ctf-results \
  --verbose 

And so I get the following error:

/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/gemelli/preprocessing.py:884: RuntimeWarning: Subject(s) (131-05,131-09,131-15,131-12,131-07,131-13,131-02,131-08,131-06,131-11) contains multiple samples. Multiple subject counts will be meaned across samples by subject.
  warnings.warn(''.join(["Subject(s) (", str(duplicated_ids),
Traceback (most recent call last):
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/q2cli/commands.py", line 329, in __call__
    results = action(**arguments)
  File "<decorator-gen-564>", line 2, in ctf
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 391, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/gemelli/ctf.py", line 620, in ctf
    helper_results = ctf_helper(table,
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/gemelli/ctf.py", line 669, in ctf_helper
    tensal_results = tensals_helper(table,
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/gemelli/ctf.py", line 799, in tensals_helper
    tensor.construct(table, sample_metadata,
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/gemelli/preprocessing.py", line 855, in construct
    self._construct()
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/gemelli/preprocessing.py", line 891, in _construct
    table[dup[0]] = table.loc[:, dup].mean(axis=1).astype(int)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/pandas/core/indexing.py", line 889, in __getitem__
    return self._getitem_tuple(key)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/pandas/core/indexing.py", line 1069, in _getitem_tuple
    return self._getitem_tuple_same_dim(tup)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/pandas/core/indexing.py", line 775, in _getitem_tuple_same_dim
    retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/pandas/core/indexing.py", line 1113, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/pandas/core/indexing.py", line 1053, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/pandas/core/indexing.py", line 1266, in _get_listlike_indexer
    self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
  File "/home/mestaki/miniconda3/envs/qiime2-2021.8/lib/python3.8/site-packages/pandas/core/indexing.py", line 1321, in _validate_read_indexer
    raise KeyError(
KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Index(['14010.131.08.1SC.swab.28'], dtype='object'). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"

Plugin error from gemelli:

  "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Index(['14010.131.08.1SC.swab.28'], dtype='object'). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"

See above for debug info.
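
For anyone trying to reproduce this outside of QIIME 2, the pandas behaviour behind the traceback can probably be triggered with a toy table. This is only a sketch: the sample IDs `s1`, `s2`, `s3` and the `dup` list are made up, with `s3` standing in for a sample that is in the metadata but was filtered out of the table.

```python
import pandas as pd

# Toy feature table: rows are features, columns are sample IDs (made-up names).
table = pd.DataFrame({"s1": [10, 0], "s2": [4, 6]}, index=["f1", "f2"])

# "s3" stands in for a sample listed in the metadata but no longer in the table.
dup = ["s1", "s3"]

try:
    # Same indexing pattern as the failing line in the traceback.
    table.loc[:, dup]
    raised = False
except KeyError as err:
    # The exact message text depends on the pandas version.
    raised = True
    message = str(err)

print(raised)
```

On pandas 1.0 or later, selecting a list of labels with `.loc` raises `KeyError` as soon as one label is missing.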

The issue is resolved if I manually remove those 5 samples from my metadata beforehand.
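
The manual workaround could be scripted with pandas along these lines. This is a rough sketch under assumptions: `drop_missing_samples` is a hypothetical helper (not part of gemelli), the metadata is assumed to be indexed by sample ID, and `table_sample_ids` is assumed to be the list of samples that survive the min-sample-count filter, however that list is obtained.

```python
import pandas as pd


def drop_missing_samples(metadata: pd.DataFrame, table_sample_ids) -> pd.DataFrame:
    """Keep only the metadata rows whose sample ID is present in the table.

    Hypothetical helper for illustration; assumes metadata is indexed by sample ID.
    """
    keep = metadata.index.isin(list(table_sample_ids))
    return metadata.loc[keep]


# Toy metadata with made-up sample IDs; column names follow the command above.
metadata = pd.DataFrame(
    {
        "host_subject_id": ["131-05", "131-05", "131-08"],
        "stage_char": ["a", "b", "a"],
    },
    index=pd.Index(["s1", "s2", "s3"], name="sample-id"),
)

# Pretend "s3" was dropped from the table by the min-sample-count filter.
filtered = drop_missing_samples(metadata, ["s1", "s2"])
print(list(filtered.index))
```

The filtered frame could then be written back to a TSV and passed as the sample-metadata file.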

It would be nice if gemelli could check/filter the sample metadata after filtering the input table, or include an ignore-missing-samples option like the one in empress.
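
As a rough idea of what such a check might look like, here is a sketch built around the line from the traceback (`table[dup[0]] = table.loc[:, dup].mean(axis=1).astype(int)`). The function name `mean_duplicates` and the toy data are hypothetical, and this is not how gemelli is actually structured internally; it only shows restricting the duplicated sample IDs to those still present in the table before averaging.

```python
import pandas as pd


def mean_duplicates(table: pd.DataFrame, dup) -> pd.DataFrame:
    """Average duplicated samples, skipping IDs that are not in the table.

    Hypothetical helper for illustration only.
    """
    # Keep only the sample IDs that survived table filtering.
    present = [s for s in dup if s in table.columns]
    if not present:
        return table
    table = table.copy()
    # Same averaging as the line in the traceback, applied to present IDs only.
    table[present[0]] = table.loc[:, present].mean(axis=1).astype(int)
    return table


# Toy table with made-up sample IDs; "s3" is in the metadata but not the table.
table = pd.DataFrame({"s1": [10, 0], "s2": [4, 6]}, index=["f1", "f2"])
result = mean_duplicates(table, ["s1", "s2", "s3"])
print(result["s1"].tolist())
```

Whether missing samples should be silently skipped or reported with a warning is a design choice for the maintainers.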
