ZeroDivisionError: float division by zero #93

Open
anke-king opened this issue Mar 25, 2024 · 10 comments
Assignees
Labels
help wanted Extra attention is needed

Comments

@anke-king

Describe the bug
Following your tutorial with my own data:
li.mt.rank_aggregate.by_sample(
    adata,
    groupby=groupby,
    sample_key=sample_key,  # sample key by which we wish to loop
    use_raw=False,
    verbose=True,  # use 'full' to show all verbose information
    n_perms=100,  # reduce permutations for speed
    return_all_lrs=True,  # return all LR values
)
I get a zero division error.
To Reproduce
If possible please provide a minimal reproducible example.
For example, a downsampled version of your anndata object.

Screenshots
File ~/miniconda3/lib/python3.9/site-packages/liana/method/sc/_liana_pipe.py:300, in _get_lr(adata, resource, groupby_pairs, relevant_cols, mat_mean, mat_max, de_method, base, verbose)
298 dedict[label]['zscores'] = temp.layers['scaled'].mean(axis=0)
299 if logfc_flag:
--> 300 dedict[label]['logfc'] = _calc_log2fc(adata, label)
301 if isinstance(mat_max, np.float32): # cellchat flag
302 dedict[label]['trimean'] = _trimean(temp.X / mat_max)

File ~/miniconda3/lib/python3.9/site-packages/liana/method/sc/_liana_pipe.py:342, in _calc_log2fc(adata, label)
340 # subject and rest means
341 subj_means = subject.layers['normcounts'].mean(0).A.flatten()
--> 342 rest_means = rest.layers['normcounts'].mean(0).A.flatten()
344 # log2 + 1 transform
345 subj_log2means = np.log2(subj_means + 1)

File ~/miniconda3/lib/python3.9/site-packages/scipy/sparse/_base.py:1191, in spmatrix.mean(self, axis, dtype, out)
1189 # axis = 0 or 1 now
1190 if axis == 0:
-> 1191 return (inter_self * (1.0 / self.shape[0])).sum(
1192 axis=0, dtype=res_dtype, out=out)
1193 else:
1194 return (inter_self * (1.0 / self.shape[1])).sum(
1195 axis=1, dtype=res_dtype, out=out)

ZeroDivisionError: float division by zero

variables used:
sample_key = 'sample' (sample key)
condition_key = 'cell_type' (2 cats: malignant/healthy)
groupby = 'day' (7 cats: 7 days)

@anke-king added the bug (Something isn't working) and help wanted (Extra attention is needed) labels on Mar 25, 2024
@dbdimitrov
Collaborator

Hi @anke-king,

Is it possible that you have unexpected values in the AnnData object? For example, zeroes or NaNs?

@anke-king
Author

Hi, thanks for your reply.
I checked, and I don't have nans or zeros.

@dbdimitrov
Collaborator

@anke-king, apologies, I meant negative values, not zeroes.
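
A quick way to answer this kind of question is to count the suspicious entries directly. The following is a minimal sketch in plain Python, not part of LIANA; the helper name `summarize_values` and the toy matrix are made up for illustration, and the rows are assumed to be dense sequences of numbers.

```python
import math


def summarize_values(rows):
    """Count NaN and negative entries in an iterable of numeric rows."""
    n_nan = 0
    n_negative = 0
    for row in rows:
        for value in row:
            if math.isnan(value):
                n_nan += 1
            elif value < 0:
                n_negative += 1
    return {"nan": n_nan, "negative": n_negative}


# Toy 2 x 3 matrix (hypothetical values): one NaN and one negative entry.
toy = [[0.0, 1.2, -0.5], [float("nan"), 2.0, 0.0]]
report = summarize_values(toy)
print(report)
```

For a real dataset the rows would come from the expression matrix; a sparse matrix would first need to be converted to a dense one (for example with `.toarray()`), which may be costly for large objects.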

@anke-king
Author

anke-king commented Apr 4, 2024

I do have negative values, as the data is normalized, scaled and transformed. Does your tool need raw data? Because in the tutorial you also do standard preprocessing steps.

@dbdimitrov
Collaborator

Hi @anke-king, notice that in the tutorial I use the log-normalized counts, not the scaled ones :)
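
The difference between the two representations can be illustrated with a small sketch. It assumes a typical preprocessing scheme (library-size normalization to a target sum, then log1p, then per-gene z-scoring); the counts, totals, and `target_sum` below are made-up values, not taken from the tutorial or the reporter's data.

```python
import math
import statistics

# Hypothetical raw counts of one gene in four cells, and each cell's total counts.
gene_counts = [0, 3, 10, 1]
cell_totals = [1000, 1500, 2000, 500]
target_sum = 10_000  # assumed normalization target

# Log-normalized values: log1p of size-normalized counts, never below zero.
log_normalized = [
    math.log1p(count / total * target_sum)
    for count, total in zip(gene_counts, cell_totals)
]

# Scaled values: z-scores of the log-normalized values, centred on zero.
mean = statistics.fmean(log_normalized)
sd = statistics.pstdev(log_normalized)
scaled = [(value - mean) / sd for value in log_normalized]

has_negative_lognorm = min(log_normalized) < 0  # False
has_negative_scaled = min(scaled) < 0  # True
print(has_negative_lognorm, has_negative_scaled)
```

Because z-scoring centres each gene on zero, scaled data contains negative values whenever a gene varies across cells, whereas log-normalized counts stay non-negative.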

@dbdimitrov removed the bug (Something isn't working) label on Apr 4, 2024
@anke-king
Author

Thanks for your help! I changed my code and now use log-transformed values (so no NaNs, no negatives) instead of scaled ones. However, I still get the same error.

@anke-king
Author

I even tried adding 1e-6 to each value and only using highly variable genes, because I thought maybe the algorithm divides by zero somewhere, but the float division by zero error still occurs.

@dbdimitrov
Collaborator

Hi @anke-king,

You can share a reprex and I can test it on my end.

Best if I get a subset of your data and the code relevant for running LIANA.

@nrclaudio

Hey, giving my two cents here as I ran into this issue today. Make sure that each of your sample_key annotations has more than one groupby annotation (e.g., cell types). Otherwise when computing log2fc, rest will be an empty AnnData.

Maybe a good idea to check this beforehand and raise an error to the user?
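
The connection to the traceback can be sketched in plain Python. The helper `column_means` below is a hypothetical stand-in, not scipy's or LIANA's code; it only mirrors the `1.0 / self.shape[0]` factor visible in the traceback, which becomes `1.0 / 0` when `rest` has no rows.

```python
def column_means(rows, n_cols):
    """Column means computed as sum * (1.0 / n_rows), like the traceback's scipy line."""
    n_rows = len(rows)
    scale = 1.0 / n_rows  # raises ZeroDivisionError when n_rows == 0
    return [sum(row[j] for row in rows) * scale for j in range(n_cols)]


# Toy sample in which every cell carries the same groupby label (made-up data).
labels = ["day1", "day1"]
values = [[1.0, 2.0], [3.0, 4.0]]

# Split into the label of interest and everything else, as a log2FC step would.
subject = [v for v, lab in zip(values, labels) if lab == "day1"]
rest = [v for v, lab in zip(values, labels) if lab != "day1"]

subject_means = column_means(subject, 2)

try:
    column_means(rest, 2)
    rest_failed = False
except ZeroDivisionError:
    rest_failed = True  # `rest` is empty, so the mean divides by zero rows

print(subject_means, rest_failed)
```

This would also explain why adding 1e-6 to the values made no difference: the division by zero comes from the number of rows, not from the values themselves.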

@dbdimitrov
Collaborator

Hi @nrclaudio,

Thanks. I will make sure to check for this :)
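
Until such a check exists in the library, a pre-flight check of the kind suggested above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical and not LIANA's API, and the inputs are any two equally long iterables of labels.

```python
from collections import defaultdict


def samples_with_single_group(sample_labels, group_labels):
    """Return the samples that contain fewer than two distinct groupby labels."""
    groups_per_sample = defaultdict(set)
    for sample, group in zip(sample_labels, group_labels):
        groups_per_sample[sample].add(group)
    return sorted(s for s, groups in groups_per_sample.items() if len(groups) < 2)


def check_groups_per_sample(sample_labels, group_labels):
    """Raise a readable error instead of a ZeroDivisionError deep in the pipeline."""
    offending = samples_with_single_group(sample_labels, group_labels)
    if offending:
        raise ValueError(
            f"Samples with fewer than two groupby categories: {offending}"
        )


# Toy labels (made up): sample "s2" only contains cells from "day1".
samples = ["s1", "s1", "s2", "s2"]
groups = ["day1", "day2", "day1", "day1"]
offending = samples_with_single_group(samples, groups)
print(offending)
```

With an AnnData object, the two inputs would be the columns `adata.obs[sample_key]` and `adata.obs[groupby]`; calling `check_groups_per_sample` on them before `by_sample` would surface the problem early.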

3 participants