In the 20230830 release, there is a mismatch in the number of cells between the expression matrix and metadata for the Allen MERFISH data. Metadata has 3938808 cells, and the expression matrix has 4334174 cells.
Both of these numbers also differ from the 20230630 release, where the two datasets had the same number of cells (4330907).
If the cell numbers are not the same, the spatial data becomes useless, as you can't match cells to their xy positions. For example, I suspect that the notebooks merfish_tutorial_1, 2a, and 2b show inaccurate maps of gene expression due to this issue (depending on how the filtered cells are distributed across sections).
I can't explain the number mismatch, but expect it's due to changes in some QC criteria - maybe @mkunst23 can?
Just to note though, this is not an issue for using the remaining data as long as you join the anndata and metadata properly using the cell IDs.
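A minimal sketch of such a join, using toy data in place of the real files. It assumes the expression matrix's observation names and the metadata table share the same cell ID values (here called `cell_label`); the IDs and coordinates below are made up for illustration.

```python
import pandas as pd

# Toy stand-ins: with the real data, `obs_names` would be adata.obs_names and
# `cell` the metadata table indexed by its cell ID column (assumed: cell_label).
obs_names = pd.Index(["c1", "c2", "c3", "c4"], name="cell_label")
cell = pd.DataFrame(
    {"x": [0.1, 0.2, 0.3], "y": [1.0, 2.0, 3.0]},
    index=pd.Index(["c3", "c1", "c9"], name="cell_label"),
)

# Keep only the IDs present on both sides, in expression-matrix order.
common = obs_names[obs_names.isin(cell.index)]

# Reorder the metadata rows to match that order.
aligned = cell.loc[common]
print(aligned)
```

With an AnnData object, the matrix can be subset the same way (`adata[common]`), so that row i of the expression data and row i of `aligned` refer to the same cell.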
Metadata was loaded with:
import os
import pandas as pd
import anndata

# `metadata` and `expression_matrices` are assumed to come from the release manifest
download_base = '/orangedata/ExternalData/Allen_WMB_2023Sep05'
rpath = metadata['cell_metadata']['files']['csv']['relative_path']
file = os.path.join(download_base, rpath)
cell = pd.read_csv(file, dtype={"cell_label": str})
cell.shape
Expression was loaded with:
filename = expression_matrices['C57BL6J-638850']['raw']['files']['h5ad']['relative_path']
adata = anndata.read_h5ad(os.path.join(download_base, filename))
adata.shape
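Comparing the two shapes only shows that the counts differ. A rough way to see how the ID sets relate is to count shared and unshared IDs. The helper below is hypothetical (`compare_ids` is not part of any library) and is run here on toy IDs.

```python
def compare_ids(metadata_ids, matrix_ids):
    """Summarize the overlap between two collections of cell IDs."""
    meta, mat = set(metadata_ids), set(matrix_ids)
    return {
        "shared": len(meta & mat),
        "metadata_only": len(meta - mat),
        "matrix_only": len(mat - meta),
    }

# Toy IDs; with the real data one would pass cell["cell_label"] and
# adata.obs_names (assuming the obs names hold the same cell labels).
summary = compare_ids(["a", "b", "c"], ["b", "c", "d", "e"])
print(summary)
```

If `metadata_only` comes back as zero on the real data, the metadata would be a strict subset of the matrix, which would be consistent with extra filtering on the metadata side.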