Error: X needs to be 2-dimensional, not 1-dimensional for bam file with more than 3000 cells

[Running_scTE_pipeline_output_error.txt](https://github.com/user-attachments/files/17848347/Running_scTE_pipeline_output_error.txt)

Hello @jphe @l1y1y @oaxiom @carmarpe,

I was wondering if any of you could help me.

I am running the scTE pipeline and succeeded once: I obtained the Python output object and converted it to a Seurat object. In that first run I counted genes and TE family IDs only (approximately 1167 family IDs). Now I am re-running scTE on all individual TE fragments across all chromosomes, which gives more than 3 million features. I built my own custom-made index for the human hg38 genome, but now I get the following error.

 **File "/data/users/ohoare/Analysis_space/Human_RT_SMARCB1_deficient/scRNA_seq_TE_Analysis/CondaEnvscRNA_seq/lib/python3.9/site-packages/anndata-0.10.7-py3.9.egg/anndata/_core/anndata.py", line 107, in _check_2d_shape
    raise ValueError(
ValueError: X needs to be 2-dimensional, not 1-dimensional**


I have attached the log file with the error. I tried this on one single-cell experiment with more than 3000 cells and on another with 5000 cells, so the BAM file is not too small. When I tried `--hdf5 False`, it just gave me an empty .csv file.


Here are the steps I used. I ran Cell Ranger version 7.2 with modified parameters that allow multi-mapping, and with a custom-made human reference genome (TEs and genes). I then took the output .bam file from Cell Ranger and removed the reads with no cell barcode. Here is the code I used.

**# Find multi-mapped reads, i.e. reads that map to more than one locus according to the NH tag (the number of loci the read can map to) and that have a MAPQ score not equal to 255.**
`samtools view -h possorted_genome_bam.bam | grep -E "^\@|NH:i:2" | awk 'BEGIN{FS="\t"} $5!=255' > multi_mapped_reads.sam`

**# Generate a multi-mapped BAM file** 
`samtools view -S -b multi_mapped_reads.sam -o multi_mapped_reads.bam`
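To make the NH filter above easier to review, here is a minimal sketch on made-up SAM-like records (the read names and field values are hypothetical, not from my data). It shows that `NH:i:2` is a substring match, so it keeps `NH:i:2` and `NH:i:20` but not `NH:i:3`. The second filter, which parses the NH value and keeps anything greater than 1, is only an assumption about what a stricter version could look like; it is not what I ran.

```shell
# Made-up, tab-separated SAM-like records; only MAPQ (column 5) and the NH tag matter here.
records=$(
  printf 'r1\t0\tchr1\t100\t3\t4M\t*\t0\t0\tACGT\tFFFF\tNH:i:2\n'
  printf 'r2\t0\tchr1\t200\t1\t4M\t*\t0\t0\tACGT\tFFFF\tNH:i:3\n'
  printf 'r3\t0\tchr1\t300\t0\t4M\t*\t0\t0\tACGT\tFFFF\tNH:i:20\n'
  printf 'r4\t0\tchr1\t400\t255\t4M\t*\t0\t0\tACGT\tFFFF\tNH:i:1\n'
)

# Filter as in my command: substring match on "NH:i:2", then drop MAPQ 255.
kept_grep=$(printf '%s\n' "$records" \
  | grep -E "^@|NH:i:2" \
  | awk 'BEGIN{FS="\t"} $5!=255 {print $1}' \
  | paste -s -d ' ' -)

# Stricter variant (an assumption, not what I ran): parse NH and keep values > 1.
kept_awk=$(printf '%s\n' "$records" \
  | awk 'BEGIN{FS="\t"} {
      for (i = 12; i <= NF; i++)
        if ($i ~ /^NH:i:/) {
          split($i, a, ":")
          if (a[3] + 0 > 1 && $5 != 255) print $1
        }
    }' \
  | paste -s -d ' ' -)

echo "substring filter keeps: $kept_grep"
echo "NH > 1 filter keeps:    $kept_awk"
```

Note that this multi-mapped BAM is a side product; the file I pass to scTE is the barcode-filtered BAM from the next step.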

**# Filter the reads with no barcodes**
`samtools view possorted_genome_bam.bam -h | awk '/^@/ || /CB:/' | samtools view -h -b > possorted_genome_bam.clean.bam`
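As a sanity check on this filter, the sketch below applies the same `awk` pattern to a few made-up SAM-like records (hypothetical read names and barcodes) and counts the reads kept and the distinct `CB` barcodes among them. On the real file, the `records` variable would be replaced by the output of `samtools view possorted_genome_bam.clean.bam`.

```shell
# Made-up, tab-separated SAM-like records; r2 carries no CB tag.
records=$(
  printf 'r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tFFFF\tCB:Z:AAAC-1\tUB:Z:TTTT\n'
  printf 'r2\t0\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tFFFF\tUB:Z:GGGG\n'
  printf 'r3\t0\tchr1\t300\t255\t4M\t*\t0\t0\tACGT\tFFFF\tCB:Z:AAAC-1\tUB:Z:CCCC\n'
  printf 'r4\t0\tchr1\t400\t255\t4M\t*\t0\t0\tACGT\tFFFF\tCB:Z:GGGT-1\tUB:Z:AAAA\n'
)

# Same awk filter as in my command: keep header lines and lines containing "CB:".
filtered=$(printf '%s\n' "$records" | awk '/^@/ || /CB:/')

# Number of reads that survive the filter.
n_reads=$(printf '%s\n' "$filtered" | awk 'END{print NR}')

# Number of distinct cell barcodes among the surviving reads.
n_barcodes=$(printf '%s\n' "$filtered" \
  | awk 'BEGIN{FS="\t"} {for (i = 12; i <= NF; i++) if ($i ~ /^CB:Z:/) print $i}' \
  | sort -u \
  | awk 'END{print NR}')

echo "reads kept: $n_reads, distinct barcodes: $n_barcodes"
```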

Next I built a custom-made index to generate the scTE output files with TEs and genes counted, allowing for multi-mapping.

**# How to run the single cell transposable elements pipeline to identify genes and TEs**

**# Step 1: Build the custom-made reference index with human genes and TEs by running the shell command below,
# after the scTE pipeline has been installed correctly with samtools etc.**
`scTE_build -te gtf_filtered_RMSK_modified.bed -gene gtf_filtered_HAVANA_ENSEMBL.gtf -o custome_all_TEs`


**# Step 2: Run the scTE pipeline with the below shell script to generate output.**
`scTE -i /data/users/ohoare/Analysis_space/Human_RT_SMARCB1_deficient/Peripheral_nerve/Output/InnovRT-001_1/outs/possorted_genome_bam.clean.bam -o /data/users/ohoare/Analysis_space/Human_RT_SMARCB1_deficient/Peripheral_nerve/Output/InnovRT-001_1/outs/InnovRT_001_1_scTE_output -x /data/users/ohoare/Analysis_space/Human_RT_SMARCB1_deficient/Peripheral_nerve/Input_Data/custome_all_TEs.exclusive.idx --hdf5 True -CB CB -UMI UB`

Could the **Error: X needs to be 2-dimensional, not 1-dimensional** be coming from my custom-made index, even though it worked well at the family ID level? I used exactly the same method the second time, just with a much bigger index.
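One thing I can check on my side is whether the chromosome names in the TE BED file match those in the BAM header. I do not know that this is the cause; it is only a guess at a possible mismatch. The sketch below uses made-up stand-in files (the names `te.bed` and `header.sam` and their contents are hypothetical). On the real data the two inputs would be `gtf_filtered_RMSK_modified.bed` and the output of `samtools view -H` on the BAM.

```shell
# Made-up stand-ins for the real inputs (hypothetical names and coordinates).
tmpdir=$(mktemp -d)
printf 'chr1\t100\t200\tL1_frag1\nchr2\t300\t400\tAlu_frag2\nchrUn_x\t5\t50\tMIR_frag3\n' > "$tmpdir/te.bed"
printf '@HD\tVN:1.4\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:1000\n' > "$tmpdir/header.sam"

# Distinct chromosome names in the BED (column 1).
cut -f1 "$tmpdir/te.bed" | sort -u > "$tmpdir/bed_chroms.txt"

# Distinct reference names from the @SQ lines of the BAM header (SN: field).
awk 'BEGIN{FS="\t"} $1=="@SQ" {for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)}' \
  "$tmpdir/header.sam" | sort -u > "$tmpdir/bam_chroms.txt"

# Names present in the BED but absent from the BAM header.
missing=$(comm -23 "$tmpdir/bed_chroms.txt" "$tmpdir/bam_chroms.txt" | paste -s -d ' ' -)
echo "In BED but not in BAM header: ${missing:-none}"

rm -r "$tmpdir"
```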

I am using Python 3.9, as shown in the attached error log. I think I have used the correct parameters.

Do you have any suggestions, or do you see something I missed? Please feel free to ask any other questions if something is not clear.

Kind regards
Owen

Error: X needs to be 2-dimensional, not 1-dimensional for bam file with more than 3000 cells #106
