Hi IsoTools team,
Thank you very much for developing and maintaining this valuable tool!
I am trying to integrate PacBio data with IsoTools but am running into some difficulties, and I would be thankful if you could share some of your experience with me.
After getting the PacBio output, I ran the Iso-Seq and Pigeon pipelines to create a Seurat object. I filtered out low-quality cells based on UMI counts and gene expression characteristics. At this point I wish to perform pseudo-bulking of the single cells and use IsoTools to search for differentially spliced genes.
I have previously managed to filter mapped reads from pbmm2 .bam files and to create an IsoTools object from them. I then performed the relevant steps from your tutorials to find differentially spliced genes. Thank you!
However, as part of that process the transcripts were redefined by the IsoTools algorithm. I now wish to use the Iso-Seq-defined transcripts in IsoTools instead. To do so, I retrieved an isoform abundance table (.csv file) from the Seurat object and a .gff file from Pigeon. Following your online Transcriptome Import tutorial, I tried running the IsoTools add_sample_from_csv() function, but the function does not run to completion.
Here is example code with the error messages:
isotools_obj = Transcriptome.from_reference("../Mouse/gencode.vM36.annotation.gtf", file_format='auto')
id_map=isotools_obj.add_sample_from_csv(
f"…/aggregated_counts_by_sample.csv", # A file based on a Seurat object
transcripts_file="Batch001.collapsed.gff", # A file generated by the Iso-Seq pipeline - https://isoseq.how/classification/isoseq-collapse.html
transcript_id_col='transcript_id',
reconstruct_genes=False,
sep=','
)
TypeError: _read_gff_file() got an unexpected keyword argument 'infer_genes'
I then converted the .gff file into a .gtf file with the command gffread Batch001.collapsed.gff -T -o output.collapsed.gtf, re-ran add_sample_from_csv() using the .gtf file, and received this error:
AttributeError: 'Series' object has no attribute 'chr'
I would be very thankful if you could help me figure out how to integrate these tools. Below you can find the heads of my .csv, .gff and .gtf files.
Once again, thank you very much for developing and maintaining this valuable tool.
Roy
By the way, the link to the files used in the Transcriptome Import tutorial does not work for me.
CSV file:
transcript_id,LPS_Female_sum_coverage,LPS_Male_sum_coverage,Salmonella_Female_sum_coverage,Salmonella_Male_sum_coverage,Steady_State_Female_sum_coverage,Steady_State_Male_sum_coverage
PB.100005.1,5,4,2,10,4,0
PB.100005.100,73,78,243,418,148,252
PB.100005.103,0,1,1,1,1,2
PB.100005.104,0,1,0,2,0,1
PB.100005.107,0,0,0,0,1,2
PB.100005.108,1,0,0,2,0,0
PB.100005.115,0,1,0,0,0,0
PB.100005.116,0,0,0,1,0,0
PB.100005.120,0,0,0,0,0,0
GFF file:
##pacbio-collapse-version 1.0
##date Mon Sep 9 06:22:33 2024 UTC
chr1 PacBio transcript 3905348 3905724 . - . gene_id "PB.1"; transcript_id "PB.1.1";
chr1 PacBio exon 3905348 3905724 . - . gene_id "PB.1"; transcript_id "PB.1.1";
chr1 PacBio transcript 3911074 3911560 . - . gene_id "PB.2"; transcript_id "PB.2.1";
chr1 PacBio exon 3911074 3911560 . - . gene_id "PB.2"; transcript_id "PB.2.1";
chr1 PacBio transcript 3966446 3966898 . - . gene_id "PB.3"; transcript_id "PB.3.1";
chr1 PacBio exon 3966446 3966898 . - . gene_id "PB.3"; transcript_id "PB.3.1";
chr1 PacBio transcript 3971921 3972757 . - . gene_id "PB.4"; transcript_id "PB.4.1";
chr1 PacBio exon 3971921 3972757 . - . gene_id "PB.4"; transcript_id "PB.4.1";
GTF file:
chr1 PacBio transcript 3250396 3250740 . + . transcript_id "PB.7.1"; gene_id "PB.7"
chr1 PacBio exon 3250396 3250740 . + . transcript_id "PB.7.1"; gene_id "PB.7";
chr1 PacBio transcript 3905348 3905724 . - . transcript_id "PB.1.1"; gene_id "PB.1"
chr1 PacBio exon 3905348 3905724 . - . transcript_id "PB.1.1"; gene_id "PB.1";
chr1 PacBio transcript 3911074 3911560 . - . transcript_id "PB.2.1"; gene_id "PB.2"
chr1 PacBio exon 3911074 3911560 . - . transcript_id "PB.2.1"; gene_id "PB.2";
chr1 PacBio transcript 3966446 3966898 . - . transcript_id "PB.3.1"; gene_id "PB.3"
chr1 PacBio exon 3966446 3966898 . - . transcript_id "PB.3.1"; gene_id "PB.3";
chr1 PacBio transcript 3971921 3972757 . - . transcript_id "PB.4.1"; gene_id "PB.4"
chr1 PacBio exon 3971921 3972757 . - . transcript_id "PB.4.1"; gene_id "PB.4";
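In case it helps with reproducing the problem, here is a minimal sketch of a sanity check that is independent of IsoTools: it compares the transcript IDs in the count table with those in the annotation file, using only the Python standard library. The helper names `ids_from_annotation` and `ids_from_counts` are hypothetical and not part of any library. The inline samples are the first rows of the file heads above, with the annotation assumed to be tab-delimited and the CSV reduced to its first count column. Since the heads cover different transcripts, no overlap is expected in this toy run; on the full files the same check would show whether every counted transcript is annotated.

```python
import csv
import io
import re

# Matches GTF/GFF2-style attributes such as: transcript_id "PB.1.1"
TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def ids_from_annotation(lines):
    """Collect transcript IDs from GTF/GFF2-style annotation lines."""
    ids = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        match = TRANSCRIPT_ID_RE.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def ids_from_counts(handle, id_col="transcript_id"):
    """Collect transcript IDs from the ID column of a CSV count table."""
    return {row[id_col] for row in csv.DictReader(handle)}


# First rows of the file heads shown above (tab-delimited annotation assumed).
gff_text = (
    "##pacbio-collapse-version 1.0\n"
    'chr1\tPacBio\ttranscript\t3905348\t3905724\t.\t-\t.\tgene_id "PB.1"; transcript_id "PB.1.1";\n'
    'chr1\tPacBio\texon\t3905348\t3905724\t.\t-\t.\tgene_id "PB.1"; transcript_id "PB.1.1";\n'
    'chr1\tPacBio\ttranscript\t3911074\t3911560\t.\t-\t.\tgene_id "PB.2"; transcript_id "PB.2.1";\n'
)
csv_text = (
    "transcript_id,LPS_Female_sum_coverage\n"
    "PB.100005.1,5\n"
    "PB.100005.100,73\n"
)

annotation_ids = ids_from_annotation(gff_text.splitlines())
count_ids = ids_from_counts(io.StringIO(csv_text))

shared = count_ids & annotation_ids   # counted and annotated
missing = count_ids - annotation_ids  # counted but absent from the annotation

print(f"{len(shared)} shared, {len(missing)} missing from annotation")
print(sorted(missing))
```

For the real files, the two helpers can be given open file handles instead of the inline strings, for example `ids_from_annotation(open("Batch001.collapsed.gff"))`.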