Merged
34 commits
5dda027
Fix plot_str_var_number to handle the case when group_name is the pre…
YalanBi Mar 19, 2025
5a61ea0
Fix feature_id extraction logic in _read_gtf_file to correctly identi…
YalanBi Mar 19, 2025
9149f7a
Fix write_fasta to handle empty sequences and update parameter descri…
YalanBi Mar 24, 2025
c633990
update version, changelog and metadata
YalanBi Apr 24, 2025
bf24d5d
Enhance write_fasta to include optional coverage information in fasta…
YalanBi Apr 24, 2025
107d5dd
Refactor sashimi_plot docstring for clarity and fix indentation in tr…
YalanBi Apr 24, 2025
3c3c64f
Remove comments
YalanBi Apr 24, 2025
7575504
Fix changelog formatting in version 2.0.0
YalanBi Apr 24, 2025
cfd4a0c
Update Python environment versions in tox.ini
YalanBi Apr 24, 2025
946c4a8
Update python_requires to require Python 3.8 or higher
YalanBi Apr 24, 2025
7c63ae5
Update Python version support in GitHub Actions and Tox configuration
YalanBi Apr 24, 2025
5f2f666
Fix typo in envlist formatting in tox.ini
YalanBi Apr 24, 2025
020be91
Update Python version requirements to 3.10 to be compatiable with Typ…
YalanBi Apr 24, 2025
a6c77e8
remove Python version 3.8 and 3.9 in GitHub Actions
YalanBi Apr 24, 2025
2e825e1
formatting using black
YalanBi Apr 26, 2025
c51aa30
improve formatting
YalanBi Apr 26, 2025
08ca531
improve formatting
YalanBi Apr 27, 2025
89e3859
improve formatting
YalanBi Apr 27, 2025
cdbbbe1
Merge branch 'master' into feature/release_2.0
YalanBi Apr 27, 2025
bfdc41e
improve formatting
YalanBi Apr 27, 2025
fe34fdc
Merge branch 'master' into feature/release_2.0
YalanBi Apr 27, 2025
d4d73a9
update pull request trigger to include all branches
YalanBi Apr 27, 2025
2c0fce4
restrict dependency review action to pull requests targeting the mast…
YalanBi Apr 27, 2025
69815fe
add GitHub Actions workflow for Python linting, and update tox.ini to…
YalanBi Apr 29, 2025
6d8db81
refactor: replace mutable default arguments with None in function def…
YalanBi May 6, 2025
b158542
docs: update citation and feedback section in README.md
YalanBi May 6, 2025
bd0c5ee
fix: update download link for demonstration data in notebook
YalanBi May 6, 2025
9da419e
fix: improve error handling in genomic_position function and validate…
YalanBi May 6, 2025
76c2144
fix: improve readability in genomic_position function by unpacking ex…
YalanBi May 6, 2025
4ebe0b7
fix: simplify loop structure in genomic_position function for better …
YalanBi May 6, 2025
c290e0d
fix: update linting workflow comments and permissions for clarity
YalanBi May 6, 2025
e08e469
Apply linting fixes
actions-user May 6, 2025
d3474e7
fix: update lint workflow steps for clarity and consistency
YalanBi May 6, 2025
cd84b0f
fix: enhance branch name determination logic in lint workflow
YalanBi May 6, 2025
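
Commit 6d8db81 above replaces mutable default arguments with `None`. The functions it touches are not shown in this excerpt, so the sketch below uses hypothetical names (`collect_tags_buggy`, `collect_tags`) purely to illustrate the general Python pitfall and the usual fix. It is an assumption about the pattern, not a copy of the IsoTools code.

```python
from typing import List, Optional


def collect_tags_buggy(tag: str, tags: List[str] = []) -> List[str]:
    # Hypothetical example: the default list is created once, at function
    # definition time, so every call without `tags` shares the same list.
    tags.append(tag)
    return tags


def collect_tags(tag: str, tags: Optional[List[str]] = None) -> List[str]:
    # Hypothetical example of the fix: use None as a sentinel and build a
    # fresh list inside the function body on each call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags
```

With the `None` sentinel, two consecutive calls without an explicit `tags` argument return independent lists, whereas the buggy variant keeps accumulating entries across calls.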
11 changes: 3 additions & 8 deletions .github/workflows/dependency-review.yml
@@ -10,14 +10,9 @@
name: 'Dependency review'
on:
pull_request:
branches: [ "master" ]
branches:
- master # Triggers only on pull requests targeting the master branch

# If using a dependency submission action in this workflow this permission will need to be set to:
#
# permissions:
# contents: write
#
# https://docs.github.com/en/enterprise-cloud@latest/code-security/supply-chain-security/understanding-your-software-supply-chain/using-the-dependency-submission-api
permissions:
contents: read
# Write permissions for pull-requests are required for using the `comment-summary-in-pr` option, comment out if you aren't using this option
@@ -36,4 +31,4 @@ jobs:
comment-summary-in-pr: always
# fail-on-severity: moderate
# deny-licenses: GPL-1.0-or-later, LGPL-2.0-or-later
# retry-on-snapshot-warnings: true
# retry-on-snapshot-warnings: true
53 changes: 53 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,53 @@
# This is a GitHub Actions workflow file for linting Python code using Tox and Flake8.
name: Lint

on:
push:
branches:
- '**' # run on every push to any branch
pull_request:
branches:
- master # run on pull requests targeting the master branch

permissions:
contents: write

jobs:
lint:
runs-on: ubuntu-latest
steps:
# Step 1: Checkout the repository
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0 # Fetch the full history to allow branch checkout

# Step 2: Determine the branch name
- name: Get branch name
id: vars
run: |
if [ "${{ github.event_name }}" = "pull_request" ]; then
echo "BRANCH_NAME=${{ github.head_ref }}" >> $GITHUB_ENV
else
echo "BRANCH_NAME=$(echo ${GITHUB_REF#refs/heads/})" >> $GITHUB_ENV
fi

# Step 3: Install dependencies
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install tox

# Step 4: Run Tox for Linting
- name: Run Tox Lint
run: tox -e flake8

# Step 5: Commit and push changes if Black reformats files
- name: Commit and push changes
run: |
git config --global user.name "GitHub Actions"
git config --global user.email "[email protected]"
git checkout $BRANCH_NAME # Check out the branch
git add .
git commit -m "Apply linting fixes" || echo "No changes to commit"
git push origin $BRANCH_NAME
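
The branch-name step in the workflow above picks `github.head_ref` for pull requests and otherwise strips the `refs/heads/` prefix from `GITHUB_REF`. As a rough illustration of that decision logic outside the shell, here is a minimal Python sketch. The function name `resolve_branch_name` and its parameters are hypothetical and are not part of the workflow or of any GitHub API.

```python
def resolve_branch_name(event_name: str, head_ref: str, ref: str) -> str:
    """Mirror the shell logic of the 'Get branch name' step (illustrative only)."""
    # For pull requests, the source branch is given directly by head_ref.
    if event_name == "pull_request":
        return head_ref
    # For other events, drop a leading 'refs/heads/' if present, like the
    # shell expansion ${GITHUB_REF#refs/heads/}; other refs pass through.
    prefix = "refs/heads/"
    if ref.startswith(prefix):
        return ref[len(prefix):]
    return ref
```

Note that, just like the shell expansion, a ref that does not start with `refs/heads/` (for example a tag ref) is returned unchanged, so a later `git checkout` of that value would need separate handling.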
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -10,7 +10,7 @@ jobs:
strategy:
matrix:
os: [ubuntu-latest, macos-latest]
python-version: ['3.7', '3.8', '3.9', '3.10']
python-version: ['3.10', '3.11', '3.12']

steps:
- uses: actions/checkout@v2
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -8,6 +8,6 @@ sphinx:

# Explicitly set the version of Python
python:
version: 3.8
version: 3.12
install:
- requirements: docs/requirements.txt
77 changes: 55 additions & 22 deletions CHANGELOG.md
@@ -1,24 +1,40 @@
# Change Log

## TODO: ideas, issues and planned extensions or changes that are not yet implemented

* optimize add_qc_metrics for run after new samples have been added - should not recompute everything
* planned new feature: during import of long reads, (optionally) correct for short exon alignment issues.
* planned new feature: during import of long reads, (optionally) correct for short exon alignment issues.
* separate new read import and classification of isoforms.

## [2.0.0]

* support the analysis of Oxford Nanopore data
* new option of TSS identification according to the reference annotation
* new gene model characteristics, simplex coordinates and relative entropy
* new visualisation using triangle plot
* fix bugs in external gtf import to reconstruct transcriptome
* improve the coordination analysis of TSS and PAS
* support the import of SQANTI QC report and filtering based on it
* support the export of positive and negative TSS for ML
* support proteogenomic approaches at the interface of transcriptomics and proteomics
* improve code readability and filter tag construction

## [0.3.5]

* fixed a bug in domain plots, which was introduced in 0.3.4
* fixed a bug in iter_genes/iter_transcripts with region='chr', and no positions specified
* new option in plot_domains to depict noncoding transcripts, controlled with the coding_only parameter

## [0.3.4]

* fixing #8: AssertionError when unifying TSS/PAS between transcripts
* improved domain plots: ORF start and end do not appear like exon exon boundaries.
* API change: separated ORF prediction from QC metrics calculation.
* improved domain plots: ORF start and end do not appear like exon exon boundaries.
* API change: separated ORF prediction from QC metrics calculation.
* new feature: count number of upstream start codons in Gene.add_orfs() (called by default when adding QC metrics to transcriptome)
* new feature: calculate Fickett testcode and hexamer score for longest ORFs, to separate coding and noncoding genes.

* new feature: calculate Fickett testcode and hexamer score for longest ORFs, to separate coding and noncoding genes.

## [0.3.3]

* fixed bug in filter_ref_transcripts with no query
* export gtf with long read transcripts as well uncovered as reference transcripts
* fix warning in plot_diff_results
@@ -28,11 +44,13 @@
* improved documentation: syntax highlighting, code style, additional explanations on filtering

## [0.3.2]

* restructured tutorials
* new feature: add domains to differential splicing result tables.
* new feature: min_coverage and max_coverage for iter_genes function.

## [0.3.1]

* new feature: add protein domains from 3 different sources and depict them with Gene.plot_domains()
* new feature: restrict gene and transcript iterators on list of genes of interest
* new feature: filter_transcripts function for genes
@@ -43,39 +61,45 @@
* order of events is now according to gene strand: A upstream of B

## [0.3.0]

* new feature: find longest ORF and infer NMD of lr transcripts (and annotation)
* new feature: allow for several TSS/PAS per intron chain and unify them across intron chains
* changed default parameter of filter_query in run_isotools script to "FSM or not (INTERNAL_PRIMING or RTTS)"

## [0.2.11.1]
* bugfix: KeyError during transcriptome reconstruction in _add_chimeric.

* bugfix: KeyError during transcriptome reconstruction in _add_chimeric.
* bugfix: default colors in plot_diff_results.

## [0.2.11]

* added function to import samples from csv/gtf to import transcriptome reconstruction / quantification from other tools.
* dropped requirement for gtf files to be tabix indexed.


## [0.2.10]

* fixed get_overlap - important for correct assignment of mono exonic genes to reference
* added parameter to control for minimal mapping quality in add_sample_from_bam. This allows for filtering out ambiguous reads, which have mapping quality of 0
* fixed plot_diff_result (Key error due to incorrect parsing of group names)
* New function estimate_tpm_threshold, to estimate the minimal abundance level of observable transcripts, given a sequencing depth.
* New function coordination_test, to test coordination of splicing events within a gene.
* New function estimate_tpm_threshold, to estimate the minimal abundance level of observable transcripts, given a sequencing depth.
* New function coordination_test, to test coordination of splicing events within a gene.
* Optional log or linear scale for the coverage axis in sashimi plots.

## [0.2.9]

* added DIE test
* adjusted classification of novel exonic TSS/PAS to ISM
* improved assignment of reference genes in case of equal number of matching splice sites to several reference genes.
* improved assignment of reference genes in case of equal number of matching splice sites to several reference genes.
* added parameter to control for minimal exonic overlap to reference genes in add_sample_from_bam.
* changed computation of direct repeats. Added wobble and max_mm parameters.
* exposed parameters to end user in the add_qc_metrics function.
* exposed parameters to end user in the add_qc_metrics function.
* added options for additional fields in gtf output
* improved options for graphical output with the command line script
* fixed plot_bar default color scheme

## [0.2.8]

* fix: version information lost when pickling reference.
* fix missing gene name
* added pt_size parameter to plot_embedding and plot_diff_results function
@@ -84,11 +108,13 @@


## [0.2.7]

* added command line script run_isotools.py
* added test data for unit tests
* added test data for unit tests


## [0.2.6]

* Added unit tests
* Fixed bug in novel splicing subcategory assignment
* new feature: rarefaction analysis
@@ -98,55 +124,63 @@
* added optional progress bar to iter_genes/transcripts

## [0.2.5]

* New feature: distinguish noncanonical and canonical novel splice sites for direct repeat hist
* New feature: option to drop partially aligned reads with the min_align_fraction parameter in add_sample_from_bam

## [0.2.4]

* New feature: added option to save read names during bam import
* new feature: gzip compressed gtf output

## [0.2.3]

* Changed assignment of transcripts to genes if no splice sites match.
* Fix: more flexible import of reference files, gene name not required (but id is), introducing "infer_genes" from exon entries of gtf files.
* New function: Transcriptome.remove_filter(filter=[tags])

## [0.2.2]

* Fix: export to gtf with filter features

## [0.2.1]

* Fix: import reference from gtf file
* New feature: Import multiple samples from single bam tagged by barcode (e.g. from single cell data)
* Fix: issue with zero base exons after shifting fuzzy junctions


## [0.2.0]

* restructure to meet PyPI recommendations
* New feature: isoseq.altsplice_test accepts more than 2 groups, and computes ML parameters for all groups

## [0.1.5]

* New feature: restrict tests on provided splice_types
* New feature: provide position to find given alternative splicing events

## [0.1.4]

* Fix: Issue with noncanonical splicing detection introduced in 0.1.3
* Fix: crash with secondary alignments in bam files during import.
* New feature: Report and skip if alignment outside chromosome (uLTRA issue)
* Fix: import of chimeric reads (secondary alignments have no SA tag)
* Fix: Transcripts per sample in sample table: During import count only used transcripts, do not count chimeric transcripts twice.
* Fix: Transcripts per sample in sample table: During import count only used transcripts, do not count chimeric transcripts twice.
* Change: sample_table reports chimeric_reads and nonchimeric_reads (instead of total_reads)
* Change: import of long read bam is more verbose in info mode
* Fix: Bug: import of chained chimeric alignments overwrites read coverage when merging to existing transcript
* Fix: remove_samples actually removes the samples from the sample_table
* Change: refactored add_biases to add_qc_metrics
* fix: property of transcripts included {sample_name:0}
* save the TSS and PAS positions
* New: use_satag parameter for add_sample_from_bam
* New: use_satag parameter for add_sample_from_bam
* Change: use median TSS/PAS (of all reads with same splice pattern) as transcript start/end (e.g. exons[0][0]/exons[-1][1])
* Fix: Novel exon skipping annotation now finds all exonic regions that are skipped.
* change: Default filter of FRAGMENTS now only tags reads that do not use a reference TSS or PAS

## [0.1.3]
* Fix: improved performance of noncanonical splicing detection by avoiding redundant lookups.

* Fix: improved performance of noncanonical splicing detection by avoiding redundant lookups.

## [0.1.2] - 2020-05-03

@@ -157,7 +191,6 @@
* New: Do not distinguish intronic/exonic novel splice sites. Report distance to shortest splice site of same type.
* Fix: Sashimi plots ignored mono exons


## [0.1.1] - 2020-04-12

* Fix: fixed bug in TSS/PAS events affecting start/end positions and known flag.
@@ -170,23 +203,23 @@
* moved examples in documentation

## [0.0.2] - 2020-03-22

* Change: refactored SpliceGraph to SegmentGraph to better comply with common terms in literature
* New: added a basic implementation of an actual SpliceGraph (as commonly defined in literature)
* New: added a basic implementation of an actual SpliceGraph (as commonly defined in literature)
* based on sorted dict
* not used so far, but maybe useful in importing the long read bam files since it can be extended easily
* New: added decorators "experimental" and "deprecated" to mark unsafe functions
* New: added decorators "experimental" and "deprecated" to mark unsafe functions
* Change: in differential splicing changed the alternative fraction, to match the common PSI (% spliced in) definition
* Change: narrowed definition of mutually exclusive exons: the alternatives now need to feature exactly one ME exon and rejoin at node C
* Change: for ME exons now the beginning of node C is returned as "end" of the splice bubble
* New: differential splicing result contains "novel", indicating that the alternative is in the annotation
* New: differential splicing result contains "novel", indicating that the alternative is in the annotation
* New: added alternative TSS/alternative PAS to the differential splicing test
* Change: removed obsolete weights from splice graph and added strand
* Change: unified parameters and column names of results of Transcriptome.find_splice_bubbles() and Transcriptome.altsplice_test()
* Fix: add_short_read_coverage broken if short reads are already there.

* Fix: add_short_read_coverage broken if short reads are already there.

## [0.0.1] - 2020-02-25

* first shared version
* New: added option to export alternative splicing events for MISO and rMATS
* New: added change log

6 changes: 4 additions & 2 deletions README.md
@@ -62,5 +62,7 @@ isoseq.save('../tests/data/example_1_isotools.pkl')
## Citation and feedback:

* If you run into any issues, please use the [github issues report feature](https://github.com/HerwigLab/IsoTools2/issues).
* For general feedback, please write me an email to [[email protected]](mailto:[email protected]).
* If you use IsoTools in your publication, please cite the following [paper](https://doi.org/10.1093/bioinformatics/btad364): Lienhard et al, Bioinformatics, 2023: IsoTools: a flexible workflow for long-read transcriptome sequencing analysis
* For general feedback, please write us an email to [[email protected]](mailto:[email protected]) and [[email protected]](mailto:[email protected]).
* If you use IsoTools in your publication, please cite the following paper in addition to this repository:
* Lienhard, Matthias et al. “IsoTools: a flexible workflow for long-read transcriptome sequencing analysis.” Bioinformatics (Oxford, England) vol. 39,6 (2023): btad364. [doi:10.1093/bioinformatics/btad364](https://doi.org/10.1093/bioinformatics/btad364)
* Bi, Yalan et al. “IsoTools 2.0: Software for Comprehensive Analysis of Long-read Transcriptome Sequencing Data.” Journal of molecular biology, 169049. 26 Feb. 2025, [doi:10.1016/j.jmb.2025.169049](https://doi.org/10.1016/j.jmb.2025.169049)
2 changes: 1 addition & 1 deletion VERSION.txt
@@ -1 +1 @@
0.3.5_rc11
2.0.0