Releases: uclahs-cds/package-moPepGen

v0.4.1

28 Apr 20:20
37f7dc7

Changed

  • Fixed an issue in summarizeFasta output where the order of variant sources within the same group was not consistent across runs. #428
  • Added the argument --ignore-missing-source to summarizeFasta so that sources not present in any GVF can be ignored without raising an error. #436
  • In filterFasta, when filtering with an expression table, peptides are now filtered out if smaller than, rather than smaller than or equal to, the value of --quant-cutoff.
  • Fixed an issue in splitFasta where variant sources were not grouped as specified by --group-source. #439
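
The boundary change for --quant-cutoff can be illustrated with a minimal sketch. The function name and the peptide-to-value mapping are hypothetical and are not moPepGen's actual code; the sketch only assumes that each peptide carries one numeric value taken from the expression table.

```python
from typing import Dict, List


def filter_by_quant(peptide_quant: Dict[str, float], quant_cutoff: float) -> List[str]:
    """Hypothetical helper: keep peptides whose value is not below the cutoff.

    A peptide is dropped only when its value is strictly smaller than the
    cutoff, so a value exactly equal to the cutoff is kept.
    """
    return [pep for pep, quant in peptide_quant.items() if quant >= quant_cutoff]


kept = filter_by_quant({"PEPA": 1.0, "PEPB": 2.0, "PEPC": 3.0}, quant_cutoff=2.0)
```

Under the previous behavior described above, the peptide sitting exactly at the cutoff would also have been removed.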

Added

  • Resource usage, including memory, CPU, and time, is now printed to stdout at the end of every command line program.
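
As a rough illustration of this kind of report, the sketch below collects wall time, CPU time, and peak memory with the Python standard library. It is not moPepGen's implementation: the function name and the output format are assumptions, and the `resource` module is only available on Unix-like systems.

```python
import resource
import sys
import time


def format_resource_usage(start_wall: float, start_cpu: float) -> str:
    """Hypothetical helper: summarize resources used since the given start times."""
    wall = time.perf_counter() - start_wall
    cpu = time.process_time() - start_cpu
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 ** 2 if sys.platform == "darwin" else 1024
    peak_mib = peak / divisor
    return (
        f"wall time: {wall:.2f}s, CPU time: {cpu:.2f}s, "
        f"peak memory: {peak_mib:.1f} MiB"
    )


start_wall, start_cpu = time.perf_counter(), time.process_time()
_ = sum(i * i for i in range(100_000))  # stand-in for the real work
print(format_resource_usage(start_wall, start_cpu))
```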

Fixed

  • Fixed an issue where --additional-split was not recognized properly in splitFasta. #443

v0.4.0

18 Mar 04:01
84a28f9

Added

  • Added CLI command summarizeFasta to output a summary table of the variant peptide FASTA file output by callVariant.

Changed

  • The attribute key for transcript ID is changed from 'TRANSCRIPT' to 'TRANSCRIPT_ID' in the circRNA GVF files output by parseCIRCExplorer, so that it matches other GVF files.

  • The genomic position of each record is now included in the GVF file output by parseCIRCExplorer.

v0.3.1

01 Mar 17:38
99ea45d

This is a patch update to 0.3.0 that fixes issue #416.

Changed

  • The argument --decoy_string_position is renamed to --decoy-string-position. #417

v0.3.0

25 Feb 17:45
830cd37

Added

  • Enabled filterFasta to filter by number of miscleavages per peptide. #382

  • Added CLI command mergeFasta to merge multiple variant peptide database FASTA files into one. This could be useful when working with multiplexed proteomic experiments such as TMT. #380

  • Added CLI command decoyFasta to generate decoy database by shuffling or reversing each sequence. #386

  • Added parameter --min-coverage-rna to parseREDItools to filter by total RNA reads at a given position. #392

  • Added CLI command encodeFasta to replace the variant peptide headers with UUIDs. The original FASTA headers are stored in a text file together with the UUIDs. This is to make the FASTA header short enough for library search engines. #389
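
The header replacement described for encodeFasta can be sketched as follows. This is a minimal illustration under stated assumptions, not the command's actual code: the function name, the in-memory list of (header, sequence) pairs, and the use of random version 4 UUIDs are all hypothetical.

```python
import uuid
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (FASTA header, peptide sequence)


def encode_headers(records: List[Record]) -> Tuple[List[Record], Dict[str, str]]:
    """Hypothetical helper: swap each header for a UUID and keep a lookup table."""
    encoded: List[Record] = []
    mapping: Dict[str, str] = {}
    for header, sequence in records:
        key = str(uuid.uuid4())  # random identifier; always 36 characters long
        mapping[key] = header    # lets the original header be recovered later
        encoded.append((key, sequence))
    return encoded, mapping


encoded, mapping = encode_headers([("a very long variant peptide header", "PEPTIDEK")])
```

The mapping plays the role of the text file mentioned above: after a library search reports a UUID, the original header is looked up from it.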

Changed

  • Donor and acceptor transcript IDs are now explicitly included in the variant IDs of fusions in both GVFs and variant peptide FASTA headers. Closed #376 via #377

  • For fusions, callVariant now looks at the entire acceptor sequence for potential variant peptides, rather than only the peptides that contain the breakpoint. #377

  • filterFasta updated to support filtering by number of miscleavages. #383

  • In parseVEP, chromosome seqname for each record is now read directly from the gene annotation, to avoid the 'chr' prefix issue. #391

  • The --transcript-id-column parameter of parseREDItools now takes a 1-based index. #392

  • Renamed splitDatabase to splitFasta for consistency. #397

  • Updated generateIndex to reduce the size of genomic annotation data and the memory usage when loaded. #395

v0.2.0

28 Jan 18:44
af36ef0

This is the first unstable release of moPepGen, the graph-based multi-omics peptide generator. Below is what has changed since v0.1.0-beta.1.

Added

  • Multi-threading is enabled for callVariant to run in parallel.

  • CLI command indexGVF added to generate an index file for quick access to variant data in the corresponding GVF file. Note that running this command is optional.

Changed

  • To handle the complexity of subgraphs introduced by fusions and, especially, alternative splicing insertions and substitutions, the SubgraphTree class is added to keep the graph-subgraph relationship between nodes.

  • Variant records are now kept on disk rather than reading the entire GVF file(s) into memory, and only the file pointers to variant records are kept in memory. This significantly reduces the memory usage of callVariant.

  • The command line arguments are standardized across all commands, for example '-i/--input-path' for inputs and '-o/--output-path' for outputs.
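
The idea of keeping only file pointers in memory can be sketched with plain byte offsets. This is an illustrative sketch, not the implementation used by callVariant: the function names are hypothetical, and it assumes one record per line with header lines starting with '#'.

```python
import io
from typing import BinaryIO, List


def index_record_offsets(handle: BinaryIO) -> List[int]:
    """Hypothetical helper: record the byte offset at which each record starts."""
    offsets: List[int] = []
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line:
            break  # end of file
        if line.startswith(b"#") or not line.strip():
            continue  # skip header and blank lines
        offsets.append(position)
    return offsets


def read_record(handle: BinaryIO, offset: int) -> str:
    """Jump to a stored offset and read back a single record line."""
    handle.seek(offset)
    return handle.readline().decode().rstrip("\n")


# An in-memory stand-in for a file on disk, with made-up content.
demo = io.BytesIO(b"##header\nrecord-1\nrecord-2\n")
offsets = index_record_offsets(demo)
second = read_record(demo, offsets[1])
```

Only the list of integers stays resident, so memory use grows with the number of records rather than with their size.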

v0.1.0-beta.1

23 Dec 09:48
835230a
Pre-release

The first beta release for moPepGen includes:

  • Graph-based data structure and algorithm for calling noncanonical peptides caused by genomic and transcriptional variants.
  • Command line interface that parses genomic/transcriptional variant results into GVF, calls variant or noncoding peptides, splits the database, and filters FASTA files.
  • A util package that is not intended for general use but is handy for development.