Merge branch 'release/0.9.10'
dputhier committed Jan 14, 2019
2 parents 4750f7a + 8db0025 commit b601ba3
Showing 102 changed files with 1,573 additions and 804 deletions.
6 changes: 3 additions & 3 deletions Makefile
Original file line number Diff line number Diff line change
@@ -121,14 +121,14 @@ test_para_travis: $(OUTPUT4)

clean:
@make bats_cmd CMD=clean
@git checkout docs/source/conf.py pygtftk/version.py; rm -rf expected_s* ids* diff_fasta.py chr1_hg38_10M.fa* observed_s* order_fasta.py simple* control_list_reference.txt control_list_data.txt add_attr_to_pos.tab test.py pygtftk.egg-info build airway_love.txt* ENCFF630HEX_Total_RNAseq_K562_count_mini.txt STDIN.e* closest_1.tsv STDIN.o* dist cmd_list.txt example_list.txt tmp_list.txt simple.chromInfo prgm_list.txt test_list.txt *.bats *.completed *mini_real* heatmap_* tx_classes* *~ \#* hh profile_* toto tott; cd docs/; make clean; cd ..; find . -type f -name '*~' -exec rm -f '{}' \;
@git checkout docs/source/conf.py pygtftk/version.py; rm -rf mk_matrix_6 expected_s* ids* diff_fasta.py chr1_hg38_10M.fa* observed_s* order_fasta.py simple* control_list_reference.txt control_list_data.txt add_attr_to_pos.tab test.py pygtftk.egg-info build airway_love.txt* ENCFF630HEX_Total_RNAseq_K562_count_mini.txt STDIN.e* closest_1.tsv STDIN.o* dist cmd_list.txt example_list.txt tmp_list.txt simple.chromInfo prgm_list.txt test_list.txt *.bats *.completed *mini_real* heatmap_* tx_classes* *~ \#* hh profile_* toto tott; cd docs/; make clean; cd ..; find . -type f -name '*~' -exec rm -f '{}' \;

check_cmd_has_example:
@for i in $$(gtftk -l); do if grep -q "^$$i" docs/source/presentation.rst; then echo "" >/dev/null; else echo $$i; fi; done
@for i in $$(gtftk -l); do if grep -q "^$$i" docs/source/*.rst; then echo "" >/dev/null; else echo $$i; fi; done

check_example_has_cmd:
@gtftk -l > cmd_list.txt
@grep "~~" -B 1 docs/source/presentation.rst | grep -v "^$$" | grep -v " " |grep -v "^~~" | grep -v "^\-\-" > example_list.txt
@grep "~~" -B 1 docs/source/*.rst | grep -v "^$$" | grep -v " " |grep -v "^~~" | grep -v "^\-\-" > example_list.txt
@for i in $$(cat example_list.txt); do if $$(cat cmd_list.txt | grep -q "^$$i") ; then echo "" >/dev/null; else echo $$i; fi; done
@#rm -f cmd_list.txt example_list.txt tmp_list.txt

45 changes: 32 additions & 13 deletions README.rst
@@ -18,15 +18,22 @@
.. image:: https://travis-ci.org/dputhier/pygtftk.svg?branch=master
:target: https://travis-ci.org/dputhier/pygtftk

.. image:: https://img.shields.io/github/repo-size/badges/shields.svg
:target: https://travis-ci.org/dputhier/pygtftk

.. image:: https://img.shields.io/conda/dn/:channel/:package.svg
:target: https://github.com/dputhier/pygtftk



.. highlight-language: shell
Python GTF toolkit (pygtftk)
=============================


The **Python GTF toolkit (pygtftk) package** is intended to ease handling of GTF (Gene Transfer Format) files. The pygtftk package is compatible with Python >=3.5,<3.7 and relies on **libgtftk**, a library of functions **written in C**.
The **Python GTF toolkit (pygtftk) package** is intended to ease handling of GTF/GFF2.0 files (Gene Transfer Format). It does not currently support the GFF3 file format. The pygtftk package is compatible with Python >=3.5,<3.7 and relies on **libgtftk**, a library of functions **written in C**.

The package comes with a set of **UNIX commands** that can be accessed through the **gtftk program**. The gtftk program proposes several atomic tools to filter, convert, or extract data from GTF files. The gtftk set of Unix commands can be easily extended using a basic plugin architecture. All these aspects are covered in the help sections.

@@ -35,16 +42,19 @@ While the gtftk Unix program comes with hundreds of unitary and functional tests
System requirements
--------------------

Depending on the **size of the GTF file**, pygtftk and gtftk may require a lot of memory to perform selected tasks. A computer with 16 GB of memory is recommended in order to be able to pipe several commands when working with human annotations from an Ensembl release (e.g. 91).
Depending on the **size of the GTF file**, pygtftk and gtftk may require a lot of memory to perform selected tasks. A computer with 16 GB of memory is recommended in order to be able to pipe several commands when working with human annotations from an Ensembl release (e.g. 91). When working on a cluster, remember to reserve sufficient memory.

At the moment, the gtftk program has been tested on:

- Linux (Ubuntu 12.04 and 18.04)
- OSX (Yosemite, El Capitan).
- OSX (Yosemite, El Capitan, Mojave).


Installation
-------------

Installation through conda package building
--------------------------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Installation through **conda** should be the **preferred install solution**. The pygtftk package and gtftk command line tool require external dependencies with some version constraints.

@@ -61,7 +71,7 @@ Then you can simply install pygtftk in its own isolated environment and activate


Installation through setup.py
------------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is not the preferred way to install pygtftk. Choose conda whenever possible. We have observed several issues with dependencies that still need to be fixed. ::

@@ -73,15 +83,15 @@ This is not the prefered way for installation. Choose conda whenever possible. W


Installation through pip
-------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Prerequisites**

Prerequisites
~~~~~~~~~~~~~~

Again, this is not the preferred way to install pygtftk. Please choose conda whenever possible. We have observed several issues with dependencies that still need to be fixed.

Running pip
~~~~~~~~~~~~~
**Running pip**


Installation through pip can be done as follows. ::

@@ -93,8 +103,17 @@ Installation through pip can be done as follow. ::
gtftk -h



Documentation
--------------

Documentation for the latest release is built automatically and is available on the `Read the Docs server <https://pygtftk.readthedocs.io/en/master/>`_.

Testing
--------

Running functional tests
-------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many functional tests have been developed to check that commands produce the expected results. This does not rule out bugs elsewhere in the code. To check that your installation works, you can run these functional tests. The definitions of all functional tests declared in gtftk commands are accessible using the -p/--plugin-tests argument: ::

@@ -124,7 +143,7 @@ Or run tests in parallel using: ::

Running unitary tests
----------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Several unit tests have been implemented as doctests. You can run them with nose using the following command line: ::

24 changes: 24 additions & 0 deletions changelog.md
@@ -1,5 +1,29 @@
# Changelog

## v0.9.10

### Bug Fixes

- argparse was listed as a dependency even though it is part of the Python 3 standard library. As a result, pygtftk was installed with an older version of argparse.
- Fixed gene sorting in tss_dict to ensure reproducible results.
- Fixed a problem with retrieve() when used from interpreter (#45).


### API Changes

- BED files in bed3/4/5 format are now converted to bed6 automatically.
- The select_by_numeric() function has been renamed eval_numeric().
- It is now possible to index the GTF with a numpy array of booleans (i.e. using the indexing function).
- The prepare_gffutils_db() function allows one to create a db for gffutils while selecting features and attributes.
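
As an illustration of the bed3/4/5 to bed6 conversion listed above, here is a minimal sketch of padding a record to six columns. It is not pygtftk's actual implementation: the function name `to_bed6` and the default values for the name, score and strand columns are assumptions chosen for the example.

```python
def to_bed6(fields, default_name=".", default_score="0", default_strand="."):
    """Pad a BED3/4/5 record (a list of column strings) to six columns.

    Hypothetical helper for illustration only; not part of pygtftk.
    """
    if not 3 <= len(fields) <= 6:
        raise ValueError("expected 3 to 6 columns, got %d" % len(fields))
    # Columns 4 to 6 are name, score and strand; append only the missing ones.
    defaults = [default_name, default_score, default_strand]
    return list(fields) + defaults[len(fields) - 3:]


if __name__ == "__main__":
    print("\t".join(to_bed6(["chr1", "10", "20"])))
```

Records that already have six columns are returned unchanged, so the helper can be applied to every line of a mixed input.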

### Code changes

- The argformatter module was refactored. A FormattedFile(argparse.FileType) class was developed that tests file extension and content (at least for bed).
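
The general idea of an `argparse.FileType` subclass that validates its argument before opening it can be sketched as follows. This is an assumption-laden illustration, not the FormattedFile class from pygtftk: the class name `ExtensionCheckedFile` and its `extensions` parameter are hypothetical, and only the file extension is checked here, not the content.

```python
import argparse
import os


class ExtensionCheckedFile(argparse.FileType):
    """Hypothetical FileType subclass that validates the file extension."""

    def __init__(self, mode="r", extensions=("bed",)):
        super().__init__(mode)
        self.extensions = tuple(ext.lower() for ext in extensions)

    def __call__(self, string):
        # "-" means stdin/stdout for FileType, so there is no extension to check.
        if string != "-":
            ext = os.path.splitext(string)[1].lstrip(".").lower()
            if ext not in self.extensions:
                raise argparse.ArgumentTypeError(
                    "%r: expected one of these extensions: %s"
                    % (string, ", ".join(self.extensions))
                )
        # Delegate the actual opening of the file to argparse.FileType.
        return super().__call__(string)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-r", "--region-file", type=ExtensionCheckedFile("r"))
    print(parser.parse_args())
```

Raising `argparse.ArgumentTypeError` lets argparse report the problem as a normal usage error instead of a traceback.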

### New Features



## v0.9.9

### Bug Fixes
54 changes: 27 additions & 27 deletions conda/env.yaml
@@ -1,33 +1,33 @@
name: pygtftk

channels:
- conda-forge
- bioconda
- defaults
- conda-forge
- bioconda
- defaults

dependencies:
- python >=3.5,<3.7
- GitPython >=2.1.8
- python >=3.6
- pyparsing
- argparse
- bedtools ==2.27.1
- cloudpickle >=0.4.0
- ftputil >=3.3.1,<4.0.0
- pandas >=0.23.3
- scipy >=1.1.0
- pybedtools >=0.7.8
- nose
- bats
- pyyaml >=3.12
- gcc >=4.8.5
- requests >=2.13.0
- cffi >=1.10.0
- pyparsing >=2.2.0
- biopython >=1.69
- zlib >=1.2.11
- matplotlib >=2.0.2
- plotnine >=0.4.0
- pyBigWig >=0.3.12
- future
- python >=3.5,<3.7
- GitPython >=2.1.8
- python >=3.6
- pyparsing
- bedtools ==2.27.1
- cloudpickle >=0.4.0
- ftputil >=3.3.1,<4.0.0
- pandas >=0.23.3
- scipy >=1.1.0
- pybedtools >=0.7.8
- nose
- bats
- pyyaml >=3.12
- gcc >=4.8.5
- requests >=2.13.0
- cffi >=1.10.0
- pyparsing >=2.2.0
- biopython >=1.69
- zlib >=1.2.11
- matplotlib >=2.0.2
- plotnine >=0.4.0
- pyBigWig >=0.3.12
- future
- nose

60 changes: 30 additions & 30 deletions conda/meta.yaml
@@ -1,4 +1,4 @@
{% set version = "0.9.9" %}
{% set version = "0.9.10" %}

package:
name: pygtftk
@@ -14,42 +14,42 @@ build:

requirements:
build:
- {{ compiler('c') }}
- {{ compiler('c') }}
host:
- python
- pip
- setuptools
- zlib
- python
- pip
- setuptools
- zlib
run:
- python >=3.5,<3.7
- argparse
- bedtools >=2.23
- cloudpickle >=0.4.0
- ftputil >=3.3.1,<4.0.0
- numpy
- pandas >=0.23.3
- scipy >=1.1.0
- pybedtools >=0.7.8
- pybigwig >=0.3
- pyyaml >=3.12
- requests >=2.13.0
- cffi >=1.10.0
- pyparsing >=2.2.0
- biopython >=1.69
- matplotlib >=2.0.2
- plotnine >=0.4.0
- future
- python >=3.5,<3.7
- bedtools ==2.23
- cloudpickle >=0.4.0
- ftputil >=3.3.1,<4.0.0
- numpy
- pandas >=0.23.3
- scipy >=1.1.0
- pybedtools >=0.7.8
- pybigwig >=0.3
- pyyaml >=3.12
- requests >=2.13.0
- cffi >=1.10.0
- pyparsing >=2.2.0
- biopython >=1.69
- matplotlib >=2.0.2
- plotnine >=0.4.0
- future
- nose

test:
imports:
- pygtftk
- pygtftk
commands:
- gtftk -h
- gtftk -h
requires:
- nose
- bats
- unzip
- perl
- nose
- bats
- unzip
- perl

about:
home: http://github.com/dputhier/pygtftk
22 changes: 0 additions & 22 deletions docs/source/annotation.rst
@@ -2,28 +2,6 @@ Commands from section 'annotation'
------------------------------------


closest_gn_to_feat
~~~~~~~~~~~~~~~~~~~~~~

**Description:** Find the n closest genes/transcripts for each peak (or the opposite).

**Example:** Find the closest tss to a set of peaks

.. command-output:: gtftk closest_gn_to_feat -t tss -r simple_peaks.bed6 -i simple.gtf -c simple.chromInfo -p 10 -K toto -n transcript_id,gene_id
:shell:

**Example:** Find the closest tss to a set of peaks. Use the gene-centric and uncollapsed output.

.. command-output:: gtftk closest_gn_to_feat -t tss -r simple_peaks.bed6 -i simple.gtf -c simple.chromInfo -p 10 -K toto -n transcript_id,gene_id -gu
:shell:


**Arguments:**

.. command-output:: gtftk closest_gn_to_feat -h
:shell:


closest_genes
~~~~~~~~~~~~~~~~~~~~~~

4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -55,10 +55,10 @@
# built documents.
#
# The short X.Y version.
version = u'0.9.9'
version = u'0.9.10'

# The full version, including alpha/beta/rc tags.
release = u'0.9.9'
release = u'0.9.10'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
Binary file modified docs/source/example_01.png
Binary file modified docs/source/example_01b.png
Binary file modified docs/source/example_02.png
Binary file modified docs/source/example_05.png
Binary file modified docs/source/example_06.png
Binary file modified docs/source/example_06b.png
Binary file modified docs/source/example_07.png
Binary file modified docs/source/example_08.png
Binary file modified docs/source/example_13.png
2 changes: 1 addition & 1 deletion docs/source/index.rst
@@ -1,7 +1,7 @@
Welcome to pygtftk documentation page
--------------------------------------

The **Python GTF toolkit (pygtftk) package** is intended to ease handling of GTF files (Gene Transfer Format). The pygtftk package is compatible with Python >=3.5,<3.7 and relies on **libgtftk**, a library of functions **written in C**.
The **Python GTF toolkit (pygtftk) package** is intended to ease handling of GTF/GFF2.0 files (Gene Transfer Format). It does not currently support the GFF3 file format. The pygtftk package is compatible with Python >=3.5,<3.7 and relies on **libgtftk**, a library of functions **written in C**.

The package comes with a set of **UNIX commands** that can be accessed through the **gtftk main Unix program**. The gtftk program exposes several subcommands that can be piped, for instance, to filter, convert, extract or delete data from GTF files. The gtftk set of Unix commands can be easily extended using a basic plugin architecture. All these aspects are covered in the help section.
