Merge pull request #32 from tyo-nu/develop
Develop
jonstrutz11 committed Aug 11, 2021
2 parents 77b0c42 + c17db97 commit 04b64d2
Showing 3 changed files with 24 additions and 86 deletions.
85 changes: 15 additions & 70 deletions README.md
@@ -3,89 +3,34 @@
![Documentation](https://readthedocs.org/projects/mine-database/badge)

The MINE Database contains code for generating (through Pickaxe) and storing and retrieving compounds from a database.
Pickaxe applies reaction operators, representing reaction transformation patterns, to a list of user-specified compounds
in order to predict reactions.
Pickaxe applies reaction rules, representing reaction transformation patterns, to a list of user-specified compounds in order to predict reactions.

## Documentation
For general information on MINE Databases, please consult [JJeffryes et al. 2015.](http://jcheminf.springeropen.com/articles/10.1186/s13321-015-0087-1)
Documentation, hosted on read the docs at https://mine-database.readthedocs.io/en/develop/,
gives more detailed descriptions and examples uses of the software.
For general information on MINE Databases, please consult [JJeffryes et al. 2015.](http://jcheminf.springeropen.com/articles/10.1186/s13321-015-0087-1).

## Installation
MINE-Database requires the use of [rdkit](https://rdkit.org/), which currently is unavailable to install on pip. It is recommended to use the
conda environment installation.
Documentation is hosted at https://mine-database.readthedocs.io/en/latest/. It gives more detailed descriptions and example uses of the software.

## Conda Installation
MINE-Database can be installed via conda:
```
conda install -c condaforge mine-database
```
## Installation
MINE-Database requires the use of [rdkit](https://rdkit.org/), which currently is unavailable to install on pip. Thus, we recommend you use conda to create a new environment and then install rdkit into that environment before proceeding:

Alternatively MINE-Database can be installed from source:
```
conda env create --name minedatabase --file requirements.yml
```
`conda create -n mine`

## Pip and Conda Installation from Source
The majority of requirements can be installed through pip:
```
pip install -r requirements.txt
```
RDKit must still be installed using Anaconda
```
conda env create -c rdkit -n minedatabase rdkit
```
`conda activate mine`

`conda install -c rdkit rdkit`

<!-- ## Repository Structure
This repository primarily consists of the minedatabases python module
and its 8 submodules:
- compound_io: Contains functions to load and export chemical structures
from MINE databases. Has command-line interface.
- databases: Contains functions which impose a schema on the underlying
Mongo databases which store MINE data.
-filters.py: Contains filters to apply while generating reaction network
- pickaxe: Allows for the application of reaction rules to compound sets
and the annotation of the resulting compounds an reactions. Has command-line interface.
- queries: Contains logic for text and chemical structure queries of the
MINE database.
- reactions: Contains methods to apply reaction rules to compounds
- utils: Various utility functions such as hashing & type conversions -->
Then, use pip (in your conda environment) to install minedatabase:

<!-- ### Compound_io command-line usage
compound_io may be called independently to import or export chemical
strictures in MINE database format. These may be helpful for sharing
predictions or maintaining current external databases for cross-referencing.
The call format is `python compound_io.py import-<format> <input path>
<database>` for imports and `python compound_io.py export-<format>
<database> <outfile path> <optionally: maximum compounds per file>` for exports.
Valid formats are:
- smi: SMILES line-code
- mol: MDL molecule files (outputs individual files in specified directory)
- sdf: Structured Data File (concatenated mol files)
- tsv: FOR EXPORT ONLY, a tab separated file compatible with ModelSEED -->
`pip install minedatabase`
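
After both installs, it can help to confirm that the active environment actually sees the packages. The snippet below is a minimal sketch, not part of MINE-Database: the helper name `missing_packages` is hypothetical, and it only assumes that the import names are `rdkit` and `minedatabase`, as used elsewhere in this repository.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be located for import."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: rdkit (installed via conda) and minedatabase (installed via pip).
    missing = missing_packages(["rdkit", "minedatabase"])
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("rdkit and minedatabase are both importable.")
```

If `rdkit` is reported as missing, the most likely cause is that the `mine` conda environment is not the active one.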

## Running Pickaxe
### Running Pickaxe through a python file (recommended)
An example file, pickaxe_run.py, provides a framework for running pickaxe through a python file.
The starting compounds, rules and cofactors, optional database information, and Pickaxe run options are specified.
After running the results are stored in a specified database or written to .tsv files. This file is
explained in more detail in the [documentation](https://mine-database.readthedocs.io/en/develop/pickaxe_run.html).
An example file, [pickaxe_run_template.py](https://github.com/tyo-nu/MINE-Database/blob/master/pickaxe_run_template.py), provides a framework for running pickaxe through a python file. Feel free to download it and change it to your needs. The starting compounds, rules and cofactors, optional database information, and Pickaxe run options are specified. After running the results are stored in a specified database or written to .tsv files.

This is all explained in more detail in the [documentation](https://mine-database.readthedocs.io/en/develop/pickaxe_run.html).

### Pickaxe command-line usage (not recommended - see above section)
Pickaxe.py can be called independently to generate predictions with or
without database storage. To list all options call `python -m minedatabase.pickaxe -h`. Note that
due to relative imports, it needs to be run as a module (-m flag) from the MINE-Database directory.
To predict metacyc reactions for one generation on compounds in the iML1515
model one would call
```
python pickaxe.py -C ./data/metacyc_generalized_rules.tsv -r ./data/metacyc_coreactants.tsv -g 1 -c ../example_data/iML1515_ecoli_GEM.csv
```
without database storage. To list all options call `python -m minedatabase.pickaxe -h`. Note that due to relative imports, it needs to be run as a module (-m flag) from the MINE-Database directory. To predict metacyc reactions for one generation on compounds in the iML1515 model one would call

### Testing
`pytest` to run all tests. Ensure that pytest is installed.
To add coverage, run:
```
pytest --cov-report term --cov-report xml:tests/cov.xml --cov=minedatabase minedatabase/tests/
```
Ensure that coverage and pytest-cov are both installed.
`python pickaxe.py -C ./data/metacyc_generalized_rules.tsv -r ./data/metacyc_coreactants.tsv -g 1 -c ../example_data/iML1515_ecoli_GEM.csv`
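
Because the text above says Pickaxe needs to be run as a module (-m flag), the same call can also be assembled from Python. This is only a sketch under stated assumptions: `build_pickaxe_command` is a hypothetical helper, the flags (`-C`, `-r`, `-g`, `-c`) and paths are copied from the README example without interpreting them, and the second path uses the `./data/metacyc_coreactants.tsv` spelling. Run `python -m minedatabase.pickaxe -h` to confirm the options before relying on it.

```python
import shlex
import sys


def build_pickaxe_command(options):
    """Build an argv list that runs pickaxe as a module (-m flag)."""
    argv = [sys.executable, "-m", "minedatabase.pickaxe"]
    for flag, value in options.items():
        # Each option becomes a "flag value" pair, in insertion order.
        argv.extend([flag, str(value)])
    return argv


# Flags and paths copied from the README example; adjust the paths to your checkout.
options = {
    "-C": "./data/metacyc_generalized_rules.tsv",
    "-r": "./data/metacyc_coreactants.tsv",
    "-g": 1,
    "-c": "../example_data/iML1515_ecoli_GEM.csv",
}
command = build_pickaxe_command(options)

# Print the shell-quoted command instead of executing it.
print(shlex.join(command))
```

To execute the command, pass `command` to `subprocess.run(command, check=True)` from the MINE-Database directory.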
10 changes: 6 additions & 4 deletions pickaxe_run_template.py
@@ -19,6 +19,7 @@
from rdkit import DataStructs
from rdkit.Chem import AllChem

# Make sure you have minedatabase installed! (either from GitHub or via pip)
from minedatabase.filters import (
    AtomicCompositionFilter,
    MCSFilter,
@@ -187,10 +188,11 @@

# comment below line and uncomment other definition if using thermo filter
feasibility_filter = None
feasibility_filter = ReactionFeasibilityFilter(
    generation_list=generation_list,
    last_generation_only=last_generation_only
)
# uncomment below if using feasibility filter (note this requires extra dependencies)
# feasibility_filter = ReactionFeasibilityFilter(
#     generation_list=generation_list,
#     last_generation_only=last_generation_only
# )


##########################################
15 changes: 3 additions & 12 deletions setup.py
@@ -1,19 +1,11 @@

from setuptools import setup


# Most arguments are set in the `setup.cfg`.
# TODO fix this
setup(version=0.1)

from setuptools import setup
import setuptools

with open("README.md", "r") as fh:
    long_description = fh.read()

setup(name='minedatabase',
version='1.0.0',
version='2.0.0',
description='Metabolic In silico Network Expansions',
long_description=long_description,
long_description_content_type="text/markdown",
@@ -22,19 +14,18 @@
author_email='[email protected]',
license='MIT',
packages=setuptools.find_packages(),
install_requires=['pymongo'],
install_requires=['mordred', 'pymongo', 'scikit-learn<=0.23.2', 'seaborn'],
package_data={'minedatabase': ['data/*'],
'minedatabase.NP_Score': ['*.gz'],
'minedatabase.tests': ['data/*'],
},
include_package_data=True,
extras_require={},
classifiers=[
'Development Status :: 5 - Production/Stable',
'Intended Audience :: Science/Research',
'License :: OSI Approved :: MIT License',
'Topic :: Scientific/Engineering :: Bio-Informatics',
'Topic :: Scientific/Engineering :: Chemistry',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.7',
],
)
