Skip to content

Non-Repetitive Parts Calculator - Automated design and discovery of non-repetitive genetic parts for engineering stable genetic systems

License

Notifications You must be signed in to change notification settings

ayaanhossain/nrpcalc

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

OverviewInstallationLicenseCitationDOCSNRP Calculator in Action!

Overview

Engineering large genetic circuits, pathways and whole genomes require large toolboxes of well characterized genetic parts. Existing toolboxes for composing such genetic systems from scratch are either small (forcing designers to re-use a particular part at multiple locations) and/or frequently share many repeats. Systems composed of repetitive elements are challenging to synthesize using commercial DNA synthesis approaches, often resulting in incomplete or mixed products. Moreover, when a repetitive genetic system introduced in a host organism causes biochemical stress, the cell may respond with homologous recombination – sections of the introduced system are deleted, effectively breaking it. In E. coli, a 20-bp repeat is enough to trigger detectable levels of recombination.

To solve this repeat challenge in synthetic biology, we developed and experimentally validated a suite of algorithms that automatically discovers and designs large toolboxes of highly non-repetitive genetic parts, collectively called the Non-Repetitive Parts Calculator. In Finder Mode, the algorithm discovers subsets of non-repetitive genetic parts from pre-existing toolboxes of characterized parts, while in Maker Mode, the algorithm designs large toolboxes of non-repetitive genetic parts for downstream characterization, based on specified design constraints such as a degenerate DNA or RNA sequence, an RNA secondary structure, and/or arbitrary model-based criteria.

Non-repetitiveness is a global property of the entire genetic part toolbox, and it is quantified by a repeat distribution (the number of repeats of length L) or more simply by the maximum shared repeat length, Lmax. Both algorithms generate toolboxes according to a user-specified Lmax to control the desired level of non-repetitiveness. For example, when a toolbox of genetic parts is designed using an Lmax of 10 base pairs, every genetic part in the toolbox can be simultaneously utilized in the same genetic system without introducing a repetitive sequence longer than 10 base pairs, which is necessary to ensure successful DNA synthesis and assembly.

Installation

Non-Repetitive Parts Calculator is a Linux/MacOS-tested software, and was originally built with Python2.7. The software is now Python3 exclusive, and compatible only with Python3.6 and above. Python2.7 is no longer supported.

Setting up the Environment

The best way to install NRP Calculator is via conda. If you have either anaconda or miniconda installed on your system, you are good to proceed. Otherwise, you may first need to install miniconda.

Assuming you have conda available on your system, you may first create a new environment. Here we are naming it nrpenv but you can name it as you like. Also, we are setting nrpenv up with Python3.6 but feel free to install Python3.7, or one of the newer ones.

$ conda create -n nrpenv python=3.6

Once your new environment is ready, deactivate your current environment

$ conda deactivate

and activate the newly created nrpenv environment

$ conda activate nrpenv

Note You don't necessarily need to make a new environment, as long as you know which environment you want NRP Calculator installed in, and have it activated during installation. For example, you might have a different project environment inside which you want to install NRP Calculator.

Approach 1: Install from PyPI

You can install NRP Calculator from PyPI, where it is published as the nrpcalc package. This is as easy as

$ pip install --upgrade nrpcalc --no-cache-dir

which will also install all dependencies from PyPI.

Approach 2: Install from GitHub

Alternatively, you can install NRP Calculator from GitHub. To do so, first clone the repository with

$ git clone https://github.com/ayaanhossain/nrpcalc.git

Once downloaded, navigate into the nrpcalc directory with

$ cd nrpcalc

and install NRP Calculator using the included setup.py

$ python setup.py install

Verifying Installation

If everything went well, NRP Calculator is now available in your environment under the nrpcalc package. You may verify it like so:

$ python
Python 3.6.10 | packaged by conda-forge | (default, Apr 24 2020, 16:44:11)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import nrpcalc
>>>
>>> print(nrpcalc.__doc__)

Non-Repetitive Parts Calculator

Automated design and discovery of non-repetitive genetic
parts for engineering stable systems.

Version: 1.7.0

Authors: Ayaan Hossain <auh57@psu.edu>
         Howard Salis  <salis@psu.edu>

The Non-Repetitive Parts Calculator offers two modes of operation:

- Finder Mode: Discover toolboxes of non-repetitive parts
               from a list of candidate parts

-  Maker Mode: Design toolboxes of non-repetitive parts
               based on sequence, structure and model
               constraints

Additionally, a 'background' object is available that stores
background sequences. When the 'background' object is used,
designed genetic parts will also be non-repetitive with respect
to these sequences.

You can learn more about the two modes and background via
  help(nrpcalc.background)
  help(nrpcalc.finder)
  help(nrpcalc.maker)

>>>

Note Remember to activate the specific environment in which NRP Calculator is installed in order to use it. For example, if you followed the instructions above exactly, then the environment you created and installed NRP Calculator in is named nrpenv.

Uninstalling NRP Calculator

You can easily remove Non-Repetitive Parts Calculator with

$ pip uninstall nrpcalc

Reporting Installation Issues

If you encounter any problems during installation, please feel free to open an issue describing your problem along with your OS details, and any console output that shows the error.

License

Non-Repetitive Parts Calculator (c) 2020-2022 Ayaan Hossain.

Non-Repetitive Parts Calculator is an open-source software under MIT License.

See LICENSE file for more details.

Citation

If you use Non-Repetitive Parts Calculator or toolboxes designed or discovered using the algorithm in your research publication, please cite

Hossain A., Lopez E., Halper S.M., Cetnar D.P., Reis A.C., Strickland D., Klavins E., and Salis H.M.
Automated design of thousands of nonrepetitive parts for engineering stable genetic systems
Nature Biotechnology, doi:10.1038/s41587-020-0584-2

You can read the complete article online at Nature Biotechnology. A free PDF copy of the paper is accessible here.

Abstract Engineered genetic systems are prone to failure when their genetic parts contain repetitive sequences. Designing many non-repetitive genetic parts with desired functionalities remains a significant challenge with high computational complexity. To overcome this challenge, we developed the nonrepetitive parts calculator to rapidly generate thousands of highly nonrepetitive genetic parts from specified design constraints, including promoters, ribosome-binding sites and terminators. As a demonstration, we designed and experimentally characterized 4,350 nonrepetitive bacterial promoters with transcription rates that varied across a 820,000-fold range, and 1,722 highly nonrepetitive yeast promoters with transcription rates that varied across a 25,000-fold range. We applied machine learning to explain how specific interactions controlled the promoters’ transcription rates. We also show that using nonrepetitive genetic parts substantially reduces homologous recombination, resulting in greater genetic stability.

Acknowledgements This project was supported by funds from the Air Force Office of Scientific Research (grant no. FA9550-14-1-0089), the Defense Advanced Research Projects Agency (grant nos. FA8750-17-C-0254 and HR001117C0095), the Department of Energy (grant no. DE-SC0019090), and a Graduate Research Innovation award from the Huck Institutes of the Life Sciences.

Contributions A.H. and H.M.S. conceived the study. A.H., E.L., D.P.C., S.M.H., A.C.R. and D.S. designed and carried out the experiments. A.H., A.C.R. and H.M.S. developed the algorithms and performed the data analysis. A.H., D.S., E.K. and H.M.S. wrote the manuscript.

Maintenance Non-Repetitive Parts Calculator is currently maintained by

API Documentation

A detailed description of nrpcalc Maker Mode, Finder Mode, and Background API, along with example use cases can be found in DOCS.md. You may also check the API documentation from within the Python REPL via

  • print(nrpcalc.background.__doc__),
  • print(nrpcalc.finder.__doc__), and
  • print(nrpcalc.maker.__doc__).

If you enconter any problem using the API, please feel free to open an issue describing your use case, along with minimal code snippets reproducing the problem and any console output that shows the problem or error.

NRP Calculator in Action!

A jupyter notebook describing the use of NRP Calculator in designing thousands of commonly used genetic parts from constitutive promoters to intrinsic terminators for E. coli systems engineering is available inside /examples/ directory.

If you have installed nrpcalc then you have everything you need to open and execute the notebook.

Simply clone this repository

$ git clone https://github.com/ayaanhossain/nrpcalc.git

and navigate to nrpcalc/examples/ directory

$ cd nrpcalc/examples/

and fire up jupyter.

$ jupyter notebook

This should now open a browser tab showing a single jupyter notebook named NRPCalcInAction.ipynb. Click on the notebook to open it, and follow the content of the notebook.

If any part of the notebook is unclear or confusing to you, please feel free to reach any of the authors via Email or Twitter, who would be more than happy to clarify anything in the notebook.

We hope you enjoy the paper and the notebook!

Releases

No releases published

Packages

No packages published

Languages