We do all of NeMo's development in the open. Contributions from the NeMo community are welcome.
Send your PRs to the `main` branch.
- Make sure your PR does one thing. Have a clear answer to "What does this PR do?".
- Read General Principles and style guide below.
- Make sure you sign your commits, e.g. use `git commit -s` when you commit.
- Make sure all unit tests finish successfully before sending the PR: run `pytest` or (if your dev box does not have a GPU) `pytest --cpu` from NeMo's root folder.
- Send your PR and request a review.
Quick tests (locally, while developing)
pytest
# If you don't have an NVIDIA GPU, run:
# pytest --cpu
Full tests, including pre-trained model downloads
pytest --with_downloads
- For changes to NeMo's core: @ericharper, @titu1994, @blisc, or @okuchaiev
- For changes to NeMo's ASR collection: @titu1994, @redoctopus, @jbalam-nv, or @okuchaiev
- For changes to NeMo's NLP collection: @MaximumEntropy, @ericharper, @ekmb, @yzhang123, @VahidooX, @vladgets, or @okuchaiev
- For changes to NeMo's TTS collection: @blisc or @okuchaiev
Note that some people may self-assign to review your PR - in which case, please wait for them to add a review.
Your pull requests must pass all checks and peer review before they can be merged.
- User-oriented: make it easy for end users, even at the cost of writing more code in the background
- Robust: make it hard for users to make mistakes.
- Well-tested: please add simple, fast unit tests. Consider adding CI tests for end-to-end functionality.
- Reusable: for every piece of code, think about how it can be reused in the future and make it easy to reuse.
- Readable: code should be easy to read.
- Legal: if you copy even one line of code from the Internet, make sure that its license is compatible with the license NeMo uses. Give credit and link back to the code.
- Sensible: code should make sense. If you think a piece of code might be confusing, write comments.
- No “I”, “Interface”, “NM”, or “NeMo” prefixes or postfixes anywhere
- Core interfaces have simple names: Typing, Cloud, Serialization, FileIO*
- Core classes have the simplest names ever: NeuralModule, Model, Graph, Dataset, Loss, Module*
- Abstract classes in the Model hierarchy have Model postfix
- A config class for MyModel should be called MyModelConfig
- Leaf Neural Module classes have simple names without any postfixes (e.g. AudioPreprocess)
- Leaf Datasets have Dataset postfix (e.g. AudioToSpeechLabelDataset)
- Leaf Losses have Loss postfix (e.g. CTCLoss)
- Leaf Models do not have any postfix, just name (e.g. QuartzNet)
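The naming rules above can be illustrated with a small Python sketch. The class bodies here are hypothetical placeholders, not NeMo's actual implementations: `EncoderModel`, `QuartzNetConfig`, and the `scale` field are invented for illustration, and only the names are the point.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Model(ABC):
    """Core class with the simplest possible name."""

    @abstractmethod
    def forward(self, x: float) -> float:
        """Map one input value to one output value."""


class Loss(ABC):
    """Core class for losses."""


class EncoderModel(Model):
    """Abstract class in the Model hierarchy, so it keeps the Model postfix."""


@dataclass
class QuartzNetConfig:
    """Config class: the name of the model it configures plus Config."""

    scale: float = 2.0  # hypothetical parameter, for illustration only


class QuartzNet(EncoderModel):
    """Leaf model: just a name, no postfix."""

    def __init__(self, cfg: QuartzNetConfig) -> None:
        self.cfg = cfg

    def forward(self, x: float) -> float:
        return self.cfg.scale * x


class CTCLoss(Loss):
    """Leaf loss: keeps the Loss postfix."""


model = QuartzNet(QuartzNetConfig(scale=2.0))
print(model.forward(3.0))
```

Following these rules, a reader can tell from the name alone whether a class is abstract (`EncoderModel`), a leaf (`QuartzNet`), or a config (`QuartzNetConfig`).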
We use `black` as our style guide. To check whether your code will pass the style check, run `python setup.py style` from NeMo's repo folder; if it does not pass, run `python setup.py style --fix`.
- Include docstrings for every class and method exposed to the user.
- Use Python 3 type hints for every class and method exposed to the user.
- Avoid wildcard imports (`from X import *`) unless `__all__` is defined in `X.py`.
- Minimize the use of `**kwargs`.
- Raising an error is preferred to `assert`. Write `if not X: raise Error` instead of `assert X`.
- Classes are preferred to standalone methods.
- Methods should be atomic. A method shouldn't be longer than 75 lines, i.e. it should fit on the computer screen without scrolling.
- If a method has arguments that don't fit on one line, each argument should be on its own line for readability.
- Add `__init__.py` for every folder.
- F-strings are preferred to formatted strings.
- Loggers are preferred to print. In NeMo, you can get the logger with `from nemo.utils import logging`.
- Private functions (functions whose names start with `_`) shouldn't be called outside their host file.
- If a comment spans multiple lines, use `'''` instead of `#`.
A collection is a logical grouping of related Neural Modules that share a domain area or semantics. When contributing a module to a collection, please make sure it belongs to that category. If you would like to start a new collection and contribute it back to the platform, you are very welcome to do so.