Word-level Language Modeling using Transformer

This example trains a Transformer on a word-level language modeling task. By default, the training script uses the Wikitext-2 dataset, which is included in the repository. The generate script can then use the trained model to produce new text.

python main.py cuda=true epochs=6 lr=5  # Train a Transformer on Wikitext-2 with CUDA.

python generate.py                      # Generate samples from the default model checkpoint.

The model is built from the Transformer modules (nn.TransformerEncoder and nn.TransformerEncoderLayer), which will automatically use the cuDNN backend if run on CUDA with cuDNN installed.
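As a rough illustration of how these two modules fit together in a language model, here is a minimal sketch. It is not the code in model.py: the class name `TinyTransformerLM`, the learned positional embedding, and all hyperparameter values are assumptions chosen to keep the example short.

```python
import math

import torch
import torch.nn as nn


class TinyTransformerLM(nn.Module):
    """Hypothetical minimal language model; not the repository's model.py."""

    def __init__(self, vocab_size, d_model=64, nhead=4, d_hid=128,
                 nlayers=2, max_len=512, dropout=0.1):
        super().__init__()
        self.d_model = d_model
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        # Assumption: a learned positional embedding, used here for brevity.
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, d_hid, dropout)
        self.encoder = nn.TransformerEncoder(layer, nlayers)
        self.decoder = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (seq_len, batch) tensor of word indices.
        seq_len = tokens.size(0)
        positions = torch.arange(seq_len, device=tokens.device).unsqueeze(1)
        x = self.tok_emb(tokens) * math.sqrt(self.d_model)
        x = x + self.pos_emb(positions)
        # Causal mask: position i may only attend to positions <= i.
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1,
        )
        hidden = self.encoder(x, mask=mask)
        # One score per vocabulary word at every position.
        return self.decoder(hidden)


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyTransformerLM(vocab_size=100).to(device)
tokens = torch.randint(0, 100, (10, 3), device=device)
logits = model(tokens)
print(logits.shape)
```

The causal mask is what makes an encoder stack usable for language modeling: without it, each position could attend to the words it is supposed to predict.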

During training, if a keyboard interrupt (Ctrl-C) is received, training is stopped and the current model is evaluated against the test dataset.
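The interrupt behaviour described above can be sketched with a plain try/except around the epoch loop. This is a simplified illustration, not the code in main.py: `train_one_epoch` and `evaluate_on_test` are hypothetical stand-ins, and the interrupt is simulated at epoch 3 so the example runs unattended.

```python
def train_one_epoch(epoch):
    # Hypothetical stand-in for a real training epoch.
    # Epoch 3 simulates the user pressing Ctrl-C.
    if epoch == 3:
        raise KeyboardInterrupt
    return 1.0 / epoch


def evaluate_on_test():
    # Hypothetical stand-in for evaluating the current model on the test set.
    return "test evaluation ran"


def run(epochs):
    completed = []
    try:
        for epoch in range(1, epochs + 1):
            train_one_epoch(epoch)
            completed.append(epoch)
    except KeyboardInterrupt:
        # Stop training but continue to the final evaluation.
        print("Exiting from training early")
    # Reached both after a normal finish and after an interrupt.
    return completed, evaluate_on_test()


completed, result = run(epochs=6)
print(completed, result)
```

Because the evaluation sits after the try/except rather than inside it, it runs whether training finished normally or was cut short.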

Run

python main.py --help

to see all available arguments.

