This is a TensorFlow implementation of a Transformer-based language model. The Transformer architecture powers many natural language processing tasks, including language generation and translation; this implementation focuses on language generation.
The Transformer model consists of encoder and decoder blocks, each containing multi-head self-attention layers and feed-forward neural networks. The model learns to generate text autoregressively, predicting the next token based on the preceding context.
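As a rough sketch of what one such block can look like in TensorFlow, the snippet below shows a pre-norm, decoder-style block with a causal self-attention mask (appropriate for autoregressive next-token prediction). The layer sizes and names such as `n_embd` and `n_head` are illustrative, not necessarily the ones used in this script:

```python
import tensorflow as tf

class TransformerBlock(tf.keras.layers.Layer):
    """Pre-norm block: masked multi-head self-attention followed by a feed-forward network."""

    def __init__(self, n_embd, n_head, dropout=0.1):
        super().__init__()
        self.attn = tf.keras.layers.MultiHeadAttention(
            num_heads=n_head, key_dim=n_embd // n_head, dropout=dropout)
        self.ffwd = tf.keras.Sequential([
            tf.keras.layers.Dense(4 * n_embd, activation="gelu"),
            tf.keras.layers.Dense(n_embd),
            tf.keras.layers.Dropout(dropout),
        ])
        self.ln1 = tf.keras.layers.LayerNormalization()
        self.ln2 = tf.keras.layers.LayerNormalization()

    def call(self, x, training=False):
        # use_causal_mask prevents each position from attending to future tokens,
        # which is what makes next-token prediction autoregressive.
        h = self.ln1(x)
        x = x + self.attn(h, h, use_causal_mask=True, training=training)
        x = x + self.ffwd(self.ln2(x), training=training)
        return x
```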
To train and sample from the model:

1. Install TensorFlow and its dependencies.
2. Provide an input.txt file containing the text corpus for training.
3. Update the configuration parameters (block_size, batch_size, learning_rate, etc.) according to your requirements.
4. Run the script, which trains the Transformer model on the provided text corpus.
5. After training, test the model's generation capability by calling the generate() function with an initial sequence of tokens (a sketch of this step follows the list).
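A minimal sketch of that last step, assuming a standard autoregressive sampling loop. The configuration values and the generate() signature shown here are illustrative; the script's actual names and defaults may differ:

```python
import tensorflow as tf

# Illustrative configuration values (names follow the parameters mentioned above).
block_size = 128       # maximum context length the model attends over
batch_size = 64        # sequences per training step
learning_rate = 3e-4   # optimizer step size used during training
max_new_tokens = 200   # how many tokens to sample after the initial sequence

def generate(model, idx, max_new_tokens, block_size):
    """Autoregressively sample tokens; idx is a (batch, time) tensor of token ids."""
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]          # crop context to block_size
        logits = model(idx_cond)                 # (batch, time, vocab_size)
        next_logits = logits[:, -1, :]           # only the last position is needed
        next_id = tf.random.categorical(next_logits, num_samples=1)
        idx = tf.concat([idx, tf.cast(next_id, idx.dtype)], axis=1)
    return idx

# Example: seed with a single token id and sample a continuation.
# start = tf.constant([[0]], dtype=tf.int64)
# tokens = generate(trained_model, start, max_new_tokens, block_size)
```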
Requirements: TensorFlow, NumPy
This implementation is based on the Transformer architecture proposed in the paper "Attention is All You Need" by Vaswani et al.