PyTorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)

dashstander/block-recurrent-transformer

Block Recurrent Transformer

A PyTorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022). It owes a great deal to Phil Wang's x-transformers. Very much a work in progress.

The repo ships a Dockerfile, a requirements.txt, and an environment.yaml, because I love chaos.

Differences from the Paper (as of 2022/05/04)

  • Keys and values are not shared between the "vertical" and "horizontal" directions (the standard input -> output information flow and the recurrent state flow, respectively).
  • The state vectors are augmented with rotary embeddings for positional encoding, rather than the learned embeddings used in the paper.
  • The special LSTM gate initialization is not yet implemented.
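
To illustrate the rotary-embedding difference listed above, here is a minimal sketch of rotating a block of state vectors by their state index. It is not code from this repository: the function names (`rotary_angles`, `rotate_half`, `apply_rotary`), the tensor shapes, and the base of 10000 are all assumptions chosen for illustration. In an attention layer the rotation would normally be applied to the queries and keys derived from the states; it is applied to the state tensor directly here to keep the example short.

```python
import torch


def rotary_angles(num_positions: int, dim: int, base: float = 10000.0) -> torch.Tensor:
    """Return a (num_positions, dim) tensor of rotation angles (hypothetical helper)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even so features can be rotated in pairs")
    # One frequency per feature pair, decaying geometrically with the pair index.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    positions = torch.arange(num_positions).float()
    freqs = torch.outer(positions, inv_freq)   # (num_positions, dim / 2)
    return torch.cat((freqs, freqs), dim=-1)   # (num_positions, dim)


def rotate_half(x: torch.Tensor) -> torch.Tensor:
    """Map the feature halves (x1, x2) to (-x2, x1)."""
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)


def apply_rotary(x: torch.Tensor, angles: torch.Tensor) -> torch.Tensor:
    """Rotate each feature pair of x by the angle for its position."""
    return x * angles.cos() + rotate_half(x) * angles.sin()


# Assumed shapes: a batch of 2 sequences, 8 state vectors each, 16 features per state.
batch, num_states, dim = 2, 8, 16
state = torch.randn(batch, num_states, dim)

angles = rotary_angles(num_states, dim)   # broadcasts over the batch dimension
rotated = apply_rotary(state, angles)
```

Because the encoding is a pure rotation, it adds no learned parameters and leaves the norm of every state vector unchanged, which is one reason it can stand in for a learned position embedding.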
