Using High Level Synthesis to create optimized hardware for inference of LLMs with ternary weights.

thiago-monteiro/HLSBitNet

HLS Implementation of BitNet

Prerequisites

  1. ZCU102 Development Platform
  2. Vitis 2024.2
  3. A GPU capable of training the model

Model Creation

First, run train_bitnet.py to train a 220M-parameter BitNet model on the TinyStories dataset.

To quantize the trained model, follow the instructions in the README in the cpu_benchmarks folder; they produce the model file with ternary weights.
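For readers unfamiliar with ternary weights, the following is a minimal sketch of what such a quantization step can look like. It assumes the absmean scheme described for BitNet b1.58 (divide by the mean absolute weight, round, clip to [-1, 1]) and one scale per tensor. The actual procedure and file layout are defined by the cpu_benchmarks README and may differ. The names `TernaryTensor`, `quantize_ternary`, and `ternary_dot` are hypothetical and do not come from this repository.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: ternary values plus a single per-tensor scale.
struct TernaryTensor {
    std::vector<int8_t> values;  // each entry is -1, 0, or +1
    float scale;                 // original weight ~= values[i] * scale
};

// Absmean ternary quantization (assumed scheme, not taken from this repo).
TernaryTensor quantize_ternary(const std::vector<float>& weights) {
    TernaryTensor out;
    out.values.reserve(weights.size());

    // Scale = mean absolute weight; a small epsilon avoids division by zero.
    double sum_abs = 0.0;
    for (float w : weights) sum_abs += std::fabs(w);
    const float eps = 1e-8f;
    out.scale = weights.empty()
        ? eps
        : static_cast<float>(sum_abs / weights.size()) + eps;

    // Round each scaled weight to the nearest integer, then clip to [-1, 1].
    for (float w : weights) {
        float q = std::round(w / out.scale);
        q = std::max(-1.0f, std::min(1.0f, q));
        out.values.push_back(static_cast<int8_t>(q));
    }
    return out;
}

// Dot product against ternary weights: only adds and subtracts in the loop,
// with one multiply by the scale at the end.
float ternary_dot(const TernaryTensor& w, const std::vector<float>& x) {
    float acc = 0.0f;
    const std::size_t n = std::min(w.values.size(), x.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (w.values[i] > 0) acc += x[i];
        else if (w.values[i] < 0) acc -= x[i];
    }
    return acc * w.scale;
}
```

The inner loop of `ternary_dot` contains no multiplications, which is the property that makes ternary weights attractive for a hardware implementation.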

Make sure config.h matches the model hyperparameters.
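One way to catch a mismatch before a long hardware build is to compare the compiled-in values against the header of the model file. The sketch below assumes the model file starts with a llama2.c-style header of seven native-endian 32-bit integers (dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len). That is a guess based on the host executable's name, and the real file format and the macro names in config.h may differ. `ModelConfig`, `read_model_header`, and `config_mismatches` are hypothetical helpers, not part of this repository.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Assumed header layout (llama2.c style); verify against the real exporter.
struct ModelConfig {
    int32_t dim;
    int32_t hidden_dim;
    int32_t n_layers;
    int32_t n_heads;
    int32_t n_kv_heads;
    int32_t vocab_size;
    int32_t seq_len;
};

// Reads the first seven int32 values of the file into `out`.
// Returns false if the file cannot be opened or is too short.
bool read_model_header(const std::string& path, ModelConfig& out) {
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f) return false;
    const std::size_t n = std::fread(&out, sizeof(ModelConfig), 1, f);
    std::fclose(f);
    return n == 1;
}

// Returns the names of all fields that differ between the values compiled
// into the kernel (from config.h) and the values stored in the model file.
std::vector<std::string> config_mismatches(const ModelConfig& compiled,
                                           const ModelConfig& from_file) {
    std::vector<std::string> diff;
    if (compiled.dim != from_file.dim) diff.push_back("dim");
    if (compiled.hidden_dim != from_file.hidden_dim) diff.push_back("hidden_dim");
    if (compiled.n_layers != from_file.n_layers) diff.push_back("n_layers");
    if (compiled.n_heads != from_file.n_heads) diff.push_back("n_heads");
    if (compiled.n_kv_heads != from_file.n_kv_heads) diff.push_back("n_kv_heads");
    if (compiled.vocab_size != from_file.vocab_size) diff.push_back("vocab_size");
    if (compiled.seq_len != from_file.seq_len) diff.push_back("seq_len");
    return diff;
}
```

A host program could fill one `ModelConfig` from the constants in config.h, load the other with `read_model_header`, and refuse to launch the kernel if `config_mismatches` returns a non-empty list.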

Build Instructions

  1. Open the project in the Vitis Unified IDE.
  2. Verify the installation by building the Software Emulation target.

Run Instructions

  1. Run the Hardware build.

  2. Flash the hardware image to the platform using the Vitis IDE.

  3. Run the host executable:

    ./llama2 {path to weights} -z {path to tokenizer} -t {temp} -n {steps} -i {prompt} -k {path to kernel}
