- ZCU102 Development Platform
- Vitis 2024.2
- A decent GPU to train the model
First, use the new train_bitnet.py to train a 220M-parameter BitNet model on the TinyStories dataset.
To quantize the trained model, follow the instructions in the README in the cpu_benchmarks folder; this produces the model file with ternary weights.
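
For orientation, "ternary weights" means every weight is stored as -1, 0, or +1 together with a scale factor. The snippet below is a minimal sketch of the absmean scheme commonly associated with BitNet b1.58. It assumes one scale per tensor and is not necessarily what the cpu_benchmarks quantizer does; the function name `absmean_ternary` is hypothetical, and the README in cpu_benchmarks remains the authoritative procedure.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a flat list of floats to {-1, 0, +1} with one shared scale.

    Assumption: per-tensor absmean scaling, as in BitNet b1.58.
    The actual quantizer in cpu_benchmarks may differ.
    """
    # Scale is the mean absolute value of the tensor (guarded against zero).
    scale = max(sum(abs(w) for w in weights) / len(weights), eps)
    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


weights = [0.9, -0.05, 0.4, -1.3, 0.02, -0.6]
ternary, scale = absmean_ternary(weights)
print(ternary, scale)
```

Small weights collapse to 0 and large ones saturate at -1 or +1, so the original tensor is approximated by `scale * ternary`.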
Make sure config.h matches the model hyperparameters.
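
A mismatch between config.h and the trained model is easy to miss, so a small check can help. The following is a rough sketch that assumes config.h stores its hyperparameters as integer `#define` lines; the macro names (`DIM`, `N_LAYERS`, `N_HEADS`) and the helper functions are hypothetical and should be replaced with whatever the real header and training script use.

```python
import re

# Matches lines such as "#define DIM 768" with an optional trailing // comment.
_DEFINE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+(\w+)[ \t]+(-?\d+)[ \t]*(?://.*)?$",
    re.MULTILINE,
)


def parse_defines(header_text):
    """Collect integer `#define NAME value` entries from C header text."""
    return {name: int(value) for name, value in _DEFINE.findall(header_text)}


def find_mismatches(defines, hyperparams):
    """Return {name: (header_value, model_value)} for every disagreement."""
    return {
        name: (defines.get(name), expected)
        for name, expected in hyperparams.items()
        if defines.get(name) != expected
    }


# Hypothetical header contents and model hyperparameters, for illustration only.
header = """
#define DIM 768
#define N_LAYERS 12
#define N_HEADS 8   // attention heads
"""
model = {"DIM": 768, "N_LAYERS": 12, "N_HEADS": 12}

mismatches = find_mismatches(parse_defines(header), model)
print(mismatches)
```

In practice `header` would be read from config.h and `model` from the training configuration; an empty result means the two agree on every listed value.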
- Open the project in Vitis Unified IDE
- Verify the install by building the Software Emulation
- Run the Hardware build
- Flash the hardware image to the platform using the Vitis IDE
- Run the host executable
./llama2 {path to weights} -z {path to tokenizer} -t {temp} -n {steps} -i {prompt} -k {path to kernel}