This repository contains a Python implementation of a Bigram Language Model. The model is designed to predict the next word in a sequence based on the probability of word pairs (bigrams). This implementation is optimized to run on a MacBook, leveraging Apple's MPS (Metal Performance Shaders) for efficient computation on M1/M2 chips.
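To illustrate the core idea before diving into the repository's code, here is a minimal count-based sketch of a bigram predictor (illustration only; the repository's model learns these probabilities with PyTorch rather than counting, and the function names below are hypothetical):

```python
from collections import Counter, defaultdict

# Count how often each word follows each other word, then predict the
# most frequent successor. This is the statistical idea behind a bigram model.
def train_bigrams(words):
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    if word not in counts:
        return None  # unseen context word
    # most_common(1) returns the highest-count successor;
    # ties break by first occurrence.
    return counts[word].most_common(1)[0][0]

tokens = "the cat sat on the mat".split()
bigrams = train_bigrams(tokens)
print(predict_next(bigrams, "the"))  # -> 'cat'
print(predict_next(bigrams, "on"))   # -> 'the'
```

A neural bigram model replaces these raw counts with a learned table of logits, but the conditioning is the same: the single previous token.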
Before running the code, ensure you have the following prerequisites installed:
- Python 3.8+: The code is compatible with Python 3.8 and above.
- PyTorch 1.12+: MPS support was added in PyTorch 1.12; install the latest stable version.
- Apple Silicon: This code is optimized for MacBooks with M1/M2 chips.
To install the necessary Python packages, you can use pip:
pip install torch numpy

First, clone the repository to your local machine:
git clone <repository-url>
cd <repository-directory>

To run the bigram model, execute the following command in your terminal:
python bigram_model.py

The code is configured to use Apple's MPS backend if available. This enables efficient training and inference on MacBooks with M1/M2 chips.
Ensure that the tensors and the model are correctly moved to the MPS device as shown in the code:
import torch
# Check if MPS is available and set the device
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
# Move your model and tensors to the device
model.to(device)
input_tensor = input_tensor.to(device)

Troubleshooting:
- MPS Device Not Recognized: Ensure that your PyTorch installation supports MPS. If the device is not recognized, it may be due to an outdated version of PyTorch (MPS requires 1.12+) or of macOS (MPS requires macOS 12.3+).
- Performance Issues: If you experience slow performance, try reducing the batch size or sequence length, as larger values may cause excessive memory usage.
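The batch size and sequence length mentioned above are typically plain module-level constants. A sketch of what reducing them might look like (the names below are assumptions; check bigram_model.py for the actual variable names):

```python
# Hypothetical training knobs; the actual names in bigram_model.py may differ.
batch_size = 32    # halve to 16 if you hit memory pressure on MPS
block_size = 128   # maximum sequence length per example; smaller uses less memory
```

Memory usage grows roughly with batch_size * block_size, so halving either one roughly halves the activation memory per training step.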