- Overview
- Features
- Benefits Over Traditional LLMs
- Installation
- Usage
- Project Structure
- Credits
- License
- Contact
Welcome to the Quantum-Enhanced Language Model (QELM), an innovative project that merges the power of quantum computing with natural language processing to create a next-generation language model. Leveraging Qiskit and Qiskit Aer, QELM explores the potential of quantum circuits to enhance language understanding and generation. By encoding information in qubits, QELM aims to compress models that would normally occupy gigabytes of LLM files down to ultra-compact models. At that size, LLMs can run almost instantly, with no loss of capability or intelligence, on small computers instead of requiring data centers to host a single model. The goal is to create LLMs that are fast, smart, and usable anywhere by anyone.
- Quantum Parameter Optimization: Utilizes the Parameter Shift Rule for gradient-based optimization within quantum circuits (see the sketch after this list).
- Advanced Quantum Circuits: Implements entangling gates and multiple layers for complex state manipulations.
- Synthetic and Real Dataset Support: Capable of training on both synthetic datasets for testing and real-world language data.
- Enhanced Model Architecture: Incorporates residual connections and normalization for stable and efficient training.
- Parameter Persistence: Robust saving and loading mechanisms with versioning to ensure model integrity.
- User-Friendly CLI: Intuitive command-line interface for training, inference, and model management.
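To make the first two features concrete, below is a minimal, hypothetical sketch of the Parameter Shift Rule applied to a small layered entangling circuit with Qiskit. The circuit layout, observable, and function names are illustrative assumptions, not QELM's actual training code in `Qelm2.py`.

```python
# Hedged sketch: parameter-shift gradients on a 2-qubit entangling circuit.
# The circuit, observable, and names are illustrative, not QELM's code.
import numpy as np
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def expectation_z0(thetas):
    """<Z> on qubit 0 after a layered RY + CX (entangling) circuit."""
    qc = QuantumCircuit(2)
    qc.ry(thetas[0], 0)
    qc.ry(thetas[1], 1)
    qc.cx(0, 1)                        # entangling gate
    qc.ry(thetas[2], 0)                # second rotation layer
    probs = Statevector.from_instruction(qc).probabilities([0])
    return probs[0] - probs[1]         # <Z> = P(0) - P(1)

def parameter_shift_grad(thetas, i):
    """Exact gradient w.r.t. thetas[i] via shifts of +/- pi/2."""
    shift = np.zeros_like(thetas)
    shift[i] = np.pi / 2
    return 0.5 * (expectation_z0(thetas + shift) - expectation_z0(thetas - shift))

thetas = np.array([0.3, 0.7, 0.1])
grads = [parameter_shift_grad(thetas, i) for i in range(len(thetas))]
# A training step would then be a plain gradient-descent update:
# thetas -= lr * np.array(grads)
```

The parameter-shift rule is exact for gates generated by Pauli operators (such as RY), which is what makes gradient-based training of quantum circuits feasible without finite-difference noise.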
Quantum computers can process a vast number of states simultaneously due to superposition and entanglement, potentially enabling faster and more efficient computations compared to classical models.
Quantum states can represent complex, high-dimensional data compactly, allowing QELM to capture intricate patterns and relationships in language data more effectively.
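As a rough illustration of that compactness (using Qiskit's `Statevector`; the random vector is a placeholder, not QELM's encoding scheme), n qubits can hold a 2^n-dimensional normalized vector:

```python
# Compact amplitude encoding: n qubits store 2**n amplitudes.
# The random data below is a placeholder, not QELM's encoding scheme.
import numpy as np
from qiskit.quantum_info import Statevector

data = np.random.rand(16)                         # 16-dimensional vector
state = Statevector(data / np.linalg.norm(data))  # states must be normalized
print(state.num_qubits)                           # 4 qubits suffice for 16 amplitudes
```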
Quantum algorithms can tackle optimization problems differently, potentially finding better minima or converging faster during training.
Quantum circuits can represent complex functions with fewer parameters, leading to more efficient models that are less resource-intensive to store and deploy.
Integrating quantum principles inspires new model architectures, offering unique advantages in handling language tasks and potentially mimicking human-like language understanding more closely.
- Python 3.7 to 3.11
- Git installed on your machine
- Note: if the `python` command does not work, use `py` instead
git clone https://github.com/R-D-BioTech-Alaska/QELM.git
cd QELM
It's recommended to use a virtual environment to manage dependencies, since it makes problems easier to isolate. If you are confident in your setup, installing into your full (system) environment also works.
# Create a virtual environment named 'qiskit_env'
python -m venv qiskit_env
# Activate the virtual environment
# Windows
qiskit_env\Scripts\activate
# Unix/Linux/MacOS
source qiskit_env/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
Note: If you encounter permission issues, you can add the `--user` flag to the `pip install` command.
QELM provides a command-line interface (CLI) to facilitate training, inference, saving, and loading models. The CLI is still very basic and under active development; a fully working GUI with threading is now available.
You can train QELM using either synthetic data (for testing purposes) or a real dataset. Any dataset can be used, though the file extension may need to be changed for certain data types.
Synthetic data is useful for initial testing and for verifying that the training pipeline works correctly. For a first test, try 2 epochs instead of 20: each epoch can take minutes to hours depending on your computer's capabilities.
python Qelm2.py --train --epochs 2 --lr 0.05
Parameters:
- `--train`: Initiates the training process.
- `--epochs 2`: Sets the number of training epochs to 2. Only increase this if you know what you're doing.
- `--lr 0.05`: Sets the learning rate to 0.05.
- `--save_path`: (Optional) Path to save the trained model (default: `quantum_llm_model_enhanced.json`).
- `--dataset_path`: (Optional) Path to a real dataset file. If not provided, synthetic data will be used.
Expected Output:
INFO:root:Creating synthetic dataset...
INFO:root:Starting training...
INFO:root:Starting Epoch 1/2
INFO:root:Epoch 1/2, Average Loss: 0.012345
INFO:root:Starting Epoch 2/2
INFO:root:Epoch 2/2, Average Loss: 0.011234
INFO:root:Training completed.
INFO:root:Model saved to quantum_llm_model_enhanced.json
If you have a real language dataset, you can specify its path using the `--dataset_path` argument. Ensure that your dataset is in plain text format; CSV files are also supported.
python Qelm2.py --train --epochs 2 --lr 0.05 --dataset_path path_to_your_dataset.txt
Parameters:
- `--train`: Initiates the training process.
- `--epochs 2`: Sets the number of training epochs to 2. Only increase this if you know what you're doing.
- `--lr 0.05`: Sets the learning rate to 0.05.
- `--dataset_path path_to_your_dataset.txt`: Specifies the path to your real dataset file.
- `--save_path`: (Optional) Path to save the trained model (default: `quantum_llm_model_enhanced.json`).
Expected Output:
INFO:root:Loading real dataset...
INFO:root:Starting training...
INFO:root:Starting Epoch 1/2
INFO:root:Epoch 1/2, Average Loss: 0.012345
INFO:root:Starting Epoch 2/2
INFO:root:Epoch 2/2, Average Loss: 0.011234
INFO:root:Training completed.
INFO:root:Model saved to quantum_llm_model_enhanced.json
Notes:
- Dataset Preparation: Ensure your dataset file (`path_to_your_dataset.txt`) is properly formatted and preprocessed. The script tokenizes the text and maps the most frequent tokens to unique IDs based on the specified `vocab_size` (see the sketch after these notes).
- Vocabulary Size: The default `vocab_size` is set to 256. Adjust this in the script if your dataset requires a different size.
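For illustration, here is a minimal sketch of frequency-based token-to-ID mapping, assuming simple whitespace tokenization; QELM's actual preprocessing in `Qelm2.py` may differ:

```python
# Hedged sketch of frequency-based vocabulary building; assumes whitespace
# tokenization, which may differ from QELM's actual preprocessing.
from collections import Counter

def build_vocab(text, vocab_size=256):
    """Map the vocab_size most frequent tokens to unique integer IDs."""
    counts = Counter(text.split())
    return {tok: i for i, (tok, _) in enumerate(counts.most_common(vocab_size))}

with open("path_to_your_dataset.txt", encoding="utf-8") as f:
    vocab = build_vocab(f.read())

token_ids = [vocab[t] for t in "some sample text".split() if t in vocab]
```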
After training the model, you can perform inference to generate logits (predicted outputs) based on an input token ID.
python Qelm2.py --inference --input_id 5 --load_path quantum_llm_model_enhanced.json
Parameters:
- `--inference`: Initiates the inference process.
- `--input_id 5`: Specifies the input token ID for which you want to generate logits.
- `--load_path quantum_llm_model_enhanced.json`: Specifies the path to the saved model file.
Expected Output:
Logits: [0.00123456 0.00234567 -0.00012345 0.00345678 ...]
Notes:
- Input ID Validation: Ensure that the `input_id` provided is within the range of your vocabulary size (e.g., 0 to 255 for `vocab_size=256`). Providing an ID outside this range will result in an error.
- Interpreting Logits: The output logits represent the model's predictions for each token in the vocabulary. They can be further processed (e.g., with a softmax function) to obtain a probability distribution over the vocabulary, as shown below.
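For example, a numerically stable softmax turns the logits into probabilities (the values below are placeholders taken from the sample output above):

```python
# Convert logits to a probability distribution with a numerically stable
# softmax; the logit values are placeholders from the sample output.
import numpy as np

logits = np.array([0.00123456, 0.00234567, -0.00012345, 0.00345678])
probs = np.exp(logits - logits.max())
probs /= probs.sum()
next_token_id = int(np.argmax(probs))   # greedy choice of the most likely token
```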
To understand all available options and ensure you're using the script correctly:
python Qelm2.py --help
Expected Output:
usage: Qelm2.py [-h] [--train] [--inference] [--input_id INPUT_ID]
[--save_path SAVE_PATH] [--load_path LOAD_PATH]
[--epochs EPOCHS] [--lr LR] [--dataset_path DATASET_PATH]
Quantum-Enhanced Language Model (QELM) - Enhanced Version
optional arguments:
-h, --help show this help message and exit
--train Train the model
--inference Run inference
--input_id INPUT_ID Input token ID for inference
--save_path SAVE_PATH
Path to save the model
--load_path LOAD_PATH
Path to load the model
--epochs EPOCHS Number of training epochs
--lr LR Learning rate
--dataset_path DATASET_PATH
Path to real dataset file (optional)
QELM/
├── Qelm2.py
├── requirements.txt
├── README.md
├── quantum_llm_model_enhanced.json
└── docs/
    └── images/
        ├── QELM_Diagram.png
        └── quantum.png
Developed by Brenton Carter (Inserian).
Special thanks to the Qiskit community for providing robust quantum computing tools and resources that made this project possible.
If you use or build upon this project, please provide proper credit to the original developer:
- Include the following attribution in your project: "Based on Quantum-Enhanced Language Model (QELM) by Brenton Carter (Inserian)"
- Provide a link back to the original repository: R-D-BioTech-Alaska/QELM
- Include a mention of the Qiskit community: Qiskit
- Thank IBM!
This project is licensed under the MIT License.
For any inquiries, suggestions, or contributions, feel free to reach out:
- Email: [email protected]
- GitHub: R-D-BioTech-Alaska
- Website: RDBioTech.org
Disclaimer: Quantum computing is an emerging field, and this project serves as an experimental exploration into integrating quantum circuits with language modeling. While promising, it is subject to the current limitations of quantum hardware and algorithms. Please contribute to this project so we can help move quantum computing into the future.