Sangjun Lee*, Seung-taek Woo*, Jun-gyu Jin, Changhun Lee, Eunhyeok Park
* Equal contribution
This is the official repository for AMQ.
AMQ is an automated mixed-precision quantization library for Large Language Models (LLMs). It uses multi-objective optimization to search for per-layer bit-width configurations that balance model quality against memory and inference efficiency.
- Multiple Quantization Methods: Support for AWQ, GPTQ, OWQ, and more
- Multi-objective Optimization: NSGA-II based search over bit configurations (see the sketch after this list)
- Surrogate Models: Efficient exploration through MLP- and RBF-based predictors
- Layer-wise Sensitivity Analysis: Measure quantization sensitivity per layer
- Automated Mixed-precision Search: Automatic exploration of optimal bit configurations
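The actual search lives in this repository's scripts; purely to illustrate the NSGA-II formulation, here is a minimal, self-contained sketch using the third-party pymoo library. The bit candidates, objectives, and sensitivity table are placeholder assumptions, not AMQ's code:

```python
# Illustrative only: a toy NSGA-II bit-allocation search with pymoo.
# The sensitivity table and objectives below are made-up placeholders.
import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize

BITS = np.array([2, 3, 4])  # candidate bit-widths per layer

class BitAllocation(ElementwiseProblem):
    def __init__(self, sensitivity):
        # sensitivity[l][i] = proxy loss increase when layer l uses BITS[i]
        self.sensitivity = sensitivity
        super().__init__(n_var=len(sensitivity), n_obj=2,
                         xl=0.0, xu=len(BITS) - 1.0)

    def _evaluate(self, x, out, *args, **kwargs):
        idx = np.clip(np.round(x).astype(int), 0, len(BITS) - 1)
        proxy_loss = sum(self.sensitivity[l][i] for l, i in enumerate(idx))
        avg_bits = float(BITS[idx].mean())
        out["F"] = [proxy_loss, avg_bits]  # minimize both objectives

# Toy table: 8 layers, each less damaged at higher bit-widths.
sens = [[1.0, 0.3, 0.1] for _ in range(8)]
res = minimize(BitAllocation(sens), NSGA2(pop_size=32), ("n_gen", 40),
               seed=1, verbose=False)
print(res.F)  # Pareto front of (proxy loss, average bits) trade-offs
```

Each point on the resulting Pareto front is one bit assignment; in the same spirit, the MLP/RBF surrogates listed above would stand in for expensive objective evaluations with cheap predictions.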
Installation:

```bash
pip install -e .
pip install -r requirements.txt
```

Usage:

```bash
bash scripts/amq_quantiztion_proxy.sh 0   # generate the quantization proxy
bash scripts/amq_sensitivity.sh 0         # measure layer-wise sensitivity
bash scripts/amq_search.sh 0              # run the mixed-precision search
bash scripts/amq_quantization_proxy
```

Kernel installation and speed benchmark:

```bash
bash scripts/amq_install_kernel.sh
bash scripts/amq_speed_benchmark.sh 0
```

A minimal illustration of the sensitivity-measurement step appears after the model list below.

Supported models:

- Llama 2 (7B, 13B, 70B)
- Mistral
- Qwen2
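The sensitivity step above (scripts/amq_sensitivity.sh) handles this for the supported models; as a hedged illustration of the general idea only, the sketch below quantizes one linear layer at a time with naive round-to-nearest (not AWQ/GPTQ/OWQ) and records the loss increase on a toy input. The model (facebook/opt-125m), layer filter, and 3-bit setting are placeholders chosen so the example runs without gated weights:

```python
# Hedged sketch: per-layer sensitivity via one-layer-at-a-time quantization.
# Round-to-nearest and facebook/opt-125m are illustrative stand-ins for the
# quantizers and models this repo actually targets.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def quantize_rtn(w: torch.Tensor, bits: int = 3) -> torch.Tensor:
    # Symmetric per-tensor round-to-nearest quantization.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale

tok = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m").eval()
ids = tok("The quick brown fox jumps over the lazy dog.",
          return_tensors="pt").input_ids

with torch.no_grad():
    base = model(ids, labels=ids).loss.item()
    for name, mod in model.named_modules():
        if isinstance(mod, torch.nn.Linear) and name.endswith("fc2"):
            orig = mod.weight.data.clone()
            mod.weight.data = quantize_rtn(orig)  # quantize this layer only
            delta = model(ids, labels=ids).loss.item() - base
            mod.weight.data = orig                # restore full precision
            print(f"{name}: +{delta:.4f} loss at 3-bit")
```

Layers with large loss deltas would then be kept at higher precision during the search, while insensitive layers can absorb more aggressive quantization.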
Requirements:

- Python >= 3.8
- PyTorch >= 2.0.0
- Transformers == 4.45.2
- HQQ >= 0.2.0
- See requirements.txt for the full list
Model-specific configuration files are located in the configs/ directory:
- `configs/llama.json` - Llama model configuration
- `configs/mistral.json` - Mistral model configuration
- `configs/qwen2.json` - Qwen2 model configuration
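These are plain JSON files; a quick way to inspect one (the schema is repo-specific, and no particular keys are assumed here):

```python
# Load and pretty-print a model-specific configuration.
import json

with open("configs/llama.json") as f:
    cfg = json.load(f)
print(json.dumps(cfg, indent=2))
```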
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
If you use this work in your research, please cite:
```bibtex
@inproceedings{lee2025amq,
  title={{AMQ}: Enabling {AutoML} for Mixed-Precision Weight-Only Quantization of Large Language Models},
  author={Lee, Sangjun and Woo, Seung-taek and Jin, Jun-gyu and Lee, Changhun and Park, Eunhyeok},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  pages={35520--35538},
  year={2025}
}
```
Contributions are welcome! Please submit a Pull Request or open an issue.
If you have any questions or feedback, please open an issue.
