This repository contains the code used in the research paper "FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs" by Albert Sawczyn, Jakub Binkowski, Denis Janiak, Bogdan Gabrys, and Tomasz Kajdanowicz.
If you use this repository in your work, please cite it as follows:
```bibtex
@misc{sawczyn2025factselfcheckfactlevelblackboxhallucination,
      title={FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs},
      author={Albert Sawczyn and Jakub Binkowski and Denis Janiak and Bogdan Gabrys and Tomasz Kajdanowicz},
      year={2025},
      eprint={2503.17229},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2503.17229},
}
```

License: CC BY-SA 4.0
The FavaMultiSamples dataset is available on Hugging Face.
- A configured Python 3.12 environment (e.g., using conda)
- A self-hosted OpenAI-compatible server running the Llama-3.1-70B-Instruct model. We used the vLLM OpenAI-compatible API (see vLLM); a connectivity-check sketch follows this list.
- An OpenAI API key
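To verify that the self-hosted server is reachable (as referenced in the list above), an OpenAI-compatible endpoint can usually be queried with the official `openai` Python client by pointing `base_url` at it. The following is a minimal sketch, not part of the repository; the environment variable names come from the `.env` example below, and the model identifier is an assumption that should match whatever name your server exposes.

```python
# Sanity-check sketch (not part of the repository): send one chat completion
# request to the self-hosted OpenAI-compatible server (e.g. a vLLM instance).
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ["SELFHOSTED_API_URL"],  # e.g. http://localhost:8000/v1
    api_key=os.environ["SELFHOSTED_API_KEY"],
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-70B-Instruct",  # assumed identifier; use the name your server exposes
    messages=[{"role": "user", "content": "Say hello."}],
    max_tokens=16,
)
print(response.choices[0].message.content)
```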
We use a few environment variables to configure the project. You can set them manually or put them in a `.env` file. The file is loaded automatically without overriding existing environment variables.

An example `.env` file:
```
# OpenAI key
OPENAI_API_KEY=[your-openai-key]

# Self-hosted OpenAI-compatible server
SELFHOSTED_API_URL=[your-api-url]
SELFHOSTED_API_KEY=[your-api-key]
```
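The "loaded automatically without overriding existing variables" behavior matches the default of `python-dotenv`. As a hedged illustration (the repository may use a different mechanism), this is how such loading is typically done:

```python
# Sketch: load variables from .env; with override=False (the default),
# variables already present in the environment take precedence.
import os

from dotenv import load_dotenv

load_dotenv(override=False)
print(os.getenv("SELFHOSTED_API_URL"))
```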
To install all dependencies, run:

```
pip install -r requirements.txt
```

The repository uses DVC to manage the data. To download the data, run:
```
dvc pull
```

> [!NOTE]
> The data will be available soon.
The repository uses DVC to manage the dataset construction pipeline.
`dvc.yaml` contains all of the stages except the notebooks with the experiment results.
To reproduce all DVC stages, run:

```
dvc repro
```

Notebooks with the results of the experiments are available in the `notebooks` directory.
The repository uses the LangChain cache to store the results of LLM calls. The cache is stored in the `.langchain.db` file. To clear the cache, remove the file.
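For reference, a SQLite-backed LangChain cache stored in `.langchain.db` is typically configured as shown below. This is a sketch of the standard LangChain API, not necessarily the exact code used in this repository.

```python
# Sketch: point LangChain's global LLM cache at the .langchain.db SQLite file,
# so repeated identical LLM calls are answered from the local cache.
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache

set_llm_cache(SQLiteCache(database_path=".langchain.db"))
```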