RecToM: A Benchmark for Evaluating Machine Theory of Mind in LLM-based Conversational Recommender Systems
This repository contains the official dataset, evaluation scripts, and benchmark details for our AAAI-accepted paper:
RecToM: A Benchmark for Evaluating Machine Theory of Mind in LLM-based Conversational Recommender Systems
RecToM is a benchmark designed to rigorously evaluate the Theory of Mind (ToM) capabilities of Large Language Models (LLMs) within recommendation dialogues.
LLMs must infer users' Beliefs, Desires, and Intents during multi-turn interactions, skills essential for building context-aware and effective recommender systems.
A single utterance may express multiple distinct intentions. RecToM captures this natural conversational complexity.
Intentions are hierarchical: an utterance may contain both a high-level purpose and fine-grained contextual sub-intentions.
Beliefs about items (e.g., movies) involve multiple interconnected aspects:
who introduces the item, whether the seeker has watched it, and how strongly the seeker prefers or accepts it.
Users frequently pursue multiple goals simultaneously, such as exploring new items while comparing alternatives.
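To make the characteristics above concrete, here is a minimal Python sketch of how one annotated utterance and one item belief could be represented. It is illustrative only: the class names, field names, label strings, and the example utterance are all hypothetical and do not reflect the actual RecToM data schema or label set.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ItemBelief:
    """Hypothetical record for the belief aspects attached to one item."""
    item: str
    introduced_by: str                  # "recommender" or "seeker"
    seeker_has_watched: Optional[bool]  # None if the dialogue has not revealed it
    seeker_attitude: Optional[str]      # placeholder label, e.g. "likes"


@dataclass
class UtteranceAnnotation:
    """Hypothetical record for one utterance with hierarchical, multi-label intents."""
    speaker: str
    text: str
    coarse_intents: List[str] = field(default_factory=list)  # high-level purposes
    fine_intents: List[str] = field(default_factory=list)    # contextual sub-intentions


# A made-up seeker turn that carries two intentions at once.
utterance = UtteranceAnnotation(
    speaker="seeker",
    text="I loved Inception. Do you have anything similar but lighter?",
    coarse_intents=["give_feedback", "request"],              # placeholder labels
    fine_intents=["express_preference", "ask_for_similar"],   # placeholder labels
)

belief = ItemBelief(
    item="Inception",
    introduced_by="seeker",
    seeker_has_watched=True,
    seeker_attitude="likes",
)
```

Keeping the coarse and fine intents as separate lists mirrors the hierarchy described above while still allowing several labels per utterance.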
RecToM contains 20,524 expertly annotated dialogue-query pairs across 10 ToM reasoning categories.
| Question Type | Quantity | # Options | Answer Type |
|---|---|---|---|
| Desire (Seek) | 1,448 | 2 | single |
| Coarse Intention (Rec / Seek) | 2,205 / 2,205 | 5 / 4 | multiple |
| Fine Intention (Rec / Seek) | 2,205 / 2,205 | 10 / 16 | multiple |
| Belief (Rec) | 1,762 | 7 | single |
| Prediction (Rec / Seek) | 2,098 / 2,149 | 5 / 4 | multiple |
| Judgement (Rec / Seek) | 2,098 / 2,149 | 2 / 2 | single |
Table: Statistics of question types and option distributions in RecToM.
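As a quick sanity check, the table can be transcribed into Python to confirm that its rows account for the 10 categories and 20,524 pairs stated above. The sketch also includes a scoring helper for the two answer types. The helper's exact-match rule for multiple-answer questions is an assumption made for illustration; it is not necessarily the metric used in the paper or in the repository's evaluation scripts, and all names in the block are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class QuestionType:
    name: str
    quantity: int
    num_options: int
    answer_type: str  # "single" or "multiple"


# Rows transcribed from the statistics table; each Rec / Seek pair is split in two.
QUESTION_TYPES: List[QuestionType] = [
    QuestionType("Desire (Seek)", 1448, 2, "single"),
    QuestionType("Coarse Intention (Rec)", 2205, 5, "multiple"),
    QuestionType("Coarse Intention (Seek)", 2205, 4, "multiple"),
    QuestionType("Fine Intention (Rec)", 2205, 10, "multiple"),
    QuestionType("Fine Intention (Seek)", 2205, 16, "multiple"),
    QuestionType("Belief (Rec)", 1762, 7, "single"),
    QuestionType("Prediction (Rec)", 2098, 5, "multiple"),
    QuestionType("Prediction (Seek)", 2149, 4, "multiple"),
    QuestionType("Judgement (Rec)", 2098, 2, "single"),
    QuestionType("Judgement (Seek)", 2149, 2, "single"),
]


def total_pairs(rows: Iterable[QuestionType]) -> int:
    """Sum the number of dialogue-query pairs over all question types."""
    return sum(row.quantity for row in rows)


def is_correct(predicted: Iterable[str], gold: Iterable[str]) -> bool:
    """Assumed exact-match rule: the predicted option set must equal the gold set.

    For single-answer questions both sets contain exactly one option.
    """
    return set(predicted) == set(gold)


if __name__ == "__main__":
    print(len(QUESTION_TYPES), total_pairs(QUESTION_TYPES))  # prints "10 20524"
```

Under this assumed rule, a multiple-answer prediction that selects only some of the gold options counts as incorrect; a partial-credit scheme would be a different design choice.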
You can run the evaluation using the provided script:
bash evaluate/12_run.sh

If you use RecToM in your research, please cite our paper:
@inproceedings{li2026rectom,
title = {RecToM: A Benchmark for Evaluating Machine Theory of Mind in LLM-based Conversational Recommender Systems},
author = {Li, Mengfan and Shi, Xuanhua and Deng, Yang},
booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-26)},
year = {2026},
publisher = {AAAI Press},
note = {To appear}
}