This repository contains the code and data associated with CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations, our EMNLP 2023 paper. If you have any questions, please contact me at: myra [at] cs [dot] stanford [dot] edu
get_caricature_scores.py: script to run to compute individuation and exaggeration scores for a given dataset of simulations (example usage: python get_caricature_scores.py examples/twitter_mini user comment)
generation_scripts: example scripts to generate simulations in different contexts
topics: lists of topics for each context
generate_embeddings.ipynb: compute embeddings for output data
individuation_scores.ipynb: reproduce individuation score results
exaggeration_scores.ipynb: reproduce exaggeration score results
data: generated simulations for the Online Forum and Interview contexts