Biases in LLM-Generated Musical Taste Profiles for Recommendation

This repository provides our Python code to reproduce the experiments from the paper "Biases in LLM-Generated Musical Taste Profiles for Recommendation", accepted to ACM RecSys 2025. Link to the paper: https://arxiv.org/abs/2507.16708

Please cite our paper if you use this code in your own work:

@inproceedings{sguerra2025biases,
  title={Biases in LLM-Generated Musical Taste Profiles for Recommendation},
  author={Sguerra, Bruno and Epure, Elena V and Lee, Harin and Moussallam, Manuel},
  booktitle={Proceedings of the Nineteenth ACM Conference on Recommender Systems},
  pages={527--532},
  year={2025}
}

Dataset

The data folder contains the following files:

user_data.csv: user profiles and ratings.
long_term.csv: long-term preferences, used for computing the average treatment effect (ATE) with the doubly robust estimator.
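
The column layout of these files is not documented here, so a quick first step is to look at the header and a few rows. The snippet below is a minimal sketch using only the standard library; the helper name `peek_csv` is hypothetical and not part of this repository, and it assumes the files are comma-separated and UTF-8 encoded.

```python
import csv
from pathlib import Path


def peek_csv(path, n_rows=3):
    """Return the header and the first n_rows data rows of a CSV file.

    Hypothetical helper for inspection only; it assumes a comma-separated,
    UTF-8 encoded file whose first line is the header.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no lines
        # zip stops after n_rows without reading the rest of the file
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, `peek_csv("data/user_data.csv")` returns the column names and the first three rows, all as strings.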

Quickstart

Build the Docker image:

$ make build

Run a Docker container and start an interactive bash session, while mounting the current directory:

$ make run-bash

Paper plots

To generate the figures of the paper, refer to the notebook LLM_bias_plots.ipynb.

Doubly Robust estimation of ATE

The bootstrapped estimates of the ATE from the doubly robust method can be obtained by running doubly_robust.py in the src folder.
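
For readers unfamiliar with the technique, the following is a minimal, self-contained sketch of a doubly robust (augmented inverse propensity weighting, AIPW) estimate of the ATE with a percentile bootstrap interval. It is not the code in doubly_robust.py: the function names, the synthetic data generator, and the choice of fitting the propensity and outcome models as simple per-stratum averages over one discrete covariate are all assumptions made for illustration. The script in this repository may use different models and covariates.

```python
import random
from collections import defaultdict


def aipw_ate(rows, clip=0.01):
    """Doubly robust (AIPW) ATE for rows of (x, t, y).

    x is a discrete covariate, t a binary treatment (0/1), y the outcome.
    Both nuisance models are per-stratum averages (an illustrative choice).
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    pooled = {0: [], 1: []}
    for x, t, y in rows:
        strata[x][t].append(y)
        pooled[t].append(y)
    pooled_mean = {t: (sum(ys) / len(ys) if ys else 0.0) for t, ys in pooled.items()}

    nuisance = {}
    for x, groups in strata.items():
        n0, n1 = len(groups[0]), len(groups[1])
        # Propensity e(x) = P(T=1 | X=x), clipped away from 0 and 1.
        e = min(max(n1 / (n0 + n1), clip), 1 - clip)
        # Outcome models mu_t(x); fall back to the pooled mean for empty cells.
        mu0 = sum(groups[0]) / n0 if n0 else pooled_mean[0]
        mu1 = sum(groups[1]) / n1 if n1 else pooled_mean[1]
        nuisance[x] = (e, mu0, mu1)

    total = 0.0
    for x, t, y in rows:
        e, mu0, mu1 = nuisance[x]
        # Outcome-model difference plus inverse-propensity-weighted residuals.
        total += mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
    return total / len(rows)


def bootstrap_ate(rows, n_boot=200, seed=0):
    """Point estimate plus an approximate 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    estimates = sorted(
        aipw_ate(rng.choices(rows, k=len(rows))) for _ in range(n_boot)
    )
    lo = estimates[int(0.025 * n_boot)]
    hi = estimates[min(n_boot - 1, int(0.975 * n_boot))]
    return aipw_ate(rows), (lo, hi)


def simulate(n=2000, seed=1, effect=2.0):
    """Synthetic confounded data: x drives both treatment and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 2)
        t = 1 if rng.random() < 0.2 + 0.3 * x else 0
        y = effect * t + 1.5 * x + rng.gauss(0, 1)
        rows.append((x, t, y))
    return rows
```

Calling `bootstrap_ate(simulate())` returns the point estimate and the interval bounds for synthetic data generated with a true effect of 2.0. The estimator is called doubly robust because it remains consistent if either the propensity model or the outcome model is correctly specified.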

Downstream task

Download the fine-tuned cross-encoder released by previous work:

wget https://zenodo.org/records/14289764/files/models.zip
apt-get update
apt-get install unzip
unzip models.zip -d models/ && rm models.zip

Train a new model on our dataset:

poetry run python -m gpl.train --path_to_generated_data "./data" --base_ckpt "msmarco-bert-base-dot-v5" --gpl_score_function "cos_sim" --batch_size_gpl 10 --gpl_steps 10000 --output_dir "models/NL_profiles" --cross_encoder "./models/cross-encoder-musiccaps-ms-marco-MiniLM-L-6-v2/" --max_seq_length 512

Test the new model on our test datasets:

poetry run python src/eval.py --output_path results/ --input_path data/test/ --our_model_path models/NL_profiles/
