Simplifying Content-Based News Recommendation

A NEW VERSION OF THE FRAMEWORK IS OUT IN NewsRecLib

Description

This is the code accompanying the paper "Simplifying Content-Based Neural News Recommendation: On User Modeling and Training Objectives", in which we propose a unified framework that allows a systematic and fair comparison of news recommenders across three crucial design dimensions: (i) candidate-awareness in user modeling, (ii) click behavior fusion, and (iii) training objectives.

Project Structure

The directory structure of the project looks like this:

├── configs                   <- Hydra configuration files
│   ├── callbacks                <- Callbacks configs
│   ├── datamodule               <- Datamodule configs
│   ├── debug                    <- Debugging configs
│   ├── experiment               <- Experiment configs
│   ├── extras                   <- Extra utilities configs
│   ├── hparams_search           <- Hyperparameter search configs
│   ├── hydra                    <- Hydra configs
│   ├── local                    <- Local configs
│   ├── logger                   <- Logger configs
│   ├── model                    <- Model configs
│   ├── paths                    <- Project paths configs
│   ├── trainer                  <- Trainer configs
│   │
│   ├── eval.yaml             <- Main config for evaluation
│   └── train.yaml            <- Main config for training
│
├── data                   <- Project data
│
├── logs                   <- Logs generated by Hydra and Lightning loggers
│
├── scripts                <- Shell scripts
│
├── src                    <- Source code
│   ├── datamodules              <- Lightning datamodules
│   ├── models                   <- Lightning models
│   ├── utils                    <- Utility scripts
│   │
│   ├── eval.py                  <- Run evaluation
│   └── train.py                 <- Run training
│
├── .gitignore                <- List of files ignored by git
├── requirements.txt          <- File for installing Python dependencies
├── setup.py                  <- File for installing project as a package
└── README.md

Data

We use the MIND dataset in all experiments.
The datasets are automatically downloaded, cached, and pre-processed when running the train.py pipeline.
Alternatively, the datasets can be manually downloaded into the data directory using the URLs from the MIND data config.
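
The download-once-then-cache behavior described above can be sketched as follows. This is only an illustration of the pattern, not the repository's actual datamodule code: the helper name `fetch_cached` is hypothetical, and the URL has to come from the MIND data config rather than from this snippet.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_cached(url: str, data_dir: Path) -> Path:
    """Download `url` into `data_dir` unless a file of that name is already there.

    Hypothetical helper for illustration; not part of the repository.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / url.rsplit("/", 1)[-1]
    if target.exists():
        return target  # cache hit: skip the download entirely

    # Write to a temporary name first so an interrupted download
    # is never mistaken for a complete cached file.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

Calling `fetch_cached(url, Path("data"))` twice with the same URL only downloads the file the first time; the second call returns the path of the cached copy.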

How to run

Install dependencies

# clone project
git clone git@github.com:andreeaiana/simplifying_nnr.git
cd simplifying_nnr

# [OPTIONAL] create conda environment
conda create -n simplifying_nnr_env python=3.9
conda activate simplifying_nnr_env

# install PyTorch according to the official instructions
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt

Train a model with a chosen experiment configuration from configs/experiment/:

# train on CPU
python src/train.py experiment=experiment_name.yaml trainer=cpu

# train on GPU
python src/train.py experiment=experiment_name.yaml trainer=gpu

You can override any parameter from the command line like this:

python src/train.py trainer.max_epochs=20 datamodule.batch_size=64
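
To make the effect of such overrides concrete, here is a minimal sketch of how a dotted `key=value` argument maps onto a nested configuration. Hydra performs this merge itself, with a richer grammar; the function `apply_overrides` and the default values below are hypothetical and only mimic the simplest case. Config group selections such as `trainer=gpu` work differently (they pick a whole config file) and are not modeled here.

```python
import ast
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply `dotted.key=value` strings to a nested dict, in place.

    Hypothetical illustration of dotted overrides; not Hydra's implementation.
    """
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})  # create missing sections
        try:
            value: Any = ast.literal_eval(raw_value)  # "20" -> 20, "0.1" -> 0.1
        except (ValueError, SyntaxError):
            value = raw_value  # keep plain strings such as "gpu"
        node[leaf] = value
    return config


# Assumed defaults, standing in for the values in the YAML configs.
config = {
    "trainer": {"max_epochs": 10},
    "datamodule": {"batch_size": 32},
}
apply_overrides(config, ["trainer.max_epochs=20", "datamodule.batch_size=64"])
print(config["trainer"]["max_epochs"], config["datamodule"]["batch_size"])  # prints: 20 64
```

Each override only touches the key it names, so values that are not mentioned on the command line keep whatever the YAML configs define.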

Citation

@inproceedings{iana2023simplifying,
    title={Simplifying Content-Based Neural News Recommendation: On User Modeling and Training Objectives},
    author={Andreea Iana and Goran Glavaš and Heiko Paulheim},
    booktitle={Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
    pages={2384--2388},
    year={2023}
}
