2024-08-25: We are pleased to announce that our paper has been accepted for publication in TOIS (ACM Transactions on Information Systems)!
- Uni-CTR Description
- Dataset
- Environment Requirements
- Quick Start
- Script Description
- Model Description
- Description of Random Situation
The proposed Uni-CTR framework comprises three parts. First, the input text is processed by the selected LLM backbone to extract the commonalities and distinctions of the data across domains. Next, the LLM passes the representations obtained from different layers to the domain-specific networks, which learn domain-specific characteristics. Finally, a general network learns a representation of all known domains, which enables zero-shot prediction on unseen domains.
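To make the data flow concrete, here is a minimal, framework-free sketch of the three parts. It is not the repository's implementation: the names `run_backbone`, `make_tower`, and `predict`, the toy scalar "hidden states", and the choice to tap every fourth layer are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Tower = Callable[[List[float]], float]


def run_backbone(x: float, num_layers: int) -> List[float]:
    """Stand-in for the LLM backbone: return one toy hidden state per layer."""
    states: List[float] = []
    h = x
    for _ in range(num_layers):
        h = 0.5 * h + 1.0  # placeholder for a real transformer layer
        states.append(h)
    return states


def make_tower(weight: float) -> Tower:
    """Stand-in for a network that consumes the tapped layer outputs."""
    def tower(tapped: List[float]) -> float:
        return weight * sum(tapped) / len(tapped)
    return tower


def predict(
    x: float,
    domain_id: int,
    domain_towers: Dict[int, Tower],
    general_tower: Tower,
    num_layers: int = 8,
    tap_every: int = 4,
) -> float:
    """Route tapped backbone states to a domain tower, or to the general one."""
    states = run_backbone(x, num_layers)
    tapped = states[tap_every - 1::tap_every]  # outputs of layers 4, 8, ...
    # Known domains use their own tower; unseen domains fall back to the
    # general network, which is the zero-shot path.
    tower = domain_towers.get(domain_id, general_tower)
    return tower(tapped)


# Hypothetical setup: two known domains (0 and 2) plus the general network.
domain_towers = {0: make_tower(1.0), 2: make_tower(2.0)}
general_tower = make_tower(0.5)

known = predict(2.0, 0, domain_towers, general_tower)
unseen = predict(2.0, 99, domain_towers, general_tower)
```

In this sketch the only difference between a known and an unseen domain is which tower receives the tapped representations; the backbone pass is shared.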
A Unified Framework for Multi-Domain CTR Prediction via Large Language Models
- Hardware (GPU)
  - Prepare a hardware environment with a GPU.
- Framework
  - PyTorch
- Requirements
  - accelerate
  - huggingface-hub
  - numpy
  - peft
  - scipy
  - sympy
  - tensorboard
  - tokenizers
  - torch-summary
  - torchvision
  - tqdm
  - transformers
  - scikit-learn
  - pandas
  - tensorflow
  - matplotlib
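A quick way to confirm that the environment is complete is to check that each listed package can be imported. The helper below is a sketch, not a script shipped with the repository: `REQUIREMENTS` and `find_missing` are hypothetical names, and the mapping from pip names to import names (for example `scikit-learn` to `sklearn`) is written out by hand.

```python
from importlib.util import find_spec
from typing import Dict, List

# pip distribution name -> top-level import name (they differ for a few packages)
REQUIREMENTS: Dict[str, str] = {
    "torch": "torch",
    "accelerate": "accelerate",
    "huggingface-hub": "huggingface_hub",
    "numpy": "numpy",
    "peft": "peft",
    "scipy": "scipy",
    "sympy": "sympy",
    "tensorboard": "tensorboard",
    "tokenizers": "tokenizers",
    "torch-summary": "torchsummary",
    "torchvision": "torchvision",
    "tqdm": "tqdm",
    "transformers": "transformers",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "tensorflow": "tensorflow",
    "matplotlib": "matplotlib",
}


def find_missing(requirements: Dict[str, str]) -> List[str]:
    """Return the pip names whose import name cannot be located."""
    return [
        dist for dist, module in requirements.items() if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing(REQUIREMENTS)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only locates the packages and does not import them, so it stays fast even when the heavy frameworks are installed. It also does not verify versions.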
After configuring the environment, you can start training and evaluation as follows:
- running on GPU
      # run training and evaluation example
      python training/main.py
.
├── configs  # configurations for different paradigm models
│   ├── __init__.py  # relative package import
│   ├── config.py  # configuration for Uni-CTR
│   ├── config_multi_domain.py  # configuration for multi-domain baselines
│   └── config_single_domain.py  # configuration for single-domain baselines
├── layers  # network layers in models (mostly from package DeepCTR-torch)
│   ├── __init__.py  # relative package import
│   ├── activation.py  # activation networks
│   ├── core.py  # core networks including ladders
│   ├── interaction.py  # modules for single-domain models
│   ├── sequence.py  # sequence processing networks
│   └── utils.py  # other data processing methods and additional networks
├── miscellaneous
├── models  # all baseline models
│   ├── autoint.py
│   ├── basemodel.py
│   ├── dcn.py
│   ├── deepfm.py
│   ├── fibinet.py
│   ├── mmoe.py
│   ├── ple.py
│   ├── pnn.py
│   ├── sharedbottom.py
│   ├── star.py
│   └── xdeepfm.py
├── preprocessing  # data preprocessing
│   ├── amazon_review_data  # preprocessing methods for Amazon Review Data (2018)
│   │   ├── data_analysis.ipynb  # analyse the distributions of the domains
│   │   ├── multi_domain_raw_data_processing.py  # data preprocessing for baseline models
│   │   ├── multi_domain_text_processing.py  # prompt generation
│   │   └── one_for_all.py  # whole dataset preprocessing pipeline for Uni-CTR
│   └── utils.py  # data preprocessing methods
├── training  # training files
│   ├── main.py  # train file for Uni-CTR
│   ├── main_multi_domain.py  # train file for multi-domain models
│   ├── main_single_domain.py  # train file for single-domain models
├── requirements.txt  # package requirements
├── callbacks.py  # Early Stopping for single-domain models
├── inputs.py  # data transformation
└── utils.py  # general functions for Uni-CTR
Parameters for Uni-CTR can be set in configs/config.py
- Parameters for Amazon Review Data (2018)
text_encoder_models = [
# Name, num_hidden_layers, text_embedding_dim, max_length
["Llama-2-7b-hf", 24, 2048, 4096],
]
text_encoder_model_name, layer_num, text_embedding_dim, max_length = text_encoder_models[0]
ladder_frequency = 4
ladder_block = ["wo_block", "w_lora", "w_self_attention", "w_transformer_block"]
ladder_block = ladder_block[3]
r = 4
num_heads = 2
narrowed_ratio = 0.25
use_peft = True
mixed_precision = True
dropout = 0.2
epochs = 10
batch_size = 3 * len(device_ids)
seed = 2012
lr = 8e-5
max_lr = 5e-4
weight_decay = 0.001
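Some of these values only become meaningful in combination. The sketch below derives three quantities from them under stated assumptions: that `ladder_frequency` means one ladder block is attached every `ladder_frequency` backbone layers, that `narrowed_ratio` scales the backbone hidden width down for the ladder blocks, and that `device_ids` is the list of visible GPU indices. None of these readings is confirmed by the configuration excerpt itself, and `derive_settings` is a hypothetical helper.

```python
from typing import Dict, List, Union


def derive_settings(
    layer_num: int,
    text_embedding_dim: int,
    ladder_frequency: int,
    narrowed_ratio: float,
    device_ids: List[int],
) -> Dict[str, Union[int, List[int]]]:
    """Derive illustrative quantities from the Uni-CTR configuration values."""
    # Assumption: a ladder block follows every `ladder_frequency`-th layer.
    ladder_layers = list(range(ladder_frequency, layer_num + 1, ladder_frequency))
    # Assumption: ladder blocks work at a narrowed hidden width.
    narrowed_dim = int(text_embedding_dim * narrowed_ratio)
    # From the config: batch_size = 3 * len(device_ids).
    batch_size = 3 * len(device_ids)
    return {
        "ladder_layers": ladder_layers,
        "narrowed_dim": narrowed_dim,
        "batch_size": batch_size,
    }


settings = derive_settings(
    layer_num=24,
    text_embedding_dim=2048,
    ladder_frequency=4,
    narrowed_ratio=0.25,
    device_ids=list(range(8)),  # hypothetical: eight visible GPUs
)
```

Under these assumptions the configuration above yields six ladder positions (layers 4, 8, 12, 16, 20, 24), a narrowed width of 512, and a global batch size of 24 on eight GPUs.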
Parameters for the multi-domain baselines can be set in configs/config_multi_domain.py
- Parameters for Amazon Review Data (2018)
multiplier = 6
embed_dim = 32
dropout = 0.2
epochs = 10
batch_size = 2048
seed = 2012
lr = 1e-7
max_lr = 1e-3
weight_decay = 0.002
Parameters for the single-domain baselines can be set in configs/config_single_domain.py
- Parameters for Amazon Review Data (2018)
embed_dim = 32
epoch = 10
batch_size = 2048
seed = 2012
lr = 1e-7
max_lr = 1e-3
weight_decay = 0.002
- running on GPUs with DistributedDataParallel
  - Start training:
        tmux new -s my_session  # (Optional)
        cd multi-domain
        CUDA_VISIBLE_DEVICES=0,1 nohup torchrun --nproc_per_node=2 training/main.py > output.log 2>&1 &
  - To detach from session my_session, press Ctrl + B, then D.
  - To attach to session my_session again, run: tmux attach-session -t my_session
  - The Python command above runs in the background; you can view the results in the file ms_log/output.log.
        13%|█▎ | 31/14524 [06:23<36:26:32, 1.36it/s, train_auc=0.713, train_loss=0.47034054]
        ...
  - The model checkpoint will be saved in the current directory.
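When a script is launched with `torchrun` as in the command above, each worker process receives its position through the environment variables `RANK`, `LOCAL_RANK`, and `WORLD_SIZE`. The sketch below shows one way a training script could read them. Whether training/main.py does so in this form is an assumption, and `DistInfo`, `read_dist_info`, and `is_main_process` are hypothetical names.

```python
import os
from typing import Mapping, NamedTuple


class DistInfo(NamedTuple):
    rank: int        # global index of this process
    local_rank: int  # index of this process on its node (selects the GPU)
    world_size: int  # total number of processes


def read_dist_info(env: Mapping[str, str] = os.environ) -> DistInfo:
    """Read the variables torchrun sets, defaulting to a single process."""
    return DistInfo(
        rank=int(env.get("RANK", 0)),
        local_rank=int(env.get("LOCAL_RANK", 0)),
        world_size=int(env.get("WORLD_SIZE", 1)),
    )


def is_main_process(info: DistInfo) -> bool:
    """Only one process should write logs and checkpoints."""
    return info.rank == 0


# Simulated environment of the second worker under --nproc_per_node=2.
worker = read_dist_info({"RANK": "1", "LOCAL_RANK": "1", "WORLD_SIZE": "2"})
```

The single-process defaults let the same script also run with plain `python training/main.py`.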
Training performance:

Parameters | GPU |
---|---|
Model Version | Uni-CTR |
Resource | GPU 8$\times$NVIDIA V100 32G |
Uploaded Date | 12/09/2023 (month/day/year) |
Pytorch Version | 2.0.1 |
Dataset | [1] |
Domains | [0,2,3] |
Training Parameters | epoch=10, batch_size=3$\times$len(device_ids), lr=1e-4 |
Optimizer | AdamW |
Loss Function | Sigmoid Cross Entropy With Logits |
outputs | AUC |
Loss | |
Per Step Time | ms |
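The loss named in the table, sigmoid cross entropy with logits, takes raw model outputs rather than probabilities. The following pure-Python sketch uses the standard numerically stable form of that loss. It illustrates the formula only; the repository presumably relies on a framework implementation (PyTorch provides one as `torch.nn.BCEWithLogitsLoss`), and `bce_with_logits` is a hypothetical name.

```python
import math
from typing import Sequence


def bce_with_logits(logits: Sequence[float], labels: Sequence[float]) -> float:
    """Mean sigmoid cross entropy computed directly from logits.

    Per example: max(x, 0) - x * y + log(1 + exp(-|x|)),
    which equals -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]
    but never exponentiates a large positive number.
    """
    if len(logits) != len(labels) or not logits:
        raise ValueError("logits and labels must be non-empty and equally long")
    total = 0.0
    for x, y in zip(logits, labels):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# A logit of 0 means p = 0.5, so the loss is log(2) whatever the label is.
loss_at_zero = bce_with_logits([0.0], [1.0])
```

Because the exponent is always non-positive, very confident predictions such as a logit of 1000 do not overflow.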

Evaluation performance:

Parameters | GPU |
---|---|
Model Version | Uni-CTR |
Resource | GPU 8$\times$NVIDIA V100 32G |
Uploaded Date | 12/09/2023 (month/day/year) |
Pytorch Version | 2.0.1 |
Dataset | [1] |
Domains | [0,2,3] |
batch_size | 150$\times$len(device_ids) |
outputs | AUC |
AUC | [0.7523, 0.7569, 0.7246] |
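The table reports one AUC value per domain. As an illustration of what that metric measures, the sketch below computes AUC as the fraction of (positive, negative) pairs that the model ranks correctly, grouped by domain. It is a pairwise O(n²) reference version with hypothetical names (`auc`, `auc_per_domain`) and made-up toy data. For real evaluation, `sklearn.metrics.roc_auc_score` from the listed scikit-learn requirement is the practical choice.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def auc_per_domain(
    domains: Sequence[int], labels: Sequence[int], scores: Sequence[float]
) -> Dict[int, float]:
    """Group examples by domain id and compute AUC within each group."""
    grouped: Dict[int, Tuple[List[int], List[float]]] = defaultdict(
        lambda: ([], [])
    )
    for d, y, s in zip(domains, labels, scores):
        grouped[d][0].append(y)
        grouped[d][1].append(s)
    return {d: auc(ys, ss) for d, (ys, ss) in sorted(grouped.items())}


# Toy data: domain 0 is ranked perfectly, domain 2 is ranked in reverse.
result = auc_per_domain(
    domains=[0, 0, 2, 2],
    labels=[0, 1, 1, 0],
    scores=[0.2, 0.8, 0.3, 0.7],
)
```

Computing the metric per domain keeps a large domain from masking weak ranking quality in a small one.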
We set the random seed before training in model_config.py. The configuration files listed above use seed = 2012.
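A typical seeding routine for a PyTorch project looks like the sketch below. It is an assumption about what such a routine contains, not a copy of the repository's code, and `set_seed` is a hypothetical name. NumPy and PyTorch are seeded only if they are installed, so the snippet also runs in a bare Python environment.

```python
import os
import random


def set_seed(seed: int = 2012) -> None:
    """Seed the random number generators commonly used during training."""
    # Affects hash randomization only in Python processes started afterwards.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # no effect when CUDA is unavailable
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


set_seed(2012)
first_draw = random.random()
set_seed(2012)
second_draw = random.random()  # same value as first_draw
```

Seeding makes runs repeatable on the same setup, but some GPU kernels are non-deterministic, so small differences between runs can remain.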