mirror of https://github.com/Dadams2/DMRec.git synced 2026-06-27 12:09:09 +00:00

[SIGIR2025] Towards Distribution Matching between Collaborative and Language Spaces for Generative Recommendation

DMRec

This is the PyTorch implementation for our SIGIR 2025 paper:

Yi Zhang, Yiwen Zhang*, Yu Wang, Tong Chen, and Hongzhi Yin*. 2025. Towards Distribution Matching between Collaborative and Language Spaces for Generative Recommendation. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’25).

📝 Environment

python == 3.8.18
pytorch == 2.1.0 (cuda:12.1)
scipy == 1.10.1
numpy == 1.24.3
yaml == 0.2.5

📝 Examples to run the code

We adopt three recommendation datasets that are widely used in previous work: Amazon-Book, Yelp, and Steam. DMRec is a recommendation framework, so you first choose the base model to run and then select one of the three matching strategies (GODM, CPDM, or MDDM). The commands below give one example per strategy; replace {model_name} with the base model and {dataset} with the dataset:

Global Optimality for Distribution Matching:

python train_encoder.py --model {model_name}_godm --dataset {dataset} --cuda 0

Composite Prior for Distribution Matching:

python train_encoder.py --model {model_name}_cpdm --dataset {dataset} --cuda 0

Mixing Divergence for Distribution Matching:

python train_encoder.py --model {model_name}_mddm --dataset {dataset} --cuda 0
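
To try all three strategies for one base model, the commands above can be generated programmatically. The following is a minimal sketch, not part of the repository: the helper name `build_commands` and the placeholder values `basemodel` and `mydataset` are assumptions for illustration, and only the flags shown above (`--model`, `--dataset`, `--cuda`) are used.

```python
import shlex

# The three strategy suffixes named in this README.
STRATEGIES = ("godm", "cpdm", "mddm")


def build_commands(model_name, dataset, cuda=0):
    """Return one train_encoder.py command (as an argument list) per strategy."""
    return [
        [
            "python", "train_encoder.py",
            "--model", f"{model_name}_{strategy}",
            "--dataset", dataset,
            "--cuda", str(cuda),
        ]
        for strategy in STRATEGIES
    ]


if __name__ == "__main__":
    # "basemodel" and "mydataset" are hypothetical placeholders; substitute
    # a base model and dataset name that the repository actually provides.
    for cmd in build_commands("basemodel", "mydataset"):
        print(shlex.join(cmd))
```

Each argument list can also be passed to `subprocess.run(cmd, check=True)` to launch the runs one after another.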

The hyperparameters of each model are stored in encoder/config/modelconf. The most important hyperparameter is the trade-off coefficient beta; the other hyperparameters can be left at their default values. The encoder/log folder provides training logs for reference. The results of a single run may differ slightly from those reported in the paper, because the reported results are averaged over several runs.
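
Because the reported numbers are averages over several runs, a single run is best compared against a mean and spread computed the same way. The sketch below shows one way to do that under stated assumptions: the function `summarize_runs`, the metric names, and the numeric values are hypothetical placeholders, not results from the paper or from the logs in encoder/log.

```python
from statistics import mean, stdev


def summarize_runs(runs):
    """Average each metric over runs.

    `runs` is a list of {metric_name: value} dicts, one dict per run.
    Returns {metric_name: (mean, sample standard deviation)}.
    """
    summary = {}
    for metric in runs[0]:
        values = [run[metric] for run in runs]
        # The sample standard deviation needs at least two values.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[metric] = (mean(values), spread)
    return summary


if __name__ == "__main__":
    # Placeholder values for illustration only; replace them with the
    # metrics collected from your own repeated runs.
    runs = [
        {"recall@20": 0.101, "ndcg@20": 0.052},
        {"recall@20": 0.099, "ndcg@20": 0.050},
        {"recall@20": 0.100, "ndcg@20": 0.051},
    ]
    for metric, (avg, spread) in summarize_runs(runs).items():
        print(f"{metric}: mean={avg:.4f} std={spread:.4f}")
```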

📝 Acknowledgement

To maintain fair comparisons and consistency, the model training framework, the LLM-generated user and item profiles, and their corresponding embedding representations are mainly adapted from the following repository:

https://github.com/HKUDS/RLMRec

Many thanks to the authors for providing the training framework and for their active contributions to the open-source community.

📝 Citation

If you find this work helpful for your research, please consider citing our paper:

@article{zhang2025towards,
  title={Towards Distribution Matching between Collaborative and Language Spaces for Generative Recommendation},
  author={Zhang, Yi and Zhang, Yiwen and Wang, Yu and Chen, Tong and Yin, Hongzhi},
  journal={arXiv preprint arXiv:2504.07363},
  year={2025}
}
