ProPlus-v2
This repository contains the code and paper artifacts for the ProPlus-v2 recommendation experiments. The implementation builds on a LightGCN training stack and studies how LLM-derived user/item profile embeddings should be transferred into collaborative recommendation.
The current paper result uses ProPlus-v2 with dataset-specific graph calibration:
- Amazon/Yelp: candidate-calibrated graph with SDR removed.
- Steam: the strongest static soft-graph ProPlus-v2 variant.
The paper tables are generated from tagged three-seed runs over seeds 2024, 2025, and 2026.
Repository Layout
- encoder/: model training, configs, and experiment launcher.
- generation/: graph construction utilities.
- scripts/paper/: paper aggregation and significance scripts.
- proplus_paper/: LaTeX source and generated tables/summaries.
- data/: local datasets, profile embeddings, and graph artifacts.
Environment
The project is managed with uv and targets Python 3.10+.
uv sync
If using an existing environment, the key runtime dependencies are PyTorch, SciPy, NumPy, PyYAML, scikit-learn, tqdm, and matplotlib.
Required Data Artifacts
Each dataset directory under data/{amazon,yelp,steam}/ is expected to contain:
- trn_mat.pkl, val_mat.pkl, tst_mat.pkl
- usr_emb_np.pkl, itm_emb_np.pkl
- new_trn_rag_mat.pkl
- generated soft/candidate graph artifacts used by the sweeps
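Before launching a sweep, it can help to confirm that the fixed-name files listed above are present. The following is a minimal sketch, not part of the repository: the helper `missing_artifacts` and the constants are hypothetical names, and the generated soft/candidate graph artifacts are left out because their file names depend on the sweep.

```python
from pathlib import Path

# File names are taken from the list above; the helper itself is hypothetical.
REQUIRED_FILES = [
    "trn_mat.pkl", "val_mat.pkl", "tst_mat.pkl",
    "usr_emb_np.pkl", "itm_emb_np.pkl",
    "new_trn_rag_mat.pkl",
]
DATASETS = ["amazon", "yelp", "steam"]


def missing_artifacts(data_root, datasets=DATASETS, required=REQUIRED_FILES):
    """Return {dataset: [missing file names]} for datasets with gaps."""
    root = Path(data_root)
    report = {}
    for name in datasets:
        missing = [f for f in required if not (root / name / f).is_file()]
        if missing:
            report[name] = missing
    return report


if __name__ == "__main__":
    # Assumes the script is run from the repository root, where data/ lives.
    for dataset, files in missing_artifacts("data").items():
        print(f"{dataset}: missing {', '.join(files)}")
```

An empty report means every dataset directory has the six fixed-name files; it says nothing about whether their contents are valid.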
The ProPlus-v2 candidate graph artifacts can be rebuilt with:
python generation/build_proplus_candidate_graph.py --dataset amazon --overwrite
python generation/build_proplus_candidate_graph.py --dataset yelp --overwrite
python generation/build_proplus_candidate_graph.py --dataset steam --overwrite
The main generated files are:
- proplus_candidate_graph_mat.pkl
- proplus_candidate_graph_meta.pkl
- proplus_candidate_graph_summary.json
Running Paper Sweeps
Run from encoder/.
Candidate tuning:
python run_proplus_paper_sweep.py \
--suite proplus_v2_candidate_tune \
--datasets amazon yelp steam \
--seeds 2024 2025 2026 \
--gpus 0 1 2 3 \
--slots-per-gpu 2
Candidate ablations:
python run_proplus_paper_sweep.py \
--suite proplus_v2_candidate_ablation \
--datasets amazon yelp steam \
--seeds 2024 2025 2026 \
--gpus 0 1 2 3 \
--slots-per-gpu 2
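To get a feel for the size of these sweeps: with three datasets and three seeds, each variant in a suite needs 9 runs, and `--gpus 0 1 2 3 --slots-per-gpu 2` suggests 8 concurrent slots. The sketch below enumerates such a grid and assigns runs to slots round-robin. This is an illustration only: the function `plan_jobs`, the round-robin policy, and the reading of the flags are assumptions, not a description of how `run_proplus_paper_sweep.py` actually schedules work.

```python
from itertools import cycle, product


def plan_jobs(datasets, seeds, gpus, slots_per_gpu):
    """Assign each (dataset, seed) run to a (gpu, slot) pair, round-robin.

    Hypothetical helper: the real launcher may use a different policy.
    """
    # One entry per concurrent worker, e.g. 4 GPUs x 2 slots = 8 workers.
    slots = [(gpu, slot) for gpu in gpus for slot in range(slots_per_gpu)]
    jobs = product(datasets, seeds)
    # zip stops when the finite job list is exhausted; cycle reuses slots.
    return [
        {"dataset": dataset, "seed": seed, "gpu": gpu, "slot": slot}
        for (dataset, seed), (gpu, slot) in zip(jobs, cycle(slots))
    ]


if __name__ == "__main__":
    plan = plan_jobs(
        datasets=["amazon", "yelp", "steam"],
        seeds=[2024, 2025, 2026],
        gpus=[0, 1, 2, 3],
        slots_per_gpu=2,
    )
    for job in plan:
        print(job)
```

A suite with several variants multiplies the run count accordingly, so the ninth and later runs wait for a slot to free up under any scheduling policy.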
The selected final ProPlus-v2 variants are:
- Amazon: proplus_v2_candidate_ablation/full_candidate_nosdr
- Yelp: proplus_v2_candidate_ablation/full_candidate_nosdr
- Steam: proplus_v2_ablation/static_uniform_graph
Regenerating Paper Results
Aggregate tagged logs into paper tables:
python scripts/paper/aggregate_experiment_results.py
Run the direct ProPlus-v2 vs ProMax significance test:
python scripts/paper/proplus_v2_promax_significance.py \
--out-csv proplus_paper/generated/proplus_v2_vs_promax_significance.csv \
--out-json proplus_paper/generated/proplus_v2_vs_promax_significance.json
Important generated outputs:
- proplus_paper/generated/proplus_main_llm_results_table.tex
- proplus_paper/generated/proplus_v2_main_table.tex
- proplus_paper/generated/proplus_v2_candidate_ablation_table.tex
- proplus_paper/generated/proplus_v2_vs_promax_significance.csv
- proplus_paper/generated/proplus_v2_tuning_summary.md
Current Main Result
Against ProMax, the final ProPlus-v2 table reports:
- Amazon: improves R@20/N@20, but loses R@10/N@10.
- Yelp: improves all four reported metrics.
- Steam: improves all four reported metrics.
The Amazon early-rank gap is interpreted as a candidate coverage/ranking tradeoff: the candidate graph recovers additional relevant items by R@20, but introduces plausible semantic substitutes that are not always ranked inside the first ten positions.
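The pattern of gaining at cutoff 20 while losing at cutoff 10 is easy to reproduce on a toy ranking. The sketch below uses the common binary-relevance definitions of Recall@K and NDCG@K; the repository's evaluator may differ in details such as the recall denominator, and the two rankings are made-up illustrations, not experiment outputs.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k of `ranked`."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the top-k divided by the ideal DCG."""
    if not relevant:
        return 0.0
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / idcg


# Toy data (hypothetical): two relevant items, twenty ranked positions.
relevant = {"a", "b"}
fillers = [f"x{i}" for i in range(20)]
# Ranking 1: "a" at position 3, "b" never retrieved.
early_hit = fillers[:2] + ["a"] + fillers[2:19]
# Ranking 2: both items retrieved, but only at positions 12 and 15.
late_hits = fillers[:11] + ["a"] + fillers[11:13] + ["b"] + fillers[13:18]

for name, ranking in [("early_hit", early_hit), ("late_hits", late_hits)]:
    for k in (10, 20):
        r = recall_at_k(ranking, relevant, k)
        n = ndcg_at_k(ranking, relevant, k)
        print(f"{name} R@{k}={r:.3f} N@{k}={n:.3f}")
```

In this toy case the second ranking scores higher on R@20 and N@20 but lower on R@10 and N@10, which has the same shape as the Amazon row: wider coverage that does not reach the first ten positions.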