Code repository for the CIKM 2026 ProPlus submission

ProPlus-v2

This repository contains the code and paper artifacts for the ProPlus-v2 recommendation experiments. The implementation builds on a LightGCN training stack and studies how LLM-derived user/item profile embeddings should be transferred into collaborative recommendation.

The current paper results use ProPlus-v2 with dataset-specific graph calibration:

  • Amazon/Yelp: candidate-calibrated graph with SDR removed.
  • Steam: the strongest static soft-graph ProPlus-v2 variant.

The paper tables are generated from tagged three-seed runs over seeds 2024, 2025, and 2026.
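
As a rough illustration of what a three-seed summary involves, the sketch below averages one metric over seeds 2024, 2025, and 2026 and reports the sample standard deviation. The function name `summarize_seeds` is hypothetical and the metric values are placeholders, not results from the paper. The repository's aggregation script may compute its summaries differently.

```python
from statistics import mean, stdev


def summarize_seeds(per_seed):
    """Return (mean, sample standard deviation) of a metric across seeds."""
    values = [per_seed[seed] for seed in sorted(per_seed)]
    return mean(values), stdev(values)


# Placeholder numbers for illustration only; they are not paper results.
runs = {2024: 0.10, 2025: 0.11, 2026: 0.12}
avg, sd = summarize_seeds(runs)
print(f"{avg:.4f} +/- {sd:.4f}")
```

With only three seeds the standard deviation is a coarse estimate, so it is best read as a sanity check on run-to-run variation rather than a precise error bar.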

Repository Layout

  • encoder/: model training, configs, and experiment launcher.
  • generation/: graph construction utilities.
  • scripts/paper/: paper aggregation and significance scripts.
  • proplus_paper/: LaTeX source and generated tables/summaries.
  • data/: local datasets, profile embeddings, and graph artifacts.

Environment

The project is managed with uv and targets Python 3.10+.

uv sync

If using an existing environment, the key runtime dependencies are PyTorch, SciPy, NumPy, PyYAML, scikit-learn, tqdm, and matplotlib.
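
When reusing an existing environment, a quick check like the following sketch can confirm that the listed dependencies are importable before launching a run. The helper name `missing_modules` is hypothetical; the module names are the standard import names for the packages listed above.

```python
import importlib.util

# Import names for the dependencies listed above. Note that PyTorch imports
# as `torch`, PyYAML as `yaml`, and scikit-learn as `sklearn`.
REQUIRED_MODULES = ["torch", "scipy", "numpy", "yaml", "sklearn", "tqdm", "matplotlib"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that each package can be found; it does not verify versions, which are pinned by uv.lock when the environment is created with uv sync.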

Required Data Artifacts

Each dataset directory under data/{amazon,yelp,steam}/ is expected to contain:

  • trn_mat.pkl, val_mat.pkl, tst_mat.pkl
  • usr_emb_np.pkl, itm_emb_np.pkl
  • new_trn_rag_mat.pkl
  • generated soft/candidate graph artifacts used by the sweeps
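
Before starting a sweep, it can help to verify that the fixed-name files above are present. The following is a minimal sketch that assumes the data/ layout described here; the helper name `missing_artifacts` is hypothetical, and the generated graph artifacts are left out because their names depend on the sweep.

```python
from pathlib import Path

# File names copied from the list above. Generated soft/candidate graph
# artifacts are not checked here.
REQUIRED_FILES = [
    "trn_mat.pkl", "val_mat.pkl", "tst_mat.pkl",
    "usr_emb_np.pkl", "itm_emb_np.pkl",
    "new_trn_rag_mat.pkl",
]
DATASETS = ["amazon", "yelp", "steam"]


def missing_artifacts(data_root, datasets=DATASETS):
    """Map each dataset name to the required files absent from its directory."""
    root = Path(data_root)
    report = {}
    for name in datasets:
        absent = [f for f in REQUIRED_FILES if not (root / name / f).is_file()]
        if absent:
            report[name] = absent
    return report


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for dataset, files in missing_artifacts("data").items():
        print(f"{dataset}: missing {', '.join(files)}")
```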

The ProPlus-v2 candidate graph artifacts can be rebuilt with:

python generation/build_proplus_candidate_graph.py --dataset amazon --overwrite
python generation/build_proplus_candidate_graph.py --dataset yelp --overwrite
python generation/build_proplus_candidate_graph.py --dataset steam --overwrite

The main generated files are:

  • proplus_candidate_graph_mat.pkl
  • proplus_candidate_graph_meta.pkl
  • proplus_candidate_graph_summary.json
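
To take a first look at these outputs, something like the sketch below can be used. It makes two assumptions that this README does not confirm: that the files are written into the dataset directory (for example data/amazon/), and that nothing is known about their internal schema, so it reports only the JSON content and the type and shape of each pickled object. The function name `describe_candidate_graph` is hypothetical.

```python
import json
import pickle
from pathlib import Path


def describe_candidate_graph(dataset_dir):
    """Return a short description of each candidate-graph artifact found."""
    base = Path(dataset_dir)
    info = {}

    summary_path = base / "proplus_candidate_graph_summary.json"
    if summary_path.is_file():
        with summary_path.open() as fh:
            info["summary"] = json.load(fh)

    for stem in ("proplus_candidate_graph_mat", "proplus_candidate_graph_meta"):
        path = base / f"{stem}.pkl"
        if path.is_file():
            # Only unpickle files you generated yourself; pickle can run code.
            with path.open("rb") as fh:
                obj = pickle.load(fh)
            # The schema is not documented here, so record type and shape only.
            info[stem] = {"type": type(obj).__name__,
                          "shape": getattr(obj, "shape", None)}
    return info
```

If the pickled matrix is a SciPy sparse matrix, SciPy must be installed for the load to succeed.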

Running Paper Sweeps

Run the sweep commands below from the encoder/ directory.

Candidate tuning:

python run_proplus_paper_sweep.py \
  --suite proplus_v2_candidate_tune \
  --datasets amazon yelp steam \
  --seeds 2024 2025 2026 \
  --gpus 0 1 2 3 \
  --slots-per-gpu 2

Candidate ablations:

python run_proplus_paper_sweep.py \
  --suite proplus_v2_candidate_ablation \
  --datasets amazon yelp steam \
  --seeds 2024 2025 2026 \
  --gpus 0 1 2 3 \
  --slots-per-gpu 2
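
The two invocations differ only in the suite name, so they can be driven from a small wrapper. The sketch below uses only the flags shown in the commands above; the helper names `sweep_command` and `run_suites` are hypothetical, and the wrapper defaults to a dry run that prints the commands without executing them.

```python
import subprocess


def sweep_command(suite, datasets=("amazon", "yelp", "steam"),
                  seeds=(2024, 2025, 2026), gpus=(0, 1, 2, 3), slots_per_gpu=2):
    """Build the argument list for one run_proplus_paper_sweep.py invocation."""
    return [
        "python", "run_proplus_paper_sweep.py",
        "--suite", suite,
        "--datasets", *datasets,
        "--seeds", *map(str, seeds),
        "--gpus", *map(str, gpus),
        "--slots-per-gpu", str(slots_per_gpu),
    ]


def run_suites(suites, cwd="encoder", dry_run=True):
    """Print each sweep command and, unless dry_run is set, execute it in cwd."""
    for suite in suites:
        cmd = sweep_command(suite)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first suite that exits with an error.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_suites(["proplus_v2_candidate_tune", "proplus_v2_candidate_ablation"])
```

Passing dry_run=False runs the suites one after another from the repository root, with encoder/ as the working directory.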

The selected final ProPlus-v2 variants are:

  • Amazon: proplus_v2_candidate_ablation/full_candidate_nosdr
  • Yelp: proplus_v2_candidate_ablation/full_candidate_nosdr
  • Steam: proplus_v2_ablation/static_uniform_graph

Regenerating Paper Results

Aggregate tagged logs into paper tables:

python scripts/paper/aggregate_experiment_results.py

Run the direct ProPlus-v2 vs ProMax significance test:

python scripts/paper/proplus_v2_promax_significance.py \
  --out-csv proplus_paper/generated/proplus_v2_vs_promax_significance.csv \
  --out-json proplus_paper/generated/proplus_v2_vs_promax_significance.json

Important generated outputs:

  • proplus_paper/generated/proplus_main_llm_results_table.tex
  • proplus_paper/generated/proplus_v2_main_table.tex
  • proplus_paper/generated/proplus_v2_candidate_ablation_table.tex
  • proplus_paper/generated/proplus_v2_vs_promax_significance.csv
  • proplus_paper/generated/proplus_v2_tuning_summary.md

Current Main Result

Against ProMax, the final ProPlus-v2 table reports:

  • Amazon: improves R@20/N@20 (Recall and NDCG at cutoff 20), but trails ProMax on R@10/N@10.
  • Yelp: improves all four reported metrics.
  • Steam: improves all four reported metrics.

The Amazon early-rank gap is interpreted as a trade-off between candidate coverage and ranking: the candidate graph recovers additional relevant items within the top 20, which raises R@20, but introduces plausible semantic substitutes that are not always ranked inside the first ten positions.
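
A toy example makes the cutoff effect concrete. The sketch below assumes R@K and N@K denote Recall@K and NDCG@K with binary relevance, using their standard textbook definitions; the repository's evaluation code may differ in details. The ranking is invented for illustration: one relevant item sits at rank 3 and another at rank 15, so the second counts toward the cutoff-20 metrics but not the cutoff-10 ones.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the top-k of the ranking."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal gain."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal = sum(1.0 / math.log2(pos + 2)
                for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


# Invented ranking: relevant item "a" at rank 3, relevant item "b" at rank 15.
ranked = [f"x{i}" for i in range(20)]
ranked[2] = "a"
ranked[14] = "b"
relevant = {"a", "b"}

for k in (10, 20):
    print(k, recall_at_k(ranked, relevant, k), round(ndcg_at_k(ranked, relevant, k), 4))
```

Here recall doubles between cutoff 10 and cutoff 20, while the cutoff-10 metrics see only the first hit, which mirrors the pattern described for Amazon.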