MIND: A Large-scale Dataset for News Recommendation
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
citing papers explorer
- HORIZON: A Benchmark for In-the-wild User Behaviour Modeling
  HORIZON creates a cross-domain, long-horizon user modeling benchmark from Amazon Reviews that tests generalization across time, domains, and unseen users, exposing gaps in sequential and LLM-based recommendation models.
- Mirroring Users: Towards Building Preference-aligned User Simulator with User Feedback in Recommendation
  A two-phase data construction framework generates explanatory rationales from user feedback and applies uncertainty-based distillation to fine-tune lightweight LLMs as preference-aligned user simulators for recommender systems.
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning
  Data-CUBE applies a two-level curriculum (TSP-based task ordering via simulated annealing plus difficulty-sorted mini-batches) to multi-task instruction tuning and reports gains on MTEB sentence representation tasks.
- Towards General Text Embeddings with Multi-stage Contrastive Learning
  GTE_base is a compact text embedding model trained with multi-stage contrastive learning on diverse data. It outperforms OpenAI's embedding API and text embedding models 10x its size on the Massive Text Embedding Benchmark, and it handles code retrieval by treating code as text.