SPECTRA: Revealing the Full Spectrum of User Preferences via Distributional LLM Inference
read the original abstract
Large Language Models (LLMs) are increasingly used to model user preferences, with the typical output as a directly-generated ranked item list per user. However, this generative paradigm inherits the bias and opacity of autoregressive decoding. It over-emphasizes frequent (head) preferences and suppresses minority, long-tail ones. To address this, we propose SPECTRA (Softmax Probing for Extracted Category-level Token Readouts and Analysis), which treats the finetuned LLM as an implicit probabilistic model and probes its softmax to infer a probability distribution over semantically interpretable preference categories. We evaluate SPECTRA on MovieLens, Yelp, and a large-scale short-video platform. SPECTRA delivers (i) distributional alignment, reducing Jensen-Shannon divergence to the empirical preference distribution by 38 to 44 percent across public datasets; (ii) long-tail recovery with cross-user fairness, raising top-3 category exposure entropy by 23 percent on MovieLens and producing a larger gain on tail-preference users than on head-preference users; and (iii) downstream application value, with a 41 to 46 percent category-NDCG boost on MovieLens and Yelp, and a 7x improvement on long-tail category ranking on a large-scale deployment against a head-optimized production ranker.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.