LocalSUG: City-Preference-Enhanced LLM for Query Suggestion in Local-Life Services

Haibo Zhou; Hainan Zhang; Jinwen Chen; Lingxiang Wang; Shiwen Zhang; Shuai Gong; Wei Lin; Yachao Zhao; Zheng Zhang

arxiv: 2603.04946 · v2 · pith:3MH6UTCNnew · submitted 2026-03-05 · 💻 cs.CL



keywords: local-life, localsug, online, query, services, suggestion, city-preference-enhanced, deployment


In local-life service platforms, query suggestion reduces user effort by generating candidate queries from input prefixes. Traditional multi-stage systems rely heavily on historical popular queries, limiting their ability to capture long-tail and emerging demand. Although LLMs provide strong semantic generalization, their deployment in local-life services faces three challenges: insufficient city-preference awareness, exposure bias in preference optimization, and strict online latency constraints. We propose LocalSUG, an LLM-based query suggestion framework for local-life services. LocalSUG mines city-preference-enhanced candidates from term co-occurrence and injects them into prompts as dynamic references rather than fusing them into model parameters. This allows the model to adapt to changing city preferences, such as merchant openings or closures, while reducing stale or locally invalid suggestions. We further introduce a beam-search-driven GRPO algorithm to align training with inference-time decoding and optimize relevance together with business-oriented rewards. Finally, quality-aware beam acceleration and vocabulary pruning reduce online latency while preserving generation quality. Offline evaluations and large-scale online A/B testing show that LocalSUG improves CTR by +0.35% and reduces the low/no-result rate by 3.98%, demonstrating its effectiveness in real-world deployment.
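The idea of mining city-specific candidates from term co-occurrence and passing them to the model as prompt references, rather than training them into the weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the log format, the function names (`build_cooccurrence`, `city_candidates`, `build_prompt`), the whitespace tokenization, and the prompt wording are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_cooccurrence(logs):
    """Index queries by city and term (hypothetical log format: (city, query) pairs).

    Returns a mapping city -> term -> Counter of full queries containing that term.
    """
    index = defaultdict(lambda: defaultdict(Counter))
    for city, query in logs:
        # Assumption: whitespace tokenization; a real system would use a proper tokenizer.
        for term in set(query.split()):
            index[city][term][query] += 1
    return index


def city_candidates(index, city, prefix, k=3):
    """Return up to k queries from this city whose terms start with the prefix."""
    scores = Counter()
    for term, queries in index.get(city, {}).items():
        if term.startswith(prefix):
            scores.update(queries)  # adds the per-query counts
    return [query for query, _ in scores.most_common(k)]


def build_prompt(city, prefix, candidates):
    """Place the mined candidates in the prompt as references (illustrative wording)."""
    refs = ", ".join(candidates) if candidates else "(none)"
    return (
        f"City: {city}\n"
        f"Prefix: {prefix}\n"
        f"Reference queries: {refs}\n"
        "Suggest queries:"
    )
```

Because the references live in the prompt, refreshing the index (for example after a merchant closes) changes the suggestions the model sees without retraining it, which is the property the abstract attributes to dynamic references.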


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

OnePred: Next-Query Prediction via Recursive Intent Memory in Multi-Turn Conversations
cs.CL · 2026-05 · unverdicted · novelty 6.0

OnePred maintains a recursively updated intent memory and uses two-stage RL to predict next queries, cutting token use by up to 22x while outperforming baselines on a new NQP-Bench dataset.