
arxiv: 2603.04946 · v2 · pith:3MH6UTCNnew · submitted 2026-03-05 · 💻 cs.CL

LocalSUG: City-Preference-Enhanced LLM for Query Suggestion in Local-Life Services

classification 💻 cs.CL
keywords local-life, localsug, online, query, services, suggestion, city-preference-enhanced, deployment

In local-life service platforms, query suggestion reduces user effort by generating candidate queries from input prefixes. Traditional multi-stage systems rely heavily on historical popular queries, limiting their ability to capture long-tail and emerging demand. Although LLMs provide strong semantic generalization, their deployment in local-life services faces three challenges: insufficient city-preference awareness, exposure bias in preference optimization, and strict online latency constraints. We propose LocalSUG, an LLM-based query suggestion framework for local-life services. LocalSUG mines city-preference-enhanced candidates from term co-occurrence and injects them into prompts as dynamic references rather than fusing them into model parameters. This allows the model to adapt to changing city preferences, such as merchant openings or closures, while reducing stale or locally invalid suggestions. We further introduce a beam-search-driven GRPO algorithm to align training with inference-time decoding and optimize relevance together with business-oriented rewards. Finally, quality-aware beam acceleration and vocabulary pruning reduce online latency while preserving generation quality. Offline evaluations and large-scale online A/B testing show that LocalSUG increases CTR by 0.35% and reduces the low/no-result rate by 3.98%, demonstrating its effectiveness in real-world deployment.
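
To make the prompt-injection idea concrete, here is a minimal Python sketch of mining per-city candidates from term co-occurrence and passing them to a model as references in the prompt instead of retraining it. It is an illustration under stated assumptions, not the paper's implementation: the function names (`build_city_index`, `city_candidates`, `build_prompt`), the whitespace tokenization, the whole-term matching, the raw-count scoring, and the prompt wording are all hypothetical.

```python
from collections import Counter, defaultdict


def build_city_index(query_log):
    """Count, for each (city, term) pair, the queries in which the term occurs.

    query_log is an iterable of (city, query) pairs. Whitespace tokenization
    is an assumption made for this sketch.
    """
    index = defaultdict(Counter)
    for city, query in query_log:
        for term in set(query.split()):
            index[(city, term)][query] += 1
    return index


def city_candidates(index, city, prefix, k=3):
    """Return up to k queries from `city` that share a whole term with `prefix`.

    Candidates are ranked by co-occurrence count, ties broken alphabetically.
    """
    scores = Counter()
    for term in set(prefix.split()):
        scores.update(index.get((city, term), Counter()))
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [query for query, _ in ranked[:k]]


def build_prompt(city, prefix, candidates):
    """Inject the mined candidates into the prompt as dynamic references."""
    refs = "\n".join(f"- {c}" for c in candidates) or "- (none)"
    return (
        f"City: {city}\n"
        f"User prefix: {prefix}\n"
        f"Reference queries popular in this city:\n{refs}\n"
        "Suggest completions for the prefix:"
    )


if __name__ == "__main__":
    # Hypothetical toy log; real logs would be far larger and refreshed often.
    log = [
        ("Chengdu", "hotpot delivery"),
        ("Chengdu", "hotpot delivery"),
        ("Chengdu", "hotpot buffet"),
        ("Beijing", "roast duck"),
    ]
    index = build_city_index(log)
    print(build_prompt("Chengdu", "hotpot", city_candidates(index, "Chengdu", "hotpot")))
```

Because the references live in the prompt, refreshing the index (for example, after a merchant closes) changes what the model sees without touching its parameters, which is the property the abstract emphasizes.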



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. OnePred: Next-Query Prediction via Recursive Intent Memory in Multi-Turn Conversations

    cs.CL 2026-05 unverdicted novelty 6.0

    OnePred maintains a recursively updated intent memory and uses two-stage RL to predict next queries, cutting token use by up to 22x while outperforming baselines on a new NQP-Bench dataset.