
arxiv: 1906.09450 · v1 · pith:CBC3SHNRnew · submitted 2019-06-22 · 💻 cs.CL · cs.IR

Semantically Driven Auto-completion

Pith reviewed 2026-05-25 18:07 UTC · model grok-4.3

classification 💻 cs.CL cs.IR
keywords auto-completion · semantic parsing · natural language interfaces · question answering · query formulation · financial data

The pith

Auto-completion for natural language queries can be guided by semantic parsing systems that interpret the intended meaning of the query.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to build auto-completion tools for complex natural language interfaces, such as those used to query financial data and analytics. These tools are driven directly by semantic parsers that interpret the partial query and suggest completions that remain meaningful. The approach addresses usability problems that arise when users interact with question-answering systems over large, mixed data sources. Novel algorithms are presented that keep the process fast enough for real-time use while preserving the quality of the resulting suggestions.
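
The Approach and Experimental Results excerpts further down mention that several completion algorithms run concurrently and hand their results to a top-level algorithm that picks the best 10. The following is a minimal sketch of that shape only, not the authors' implementation. The two completers, their scores, and the merge rule (keep the highest score per completion) are hypothetical; the paper's grading and diversification logic is not visible in the excerpts and is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_completers(prefix, completers, limit=10):
    """Run every completer on the prefix in parallel, then merge and truncate."""
    with ThreadPoolExecutor(max_workers=len(completers)) as pool:
        batches = list(pool.map(lambda completer: completer(prefix), completers))

    # Assumed merge rule: keep the best score seen for each completion string.
    best = {}
    for batch in batches:
        for text, score in batch:
            if score > best.get(text, float("-inf")):
                best[text] = score

    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [text for text, _ in ranked[:limit]]


# Hypothetical completers returning (completion, score) pairs.
def atom_completer(prefix):
    atoms = ["ibm bonds", "ibm bonds maturing in 2025"]
    return [(a, 0.9) for a in atoms if a.startswith(prefix)]


def template_completer(prefix):
    templates = ["ibm bonds issued in usd", "ibm bonds"]
    return [(t, 0.5) for t in templates if t.startswith(prefix)]


merged = run_completers("ibm b", [atom_completer, template_completer])
print(merged)
```

Threads are used here only because they are the simplest standard-library way to show independent algorithms feeding one merge step; the excerpts do not say how concurrency is realized in the actual system.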

Core claim

Auto-complete systems can be based on and guided by corresponding semantic parsing systems, which solve the auto-complete problem for natural language query formulation by producing relevant, parseable suggestions efficiently.

What carries the argument

Semantically guided auto-completion, in which a semantic parser directs the generation and ranking of completion candidates in real time.
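
To make "a semantic parser directs the generation and ranking of completion candidates" concrete, here is a toy sketch under stated assumptions. The lexicon, the field names, the greedy phrase-matching parser, and the ranking rule (fewer atoms first) are all invented for illustration; the paper states that its discussion is agnostic about the parsing technology.

```python
# Hypothetical lexicon mapping surface phrases to logical atoms.
LEXICON = {
    "chinese": "COUNTRY_OF_RISK = CHINA",
    "investment grade": "RATING = INVESTMENT_GRADE",
    "bonds": "ASSET_CLASS = BOND",
}


def toy_parse(text):
    """Return a list of atoms if every word is covered by a lexicon phrase, else None."""
    words = text.split()
    atoms, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest phrase first
            phrase = " ".join(words[i:j])
            if phrase in LEXICON:
                atoms.append(LEXICON[phrase])
                i = j
                break
        else:
            return None  # some word is not understood, so the text does not parse
    return atoms


def complete(prefix, candidates, parse=toy_parse):
    """Keep candidates that extend the prefix and parse; simpler forms rank first."""
    kept = []
    for cand in candidates:
        if not cand.startswith(prefix):
            continue
        form = parse(cand)
        if form is not None:
            kept.append((len(form), cand))
    return [cand for _, cand in sorted(kept)]


candidates = [
    "chinese bonds",
    "chinese investment grade bonds",
    "chinese bon voyage",      # extends the prefix but does not parse
    "investment grade bonds",  # parses but does not extend the prefix
]
suggestions = complete("chinese", candidates)
print(suggestions)
```

The point of the sketch is the filter: a suggestion is offered only if the parser can map it to a logical form, which is the property the pith calls "parseable".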

Load-bearing premise

Semantic parsing systems can be run quickly enough on partial queries to guide auto-completion without adding noticeable delays or producing unhelpful suggestions.
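
One way a parser can be applied to a partial query is described in the Approach excerpt in the reference graph below: the prefix is split into an initial segment that the parser understands and an unrecognized remainder, and the remainder is matched against an atom trie. The sketch below illustrates that split and lookup. The `understands` predicate, the atom strings, and the trie layout are assumptions; the scoring of atoms relative to the initial segment is omitted because the excerpt is truncated before it is defined.

```python
class AtomTrie:
    """Character trie over atom surface strings (hypothetical layout)."""

    def __init__(self, atoms=()):
        self.root = {}
        for atom in atoms:
            self.insert(atom)

    def insert(self, atom):
        node = self.root
        for ch in atom:
            node = node.setdefault(ch, {})
        node[None] = atom  # the None key marks the end of a stored atom

    def with_prefix(self, prefix):
        """Return all stored atoms that start with the given prefix, sorted."""
        node = self.root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            for key, child in current.items():
                if key is None:
                    found.append(child)
                else:
                    stack.append(child)
        return sorted(found)


def split_prefix(prefix, understands):
    """Split into the longest understood initial segment and the remainder."""
    words = prefix.split()
    for m in range(len(words), -1, -1):
        head = " ".join(words[:m])
        if m == 0 or understands(head):
            return head, " ".join(words[m:])


# Stand-in for the semantic parser: a fixed set of understood segments.
known_segments = {"ibm", "ibm bonds"}
trie = AtomTrie(["maturing in 2025", "matured", "issued in usd"])

head, rest = split_prefix("ibm bonds mat", known_segments.__contains__)
matches = trie.with_prefix(rest)
print(head, "|", rest, "|", matches)
```

A trie lookup costs time proportional to the remainder length plus the number of matches, which is one reason such a structure is plausible for keystroke-level latency; whether the real system meets its latency budget is exactly what the premise above leaves open.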

What would settle it

A controlled comparison that measures query completion time and success rate for users of the Bloomberg Terminal when semantic guidance is present versus when it is replaced by a non-semantic baseline.

Figures

Figures reproduced from arXiv: 1906.09450 by Konstantine Arkoudas, Mohamed Yahya.

Figure 1: Semantic auto-completion in action.
… encoded in the semantic parser, it is possible, with the aid of appropriate lexicons and certain statistics, to generate large sets of queries synthetically, which can then be used as if they were user queries. Synthetically generated queries will never be as good as the genuine article, but when carefully prepared they can be very helpful. In this paper we report on our…
Figure 2: A bonds query with its interpretation and correspo…
We do not have space here to say much about our semantic parsing technology, but it is important to note that our discussion in this paper is agnostic on that point. The semantic parsers could be based on CCGs and machine learning, or on PCFGs and first-order or higher-order logic (with or without machine learning), or on parser combinators, or even on a purely deep learning pipeline. The only requirem…
Figure 3: Example of a full query template. The root of the tem…
Original abstract

The Bloomberg Terminal has been a leading source of financial data and analytics for over 30 years. Through its thousands of functions, the Terminal allows its users to query and run analytics over a large array of data sources, including structured, semi-structured, and unstructured data; as well as plot charts, set up event-driven alerts and triggers, create interactive maps, exchange information via instant and email-style messages, and so on. To improve user experience, we have been building question answering systems that can understand a wide range of natural language constructions for various domains that are of fundamental interest to our users. Such natural language interfaces, while exceedingly helpful to users, introduce a number of usability challenges of their own. We tackle some of these challenges through auto-completion for query formulation. A distinguishing mark of our auto-complete systems is that they are based on and guided by corresponding semantic parsing systems. We describe the auto-complete problem as it arises in this setting, the novel algorithms that we use to solve it, and report on the quality of the results and the efficiency of our approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that auto-completion for natural language query formulation in the Bloomberg Terminal can be effectively guided by corresponding semantic parsing systems. It describes the problem setting in this domain, presents novel algorithms for semantically driven auto-completion, and reports on the quality of results and efficiency of the approach.

Significance. If the central claim holds, the work could improve usability of semantic parsing-based QA systems in professional financial analytics by providing real-time guidance that reduces query formulation errors, representing a practical integration of semantic parsing into interactive interfaces with potential for deployment in production tools.

major comments (2)
  1. [Abstract] Abstract: the abstract states that the systems are 'based on and guided by corresponding semantic parsing systems' and that novel algorithms are used, but supplies no equations, pseudocode, complexity analysis, evaluation metrics, or error analysis, preventing assessment of whether the distinguishing claim is supported by evidence.
  2. [Evaluation section] Evaluation (assumed §5 or equivalent): without reported metrics on latency overhead, error propagation from the semantic parser, or comparisons to non-semantically-guided baselines, the assumption that semantic guidance introduces no significant performance or usability issues cannot be verified and is load-bearing for the practical contribution.
minor comments (2)
  1. Add explicit discussion of coverage limitations of the underlying semantic parser and how they affect auto-completion suggestions.
  2. Clarify notation for any semantic representations used in the algorithms to improve readability for readers outside the specific system.
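
For readers outside the system, the kind of semantic representation at issue can be sketched from the bonds query shown in the paper's Figure 2 (reproduced in garbled form in the reference graph below). The class names are hypothetical, and the underscores in the identifiers are an assumption: the extracted text shows them with spaces, for example "COUNTRY OF RISK".

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Eq:
    """Atomic formula: a field equals a value."""
    field: str
    value: str


@dataclass(frozen=True)
class Not:
    """Negation of a formula."""
    arg: "Formula"


@dataclass(frozen=True)
class And:
    """Conjunction of formulas."""
    args: Tuple["Formula", ...]


Formula = Union[Eq, Not, And]


def render(formula):
    """Render a formula in the infix style used in the extracted figure text."""
    if isinstance(formula, Eq):
        return f"{formula.field} = {formula.value}"
    if isinstance(formula, Not):
        return f"NOT({render(formula.arg)})"
    if isinstance(formula, And):
        return " AND ".join(f"({render(a)})" for a in formula.args)
    raise TypeError(f"unsupported formula: {formula!r}")


# q = "chinese non-tech bonds maturing in three years"
phi = And((
    Eq("COUNTRY_OF_RISK", "CHINA"),
    Not(Eq("SECTOR", "SEC_TECH")),
    Eq("MATURITY_DATE", "RELATIVE_TIME(3,YEAR,NOW)"),
))
print(render(phi))
```

Each conjunct corresponds to one fragment of the query ("chinese", "non-tech", "maturing in three years"), which is the alignment the figure's derivation appears to depict.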

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate planned revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the abstract states that the systems are 'based on and guided by corresponding semantic parsing systems' and that novel algorithms are used, but supplies no equations, pseudocode, complexity analysis, evaluation metrics, or error analysis, preventing assessment of whether the distinguishing claim is supported by evidence.

    Authors: The abstract is intentionally high-level and concise per standard conventions; technical details including algorithms, pseudocode, complexity analysis, metrics, and error analysis appear in the body (methods and evaluation sections). To better foreground the distinguishing claims within the abstract itself, we will revise it to reference key evaluation metrics and efficiency results. revision: yes

  2. Referee: [Evaluation section] Evaluation (assumed §5 or equivalent): without reported metrics on latency overhead, error propagation from the semantic parser, or comparisons to non-semantically-guided baselines, the assumption that semantic guidance introduces no significant performance or usability issues cannot be verified and is load-bearing for the practical contribution.

    Authors: The evaluation reports overall result quality and approach efficiency. We acknowledge the absence of explicit latency-overhead measurements, error-propagation analysis from the parser, and direct comparisons to non-semantic baselines. These additions would strengthen the practical claims, and we will include them in the revised version. revision: yes
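
The extracted Table 1 in the reference graph below reports predictiveness as MRR under three instantiations of a match function, labeled STR, BOW, and SEM. The sketch below shows how mean reciprocal rank is commonly computed with a pluggable match function. Reading STR as exact string equality and BOW as bag-of-words equality is an assumption based on the labels alone; SEM and the PARTIAL MRR variant are not defined in the visible excerpts and are left out. The test cases are invented.

```python
def str_match(candidate, intended):
    """Assumed STR instantiation: exact string equality."""
    return candidate == intended


def bow_match(candidate, intended):
    """Assumed BOW instantiation: same words, order ignored."""
    return sorted(candidate.split()) == sorted(intended.split())


def mean_reciprocal_rank(cases, match):
    """cases is a list of (intended_query, ranked_completions) pairs."""
    if not cases:
        return 0.0
    total = 0.0
    for intended, completions in cases:
        for rank, candidate in enumerate(completions, start=1):
            if match(candidate, intended):
                total += 1.0 / rank  # only the first matching rank counts
                break
    return total / len(cases)


# Invented evaluation cases, not data from the paper.
cases = [
    ("ibm bonds", ["ibm stock", "ibm bonds"]),  # match at rank 2
    ("bonds ibm", ["ibm bonds"]),               # matches only under BOW
    ("apple news", ["apple stock"]),            # no match
]
mrr_str = mean_reciprocal_rank(cases, str_match)
mrr_bow = mean_reciprocal_rank(cases, bow_match)
print(round(mrr_str, 3), round(mrr_bow, 3))
```

A looser match function can only raise the score on the same ranked lists, which is consistent with the ordering STR < BOW < SEM visible in the extracted table.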

Circularity Check

0 steps flagged

No significant circularity detected

Full rationale

The provided abstract and description contain no equations, derivations, fitted parameters, or self-citation chains. The central claim that auto-completion is guided by semantic parsing systems is presented descriptively as a distinguishing feature of the authors' work, without any reduction of a 'prediction' or result to an input by construction, renaming of known results, or load-bearing uniqueness theorems imported from prior self-authored work. No load-bearing steps match the enumerated circularity patterns, and the paper's content remains self-contained against external benchmarks with no internal self-referential forcing of outcomes.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms, or invented entities; evaluation is limited to surface-level description.

pith-pipeline@v0.9.0 · 5705 in / 908 out tokens · 27608 ms · 2026-05-25T18:07:35.582380+00:00 · methodology



Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 1 internal anchor

  1. [1]

    INTRODUCTION. The Bloomberg Professional Service, popularly known as the Terminal, has been a leading source of financial data, analytics, and insights for over 30 years. Customers use it to query a wide variety of structured, semi-structured, and unstructured sources, create alerts, plot charts, draw maps, compute statistics, etc. For most of its history,...

  2. [2]

    BACKGROUND: SEMANTIC PARSING. Semantic parsers map natural language utterances into logical forms that capture their meaning [14]. This is done in two conceptual stages. The first is a parsing analysis, whereby a sentence is mapped to all interpretations that can be derived from it, reflecting the lexical and syntactic ambiguity of natural language. The s...

  3. [3]

    PROBLEM STATEMENT. We now outline a set of properties that should be satisfied by AC systems designed to improve the usability of semantics-based QA technology. The problem we address in this paper is building AC systems that satisfy these properties. As the QA and corresponding AC systems should ideally be released together, a major challenge that we d...

  4. [4]

    The completions should be predictive of user intent; in particular, the user’s intended query should be as high up on the list of completions as possible

  5. [5]

    The completion list should be diverse: It should contain entries of different types. In the case of a QA …

    Figure 2 content (interleaved with the text by extraction):
    $q$ = “chinese non-tech bonds maturing in three years”
    $\varphi$ = (COUNTRY OF RISK = CHINA) AND NOT(SECTOR = SEC TECH) AND MATURITY DATE = RELATIVE TIME(3,YEAR,NOW)
    $D(q, \varphi)$ =
      $\varphi$
        $\varphi_1$: (COUNTRY OF RISK = CHINA) · CHINA · chinese
        $\varphi_2$: NOT(SECTOR = SEC TECH) · NOT · non · (S...

  6. [6]

    The completions should be propositional, meaning that they should have full sentential semantics: The semantic parser must fully map the completion to a formula in the underlying logic, which could be a sentential atom or a more complex formula. For example, if the partial query is investment grade bonds i, then investment grade bonds in the emerging...

  7. [7]

    The completions should be as grammatical as possible, modulo what the user has already typed. The QA system should be able to understand telegraphically formulated queries [13], but nevertheless we should strive to offer completions that are as linguistically well-formed as possible. There is tension between this requirement and completeness, which is wh...

  8. [8]

    most popular completion

    APPROACH. We now outline the high-level approach we take to solve the auto-completion problem introduced and motivated above. The approach relies on a number of different completion algorithms, each of which takes a prefix string provided by the user ($p$ in the notation of Section 3), potentially along with additional domain-dependent configuration parame...

  9. [9]

    an initial segment $i_p = w_1 \cdots w_m$ that is understood by the semantic parser and results in some semantics $\varphi_{i_p}$, where $m$ might be equal to $k$; and

  10. [10]

    ibm bonds

    the remainder of the input, $r_p = w_{m+1} \cdots w_k$, which constitutes an unrecognized segment. Assuming that the remainder is non-empty, we match it against the atom trie $T_A$, and this returns a list of atoms $L = [A_1, \ldots, A_n]$ as potential completions. We then assign a score to each atom $A_j$, relative to the initial segment $i_p$. This score can be understood a...

  11. [11]

    Negative news about Guaido

    EXPERIMENTAL RESULTS. In this section we report on quantitative experiments intended to evaluate our approach to auto-completion in domains with different characteristics. Our experiments focus on predictiveness and efficiency. Some of the other desiderata mentioned in Section 3 are guaranteed by the manner in which we compute completions: soundness, d...

  12. [12]

    guai” is “juan guaido

    to choose the best 10 (accounting for grades, the need for diversification, and so on). Completion algorithms run concurrently, and pass their results to the top-level algorithm to be merged.

    Table 1: Predictiveness

    match(q, q′)     BNDS                   NEWS
    instantiation    MRR      PARTIAL MRR   MRR      PARTIAL MRR
    STR              0.028    0.374         0.226    0.355
    BOW              0.031    0.442         0.243    0.457
    SEM              0.081    0.58...

  13. [13]

    RELATED WORK. We developed the AC framework described in this work and the corresponding QA framework in the context of the larger problem of improving the usability of information systems. These usability issues have long been recognized [12, 17]. Our focus here is on usable query interfaces. A wide array of solutions have been proposed, from visual que...

  14. [14]

    B. Aditya, G. Bhalotia, S. Chakrabarti, A. Hulgeri, C. Nakhe, Parag, and S. Sudarshan. BANKS: Browsing and Keyword Searching in Relational Databases. In VLDB, pages 1083–1086, 2002.

  15. [15]

    H. Bast and B. Buchhold. QLever: A Query Engine for Efficient SPARQL+Text Search. In CIKM, pages 647–656, 2017.

  16. [16]

    H. Bast and E. Haussmann. More Accurate Question Answering on Freebase. In CIKM, pages 299–304, 2015.

  17. [17]

    H. Bast and I. Weber. Type less, find more: fast autocompletion search with a succinct index. In SIGIR, pages 364–371, 2006.

  18. [18]

    S. Bhatia, D. Majumdar, and P. Mitra. Query suggestions in the absence of query logs. In SIGIR, pages 795–804, 2011.

  19. [19]

    S. S. Bhowmick, B. Choi, and C. E. Dyreson. Data-driven Visual Graph Query Interface Construction and Maintenance: Challenges and Opportunities. PVLDB, 9(12):984–992, 2016.

  20. [20]

    F. Cai and M. de Rijke. A Survey of Query Auto Completion in Information Retrieval. Foundations and Trends in Information Retrieval, 10(4):273–363, 2016.

  21. [21]

    F. Cai, S. Liang, and M. de Rijke. Time-sensitive Personalized Query Auto-Completion. In CIKM, pages 1599–1608, 2014.

  22. [22]

    M. G. Helander, T. K. Landauer, and P. V. Prabhu, editors. Handbook of Human-Computer Interaction. Elsevier Science Inc., New York, NY, USA, 2nd edition, 1997.

  23. [23]

    M. Horovitz, L. Lewin-Eytan, A. Libov, Y. Maarek, and A. Raviv. Mailbox-based vs. log-based query completion for mail search. In SIGIR, 2017.

  24. [24]

    G. Hutton. Higher-Order Functions for Parsing. Journal of Functional Programming, 2(3):323–343, 1992.

  25. [25]

    H. V. Jagadish, A. Chapman, A. Elkiss, M. Jayapandian, Y. Li, A. Nandi, and C. Yu. Making Database Systems Usable. In SIGMOD, pages 13–24, 2007.

  26. [26]

    M. Joshi, U. Sawant, and S. Chakrabarti. Knowledge Graph and Corpus Driven Segmentation and Answer Inference for Telegraphic Entity-seeking Queries. In EMNLP, pages 1104–1114, 2014.

  27. [27]

    A. Kamath and R. Das. A survey on semantic parsing. CoRR, abs/1812.00978, 2018.

  28. [28]

    N. Khoussainova, Y. Kwon, M. Balazinska, and D. Suciu. SnipSuggest: Context-Aware Autocompletion for SQL. PVLDB, 4(1):22–33, 2010.

  29. [29]

    G. Koutrika, A. Simitsis, and Y. E. Ioannidis. Explaining Structured Queries in Natural Language. In ICDE, pages 333–344, 2010.

  30. [30]

    F. Li and H. V. Jagadish. Usability, Databases, and HCI. IEEE Data Engineering Bulletin, 35(3):37–45, 2012.

  31. [31]

    F. Li and H. V. Jagadish. Constructing an Interactive Natural Language Interface for Relational Databases. PVLDB, 8(1):73–84, 2014.

  32. [32]

    F. Li and H. V. Jagadish. NaLIR: an Interactive Natural Language Interface for Querying Relational Databases. In SIGMOD, pages 709–712, 2014.

  33. [33]

    F. Li and H. V. Jagadish. Understanding Natural Language Queries over Relational Databases. SIGMOD Record, 45(1):6–13, 2016.

  34. [34]

    R. B. Miller. Response Time in Man-Computer Conversational Transactions. In AFIPS, pages 267–277, 1968.

  35. [35]

    B. Mitra and N. Craswell. Query Auto-Completion for Rare Prefixes. In CIKM, pages 1755–1758, 2015.

  36. [36]

    A. N. Ngomo, L. Bühmann, C. Unger, J. Lehmann, and D. Gerber. Sorry, I Don’t Speak SPARQL: Translating SPARQL Queries into Natural Language. In WWW, pages 977–988, 2013.

  37. [37]

    D. H. Park and R. Chiba. A Neural Language Model for Query Auto-Completion. In SIGIR, pages 1189–1192, 2017.

  38. [38]

    D. Savenkov and E. Agichtein. EviNets: Neural Networks for Combining Evidence Signals for Factoid Question Answering. In ACL, pages 299–304, 2017.

  39. [39]

    M. Shokouhi. Learning to Personalize Query Auto-Completion. In SIGIR, pages 103–112, 2013.

  40. [40]

    S. Wiseman, S. M. Shieber, and A. M. Rush. Learning Neural Templates for Text Generation. In EMNLP, pages 3174–3187, 2018.

  41. [41]

    W. Woods, R. Kaplan, and B. Nash-Webber. The Lunar Sciences Natural Language Information System. Technical report, BBN Inc., 1974.

  42. [42]

    W. A. Woods. Transition Network Grammars for Natural Language Analysis. Communications of the ACM, 13(10):591–606, 1970.

  43. [43]

    C. Yu and H. V. Jagadish. Querying Complex Structured Databases. In VLDB, pages 1010–1021, 2007.