pith. machine review for the scientific record.

arxiv: 2510.25426 · v2 · submitted 2025-10-29 · 💻 cs.CL · cs.AI

Implicature in Interaction: Understanding Implicature Improves Alignment in Human-LLM Interaction

Pith reviewed 2026-05-18 02:55 UTC · model grok-4.3

classification 💻 cs.CL cs.AI
keywords implicature · human-LLM alignment · prompt design · contextual intent · response quality · user preference · linguistic context · AI interaction

The pith

Prompts that embed implicature lead to LLM responses preferred by users 67.6 percent of the time over literal ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that human-LLM alignment improves when prompts use implicature to convey intent through shared context rather than direct statements. Larger models already infer such intent reasonably well, yet adding implicature to prompts raises perceived relevance and quality for every model size, with the clearest lift in smaller models. In user tests, participants chose the resulting responses over literal-prompt responses in more than two-thirds of trials, indicating that contextually grounded language makes interactions feel more natural.

Core claim

The study shows that LLMs can infer user intent from context-driven prompts that rely on implicature and that responses produced from implicature-embedded prompts are rated higher in relevance and quality. Larger models track human interpretations of implicature more closely while smaller models gain the most from the added context; across all tested models, 67.6 percent of participants preferred the implicature-based responses.

What carries the argument

Implicature (meaning conveyed beyond explicit statements through shared context) functions as the mechanism that lets prompts carry implicit user intent into the LLM's response generation process.

If this is right

  • Smaller models can deliver noticeably more relevant answers once prompts include implicature.
  • Users consistently favor contextually nuanced responses over strictly literal ones in human-LLM exchanges.
  • Linguistic devices such as implicature offer a direct route to better alignment without requiring larger models.
  • Response quality rises when prompts draw on shared context instead of spelling out every detail.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same prompt technique could be tested in extended conversations to check whether alignment holds over multiple turns.
  • Pairing implicature with other pragmatic cues might produce further gains in task-oriented settings.
  • Production systems could adopt implicature prompts to raise user satisfaction while keeping model size fixed.

Load-bearing premise

That the prompts used in the study genuinely represent implicature, and that participant preferences measure real gains in alignment rather than superficial differences in wording.

What would settle it

A follow-up test that measures objective success at completing user-specified tasks when responses come from implicature prompts versus literal prompts, rather than relying on preference ratings.
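One way such a follow-up could be analysed, sketched in Python: compare the task-completion rate under implicature prompts with the rate under literal prompts using a two-proportion z-test. The function name and the counts below are hypothetical placeholders, not data from the paper, and the z-test is a choice made here for illustration rather than an analysis the authors describe.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two independent proportions.

    Returns (difference in rates, z statistic, two-sided p-value).
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both conditions share one rate.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# HYPOTHETICAL counts: tasks completed out of tasks attempted per condition.
diff, z, p = two_proportion_z_test(
    success_a=78, n_a=100,  # implicature-embedded prompts
    success_b=64, n_b=100,  # literal prompts
)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.3f}")
```

A task-success measure of this kind would not depend on subjective ratings, which is the point of the proposed test; the sketch assumes each task outcome is independent of the others.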

Figures

Figures reproduced from arXiv: 2510.25426 by Asutosh Hota, Jussi P. P. Jokinen.

Figure 1
Figure 1: Implicature in interaction. (a) Humans are adept at inferring implicatures during discussion. (b) Our experiment is designed to evaluate how large language models (LLMs) handle implicature-embedded prompts by asking humans to rate the responses based on their perceived relevance, quality, and preference. (c) In Experiment 1, larger models demonstrate a better understanding of conversational implicatures, …
Figure 2
Figure 2: Model accuracy relative to human baseline.
    Model       Category     R²
    GPT 4o      Information  0.56
    GPT 4o      Direction    0.67
    GPT 4o      Expressing   0.82
    Llama 2     Information  0.34
    Llama 2     Direction    0.04
    Llama 2     Expressing   0.29
    GPT 4       Information  0.78
    GPT 4       Direction    0.68
    GPT 4       Expressing   0.95
    GPT 3.5     Information  0.12
    GPT 3.5     Direction    0.09
    GPT 3.5     Expressing   0.01
    Mistral 7B  Information  0.23
    Mistral 7B  Direction    0.02
    Mistral 7B  Expressing   0.81
Figure 3
Figure 3: Effects of model, intervention, and class on perceived relevance.
    Effect                    F-value  p-value
    Intervention              19.57    1.11e-05
    Model                     44.74    2.2e-16
    Class                     26.75    6.01e-12
    Intervention:Model        1.28     0.276
    Intervention:Class        0.56     0.570
    Model:Class               2.07     0.036
    Intervention:Model:Class  1.95     0.050
Figure 4
Figure 4: Effects of model, intervention, and class on perceived quality.
    … prompts yielded higher ratings across most conditions. For relevance, implicature prompts were consistently rated higher, particularly in direction-seeking and expressive tasks. The Model:Class interaction was statistically significant (F(8) = 2.07, p = 0.036), suggesting that certain models performed differently depending on the type of impli…
Figure 5
Figure 5: Caption for Experiment 3
    5 Discussion
    5.1 Experiment 1: Implicature Interpretation and Model-Human Alignment
    The first experiment benchmarked five leading large language models (LLMs) against human performance in interpreting conversational implicatures. Our findings reveal a clear stratification in LLM capability: larger models such as GPT-4o and GPT-4 demonstrated strong alignment with human judgment, a…
Figure 6
Figure 6: Instructions provided to participants for Experiment 2 (perceived relevance and quality task). Participants were asked to read a prompt and an LLM-generated response, and then rate the response on both relevance and quality using a 5-point Likert scale.
    "Information Seeking": system_prompt + " The user is seeking information, facts, or knowledge. Your goal is to provide relevant data or insights in respons…
Figure 7
Figure 7: Example prompt and response pair from Experiment 2. Participants rated the relevance (alignment with user intent) and quality (clarity, coherence, usefulness) of the LLM response.
Figure 8
Figure 8: Instructions provided to participants for Experiment 3 (preference task). Participants compared two LLM-generated responses to the same prompt and selected their preferred response.
Figure 9
Figure 9: Example trial from Experiment 3, showing a prompt and two candidate responses. Participants chose which response they preferred based on perceived quality and relevance.
Figure 10
Figure 10: Sample Information Seeking prompt: Comparison between responses generated with an implicature embedded prompt (a) and a standard prompt (b). Response (a) is more tailored and contextually relevant, which was preferred by users in the study.
Figure 11
Figure 11: Sample Expressive prompt: Comparison of responses to expressive prompts. The implicature embedded response (a) provides an empathetic and human-like interpretation, while the standard response (b) focuses on restating context. Users preferred (a) for its warmth and naturalness.
Figure 12
Figure 12: Sample Direction Seeking prompt: Comparison of responses for direction-seeking prompts. The implicature embedded response (a) provides clear, structured guidance, while the standard response (b) is less targeted. Users favored (a) for its clarity and practical usefulness.
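
The numbers reported in Figures 2 and 3 can be tabulated to make the stratification easier to see. The Python sketch below transcribes those values as given; averaging R² across the three implicature classes is an aggregation introduced here for illustration, not one the paper reports, and the 0.05 threshold is the conventional default rather than a stated choice of the authors.

```python
# R² against the human baseline, transcribed from Figure 2
# (model -> implicature class -> R²).
R2 = {
    "GPT 4o": {"Information": 0.56, "Direction": 0.67, "Expressing": 0.82},
    "Llama 2": {"Information": 0.34, "Direction": 0.04, "Expressing": 0.29},
    "GPT 4": {"Information": 0.78, "Direction": 0.68, "Expressing": 0.95},
    "GPT 3.5": {"Information": 0.12, "Direction": 0.09, "Expressing": 0.01},
    "Mistral 7B": {"Information": 0.23, "Direction": 0.02, "Expressing": 0.81},
}

# ANOVA on perceived relevance, transcribed from Figure 3
# (effect -> (F-value, p-value)).
ANOVA = {
    "Intervention": (19.57, 1.11e-05),
    "Model": (44.74, 2.2e-16),
    "Class": (26.75, 6.01e-12),
    "Intervention:Model": (1.28, 0.276),
    "Intervention:Class": (0.56, 0.570),
    "Model:Class": (2.07, 0.036),
    "Intervention:Model:Class": (1.95, 0.050),
}


def mean_r2_by_model(table):
    """Unweighted mean R² per model (an illustrative aggregate only)."""
    return {model: sum(by_class.values()) / len(by_class)
            for model, by_class in table.items()}


def significant_effects(table, alpha=0.05):
    """Effects whose reported p-value is strictly below alpha."""
    return [effect for effect, (_, p) in table.items() if p < alpha]


ranking = sorted(mean_r2_by_model(R2).items(),
                 key=lambda item: item[1], reverse=True)
for model, mean in ranking:
    print(f"{model:<11} mean R² = {mean:.2f}")

# The three-way interaction is reported as p = 0.050, a rounded value,
# so a strict comparison against 0.05 cannot resolve it either way.
print(significant_effects(ANOVA))
```

Under these assumptions the ordering by mean R² is GPT 4, GPT 4o, Mistral 7B, Llama 2, GPT 3.5, and the three-way interaction sits exactly on the threshold as rounded, so the strict comparison does not count it as significant.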
Original abstract

The rapid advancement of Large Language Models (LLMs) is positioning language at the core of human-computer interaction (HCI). We argue that advancing HCI requires attention to the linguistic foundations of interaction, particularly implicature (meaning conveyed beyond explicit statements through shared context) which is essential for human-AI (HAI) alignment. This study examines LLMs' ability to infer user intent embedded in context-driven prompts and whether understanding implicature improves response generation. Results show that larger models approximate human interpretations more closely, while smaller models struggle with implicature inference. Furthermore, implicature-based prompts significantly enhance the perceived relevance and quality of responses across models, with notable gains in smaller models. Overall, 67.6% of participants preferred responses with implicature-embedded prompts to literal ones, highlighting a clear preference for contextually nuanced communication. Our work contributes to understanding how linguistic theory can be used to address the alignment problem by making HAI interaction more natural and contextually grounded.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript argues that Gricean implicature is key to human-LLM alignment in HCI. It claims larger models more closely approximate human implicature inference than smaller ones, and that prompts embedding implicature yield responses preferred by 67.6% of participants over literal prompts, with larger gains for smaller models.

Significance. If the empirical results hold under scrutiny, the work offers a concrete linguistic mechanism for improving prompt effectiveness and alignment, particularly for resource-constrained models. It provides a falsifiable link between pragmatic theory and measurable user preference that could guide both prompt engineering and evaluation protocols.

major comments (2)
  1. [Abstract] Abstract: the headline 67.6% preference figure is reported without participant count, statistical tests, confidence intervals, or controls for prompt length, lexical richness, or response verbosity; this information is load-bearing for the central claim that implicature (rather than any richer prompt) drives the preference.
  2. [Results] Results / Experimental setup: no concrete literal vs. implicature prompt pairs are exhibited, nor is there inter-annotator validation or annotation protocol showing that the added material reliably triggers a specific Gricean implicature rather than generic contextual enrichment; without this, the operationalization of the independent variable remains unverified.
minor comments (2)
  1. [Abstract] The abstract contains several long sentences that could be split to improve readability.
  2. Notation for model sizes and preference percentages should be defined on first use rather than assumed from context.
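
To make major comment 1 concrete: the same 67.6% figure carries very different uncertainty depending on the number of trials behind it. A minimal Python sketch, assuming independent binary choices; the trial counts are hypothetical, since the participant count is not stated in the abstract, and the Wilson interval and exact binomial test are choices made here, not the paper's reported analysis.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def exact_binomial_p(successes, n):
    """Exact two-sided test of H0: preference rate = 0.5.

    Under H0 the distribution is symmetric, so the two-sided p-value is
    twice the tail probability at or beyond the more extreme count.
    """
    k = max(successes, n - successes)
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


PREFERENCE_RATE = 0.676  # headline figure from the paper

results = {}
for n_trials in (50, 100, 500):  # HYPOTHETICAL trial counts
    k = round(PREFERENCE_RATE * n_trials)
    low, high = wilson_interval(k, n_trials)
    p_value = exact_binomial_p(k, n_trials)
    results[n_trials] = (k, (low, high), p_value)
    print(f"n={n_trials:>3}: {k} preferred, "
          f"95% CI [{low:.3f}, {high:.3f}], p={p_value:.2g}")
```

The interval narrows as the hypothetical trial count grows, which is why the referee treats the missing count as load-bearing. If participants each completed several trials, the independence assumption would not hold and a model that accounts for repeated measures would be more appropriate.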

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback, which identifies key areas where additional detail will strengthen the manuscript. We address each major comment below and commit to revisions that directly respond to the concerns raised.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the headline 67.6% preference figure is reported without participant count, statistical tests, confidence intervals, or controls for prompt length, lexical richness, or response verbosity; this information is load-bearing for the central claim that implicature (rather than any richer prompt) drives the preference.

    Authors: We agree that the abstract would be improved by including these supporting details. In the revised version we will report the participant count, the results of the relevant statistical tests, confidence intervals, and a concise statement of the controls applied for prompt length and response verbosity. These elements are already present in the full experimental results and will now be summarized in the abstract to make the central claim more robust. revision: yes

  2. Referee: [Results] Results / Experimental setup: no concrete literal vs. implicature prompt pairs are exhibited, nor is there inter-annotator validation or annotation protocol showing that the added material reliably triggers a specific Gricean implicature rather than generic contextual enrichment; without this, the operationalization of the independent variable remains unverified.

    Authors: We accept that explicit examples and validation details are necessary for full transparency. The revised manuscript will include concrete literal-versus-implicature prompt pairs drawn from the study. We will also add a description of the prompt-construction and annotation protocol, including any inter-annotator agreement measures used to confirm that the added material targets specific Gricean implicatures rather than generic enrichment. revision: yes
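
Inter-annotator agreement of the kind promised here is often summarised with Cohen's kappa. The Python sketch below is illustrative only: the label names and the two annotators' judgments are hypothetical, and the rebuttal does not say which agreement measure the authors would use.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL judgments on ten prompts: does the added material trigger
# a specific implicature, or is it only generic contextual enrichment?
annotator_1 = ["implicature"] * 6 + ["generic"] * 4
annotator_2 = ["implicature"] * 5 + ["generic"] * 4 + ["implicature"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so it is more informative than a simple percentage when one label dominates.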

Circularity Check

0 steps flagged

No circularity: empirical preference study with direct comparisons

full rationale

The paper reports an empirical user study and model evaluation comparing literal vs. implicature-embedded prompts. The headline result (67.6% preference) is obtained from participant choices between response pairs. No equations, fitted parameters presented as predictions, self-citation of uniqueness theorems, or ansatz smuggling appear in the abstract or described methodology. The derivation is observational and does not reduce to its inputs by construction; the experimental design (prompt construction and preference collection) stands independently of the claimed linguistic mechanism.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on empirical observations from a user study and linguistic theory of implicature, with no new free parameters or invented entities introduced.

axioms (1)
  • domain assumption: Standard assumptions in human-computer interaction studies regarding participant judgment of response quality.
    The paper relies on user preferences as a proxy for alignment without questioning the validity of subjective ratings.

pith-pipeline@v0.9.0 · 5702 in / 1035 out tokens · 37170 ms · 2026-05-18T02:55:34.749278+00:00 · methodology


Appendix

Listing 1: System prompt for LLM classification of implicature in Experiment 1. This prompt fr...

  • Information Seeking: Asking for information, facts, or knowledge from others. The primary goal is to obtain necessary data or insights. For example, "What is the weather report for the next week?"
  • Direction Seeking: Asking for instructions or directions to perform a specific task or action. It often involves commands, instructions, or requests, leading to an action. For instance, seeking instructions to complete an assignment
  • Expressing: Communicating feelings, emotions, opinions, or attitudes. The focus is on sharing one’s personal state rather than expecting information or action. For example, saying "I’m really happy about the results" expresses one’s feelings.

Your task is to read the message as Person B and select the implication class (Information Seeking, Direction Seek...