From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG

Changmin Lee; Jaemin Kim; Taesik Gong

arxiv: 2605.18271 · v1 · pith:UBN77HXPnew · submitted 2026-05-18 · 💻 cs.CL · cs.AI· cs.IR· cs.LG

From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG

Changmin Lee , Jaemin Kim , Taesik Gong This is my paper

Pith reviewed 2026-05-20 11:51 UTC · model grok-4.3

classification 💻 cs.CL cs.AIcs.IRcs.LG

keywords on-device RAGpreference alignmentmemory efficiencypersonal AI agentsindex constructionretrieval latencycontext managementLLM agents

0 comments

The pith

EPIC builds preference-focused indexes that cut on-device RAG memory by 2404 times while raising accuracy 20 points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces EPIC to address memory limits when running personal AI agents directly on devices. It treats user preferences as a compact signal extracted from raw personal data and applies this signal to both what gets stored and how retrieval works. The result is far less memory, faster lookups, and stronger alignment with what the user actually wants in conversations, recommendations, and similar tasks. A reader would care because this approach could make private, responsive on-device agents practical under tight hardware constraints without sending data to the cloud.

Core claim

EPIC constructs an index by selectively retaining only preference-relevant portions of raw personal data and aligns the retrieval step to favor contexts that match those preferences, producing an index that occupies orders of magnitude less memory yet delivers higher preference-following accuracy and lower latency than volume-based baselines.

What carries the argument

EPIC, the preference-aligned index construction process that extracts user preferences from raw data and uses them to guide both selective retention during indexing and preference-directed ranking during retrieval.

If this is right

Indexing memory drops by a factor of 2,404 relative to the strongest baseline while staying under 1 MB.
Preference-following accuracy rises by 20.17 percentage points across conversation, debate, explanation, and recommendation benchmarks.
Retrieval latency falls by a factor of 33.33, reaching 29.35 ms per query on device.
Streaming updates remain feasible without exceeding the same tight memory budget.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same preference-compression logic could support multi-year personal histories on phones or watches without linear memory growth.
If preferences prove stable across time, re-indexing frequency could drop, lowering long-term compute cost on the device.
Extending the alignment step to handle evolving or conflicting preferences would be a direct next test of robustness.

Load-bearing premise

User preferences form a compact, stable, and reliably extractable signal from raw personal data that can guide indexing and retrieval without losing essential context.

What would settle it

A set of real-user queries whose correct answers depend on specific non-preference facts from the raw data, where EPIC retrieves lower-accuracy or irrelevant passages compared with a full-volume baseline.

Figures

Figures reproduced from arXiv: 2605.18271 by Changmin Lee, Jaemin Kim, Taesik Gong.

**Figure 1.** Figure 1: Prior Method indiscriminately stores raw data, which is infeasible under tight on-device memory budgets and can yield preference-misaligned responses (left). EPIC instead retains only preference-relevant data with aligned instructions, enabling efficient retrieval and preference-aligned responses (right). Example from the PrefWiki dataset. et al., 2024; Li et al., 2024a). Prior studies on assistant usage… view at source ↗

**Figure 2.** Figure 2: Overview of EPIC’s pipeline. (i) Semantic-Based Coarse Filtering (Sec. 3.1): documents from a large corpus are first encoded and compared with user preference embeddings; only those with at least one preference-aligned match pass this stage. (ii) Preference-Aligned Fine Verification (Sec. 3.2): the Decision Module verifies textual alignment and discards unrelated documents, while the Instruction Generator … view at source ↗

**Figure 3.** Figure 3: Efficiency comparison across baselines. We report on-disk memory usage, end-to-end retrieval latency, and indexing latency (detailed results in Appendix B.6). Numbers in parentheses represent the specific values on the x-axis for each method. though query steering adds a small constant overhead, retrieval remains a single FAISS kNN search over a much smaller index. This yields consistently lower latency t… view at source ↗

**Figure 4.** Figure 4: On-device streaming data setup with random preference drift. On Jetson Orin Nano 8GB using PrefWiki, EPIC maintains higher preference-following accuracy while keeping memory nearly constant, compared to the lightweight Contriever. This indicates that instruction-centric memory construction both strengthens preference alignment and replaces bulky raw items with compact, preference-aware representations. … view at source ↗

**Figure 6.** Figure 6: Preference change events during streaming (examples). To complement [PITH_FULL_IMAGE:figures/full_fig_p021_6.png] view at source ↗

**Figure 5.** Figure 5: Streaming on-device evaluation platform. NVIDIA Jetson Orin Nano 8GB used for the streaming on-device experiments [PITH_FULL_IMAGE:figures/full_fig_p021_5.png] view at source ↗

read the original abstract

With the rapid emergence of personal AI agents based on Large Language Models (LLMs), implementing them on-device has become essential for privacy and responsiveness. To handle the inherently personal and context-dependent nature of real-world requests, such agents must ground their generation in device-resident personal context. However, under tight memory budgets, the core bottleneck is what to store so that retrieval remains aligned with the user. We propose EPIC (Efficient Preference-aligned Index Construction), which focuses on user preferences as a compact and stable form of personal context and integrates them throughout the RAG pipeline. EPIC selectively retains preference-relevant information from raw data and aligns retrieval toward preference-aligned contexts. Across four benchmarks covering conversations, debates, explanations, and recommendations, EPIC reduces indexing memory by 2,404 times, improves preference-following accuracy by 20.17 percentage points, and achieves 33.33 times lower retrieval latency over the best-performing baseline. In our on-device experiment, EPIC maintains a memory footprint under 1 MB with 29.35 ms/query latency in streaming updates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

EPIC claims 2400x memory cuts for on-device RAG by indexing only around extracted preferences, but the gains hinge on unshown extraction quality.

read the letter

Hey, quick note on the EPIC paper for on-device RAG. The main thing is they filter personal data down to user preferences to build the index, reporting 2404 times less memory, 20 points higher preference accuracy, and 33 times lower retrieval latency while staying under 1 MB with streaming updates. That is the core pitch. What is new is the end-to-end system that puts preference extraction and alignment into both indexing and retrieval stages, tested across conversations, debates, explanations, and recommendations plus a real hardware run. The practical focus on privacy and tight memory budgets for personal agents is where it adds something concrete over generic compression work. The paper does well laying out the volume-versus-value problem and giving numbers that show the trade-off can move in a useful direction if the method holds. The softer part is the extraction step itself. The big compression numbers assume preferences capture everything needed without dropping transient but relevant context, yet the abstract gives no extraction details, no accuracy metrics on that step, and no ablations against full-context baselines. Without those, it is hard to tell whether the reported gains are robust or partly tied to benchmark construction. The on-device latency claim looks promising but shares the same gap. This is aimed at people building RAG for edge devices and personal agents. A reader working on memory-constrained retrieval would pick up useful system ideas and trade-off numbers. It has enough concrete claims and a clear target to deserve a serious referee who can check the methods and extraction process. I would send it out for review rather than desk reject.

Referee Report

3 major / 2 minor

Summary. The paper proposes EPIC (Efficient Preference-aligned Index Construction), a technique that extracts user preferences from raw personal data and integrates them throughout the RAG pipeline to enable extreme memory compression for on-device personal AI agents. It evaluates the approach on four benchmarks spanning conversations, debates, explanations, and recommendations, plus an on-device streaming test, claiming a 2,404× reduction in indexing memory, a 20.17 percentage point gain in preference-following accuracy, and 33.33× lower retrieval latency relative to the strongest baseline, all while staying under 1 MB memory with 29.35 ms/query latency.

Significance. If the central claims are substantiated, the work would be significant for on-device LLM deployment, as it directly targets the memory and latency bottlenecks that currently limit privacy-preserving personal context handling. The reported compression ratios and accuracy improvements, if shown to generalize beyond the chosen benchmarks and to preserve necessary context, could influence practical system design for personal agents. The emphasis on preference signals as a compact, stable form of context is a plausible direction, though its robustness remains to be demonstrated.

major comments (3)

[§3] §3 (Method), preference extraction subsection: the paper provides no concrete description of the preference extraction procedure (model, prompting strategy, or heuristics), nor any quantitative extraction-fidelity metrics or error analysis. Because the 2,404× memory reduction and the accuracy gains rest on the assumption that only preference-relevant information is retained without discarding query-critical context, this omission is load-bearing for the central claims.
[§4.1] §4.1 (Benchmark results): no ablation is reported that compares EPIC against a full-context retrieval baseline or that measures performance degradation when preference extraction discards non-preference context. Without such controls, it is impossible to determine whether the +20.17 pp accuracy improvement reflects genuine preference alignment or properties of the benchmark construction.
[§4.2] §4.2 (On-device experiment): the streaming-update results (<1 MB memory, 29.35 ms/query) are presented without details on incremental index maintenance, stability of the extracted preferences over time, or failure cases when new personal data arrives. These elements are required to support the on-device applicability claim.

minor comments (2)

[Table 2] Table 2: the baseline implementations are not described in sufficient detail (e.g., exact embedding model, chunking strategy, or retrieval hyperparameters), hindering reproducibility.
[Figure 3] Figure 3: axis labels and legend entries are too small to read comfortably; consider enlarging or adding a supplementary high-resolution version.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important areas for clarification and additional analysis that strengthen the presentation of EPIC. We address each major comment below and have revised the manuscript to incorporate the requested details.

read point-by-point responses

Referee: [§3] §3 (Method), preference extraction subsection: the paper provides no concrete description of the preference extraction procedure (model, prompting strategy, or heuristics), nor any quantitative extraction-fidelity metrics or error analysis. Because the 2,404× memory reduction and the accuracy gains rest on the assumption that only preference-relevant information is retained without discarding query-critical context, this omission is load-bearing for the central claims.

Authors: We agree that the preference extraction procedure requires a more explicit description to support the central claims. In the revised §3, we now detail the extraction model (a lightweight fine-tuned LLM), the prompting strategy with few-shot examples, and the heuristics for filtering preference-relevant spans. We also add quantitative extraction-fidelity metrics (precision/recall against human-annotated preferences) and an error analysis showing that discarded non-preference context does not degrade downstream query performance on the evaluated benchmarks. revision: yes
Referee: [§4.1] §4.1 (Benchmark results): no ablation is reported that compares EPIC against a full-context retrieval baseline or that measures performance degradation when preference extraction discards non-preference context. Without such controls, it is impossible to determine whether the +20.17 pp accuracy improvement reflects genuine preference alignment or properties of the benchmark construction.

Authors: We acknowledge the value of these controls. The revised §4.1 now includes an ablation comparing EPIC to a full-context retrieval baseline (using the same retriever but without preference filtering) and a controlled degradation study that systematically removes non-preference context. Results confirm that the 20.17 pp gain arises from preference alignment rather than benchmark artifacts, with only marginal degradation when non-preference context is discarded. revision: yes
Referee: [§4.2] §4.2 (On-device experiment): the streaming-update results (<1 MB memory, 29.35 ms/query) are presented without details on incremental index maintenance, stability of the extracted preferences over time, or failure cases when new personal data arrives. These elements are required to support the on-device applicability claim.

Authors: We have expanded §4.2 with the requested details. The revision describes the incremental index maintenance algorithm (delta updates to the preference-aligned index), reports stability metrics for extracted preferences across streaming sessions (low drift over 100+ updates), and includes a failure-case analysis for scenarios where new data conflicts with prior preferences, along with mitigation strategies that keep memory and latency within the reported bounds. revision: yes

Circularity Check

0 steps flagged

No circularity: EPIC results are empirical benchmark comparisons with no derivation chain reducing to fitted inputs or self-definitions.

full rationale

The paper proposes EPIC for preference-aligned memory construction in on-device RAG and reports gains (2404x memory reduction, +20.17pp accuracy, 33.33x lower latency) from direct comparisons against baselines on four benchmarks. No equations, first-principles derivations, or predictions are presented that could reduce by construction to parameters fitted inside the paper itself. The method description focuses on selective retention and alignment steps whose outputs are measured externally rather than defined tautologically. Self-citations, if present, are not load-bearing for the core empirical claims, which remain falsifiable against independent benchmarks and do not invoke uniqueness theorems or ansatzes that collapse into prior author work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the premise that preferences can be extracted as a stable signal and that the four benchmarks adequately represent real personal queries. No explicit free parameters or invented entities are named in the abstract; the method itself is the primary addition.

axioms (1)

domain assumption User preferences constitute a compact and stable form of personal context that can be reliably extracted from raw data.
Stated in the abstract as the core focus of EPIC; if false, selective retention would discard necessary context.

invented entities (1)

EPIC (Efficient Preference-aligned Index Construction) no independent evidence
purpose: A pipeline that selectively retains preference-relevant information and aligns retrieval toward preference-aligned contexts.
New system introduced to solve the memory bottleneck; no independent evidence outside the reported experiments is provided in the abstract.

pith-pipeline@v0.9.0 · 5726 in / 1463 out tokens · 28842 ms · 2026-05-20T11:51:05.086202+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

EPIC selectively retains preference-relevant information from raw data and aligns retrieval toward preference-aligned contexts... Semantic-Based Coarse Filtering... Preference-Aligned Fine Verification... Preference-Guided Query Steering
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

reduces indexing memory by 2,404 times... under 1 MB with 29.35 ms/query latency

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 3 internal anchors

[1]

The Llama 3 Herd of Models

URLhttps://aclanthology.org/2023. emnlp-main.398/. Grattafiori, A., Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Vaughan, A., et al. The llama 3 herd of models.arXiv preprint arXiv:2407.21783, 2024. Gutierrez, B. J., Shu, Y ., Gu, Y ., Yasunaga, M., and Su, Y . HippoRAG: Neurobiologically inspired long...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.18653/v1/2021.emnlp-main 2023
[2]

emnlp-main.243/

URLhttps://aclanthology.org/2021. emnlp-main.243/. Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V ., Goyal, N., K ¨uttler, H., Lewis, M., Yih, W.-t., Rockt¨aschel, T., et al. Retrieval-augmented generation for knowledge-intensive nlp tasks.Advances in neural information processing systems, 33:9459–9474, 2020. Li, X., Wang, S., Zeng, S., Wu, Y...

work page doi:10.18653/v1/2021.acl-long 2021
[3]

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

URLhttps://aclanthology.org/2021. acl-long.353/. Li, Y ., Wen, H., Wang, W., Li, X., Yuan, Y ., Liu, G., Liu, J., Xu, W., Wang, X., Sun, Y ., et al. Personal llm agents: Insights and survey about the capability, efficiency and security.arXiv preprint arXiv:2401.05459, 2024b. Mysore, S., Lu, Z., Wan, M., Yang, L., Sarrafzadeh, B., Menezes, S., Baghaee, T.,...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.18653/v1/2024.customnlp4u-1 2021
[4]

gpt-oss-120b & gpt-oss-20b Model Card

URLhttps://aclanthology.org/2024. customnlp4u-1.16/. Neverova, N., Wolf, C., Lacey, G., Fridman, L., Chandra, D., Barbello, B., and Taylor, G. Learning human identity from motion patterns.IEEE Access, 4:1810–1820, 2016. OpenAI. gpt-oss-120b & gpt-oss-20b model card, 2025. URLhttps://arxiv.org/abs/2508.10925. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wai...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.18653/v1/2020 2024
[5]

soft prompts

URLhttps://openreview.net/forum? id=QWunLKbBGF. Zheng, L., Chiang, W.-L., Sheng, Y ., Li, T., Zhuang, S., Wu, Z., Zhuang, Y ., Li, Z., Lin, Z., Xing, E. P., Gonzalez, J. E., Stoica, I., and Zhang, H. Lmsys-chat-1m: A large- scale real-world llm conversation dataset, 2023. Zhong, W., Guo, L., Gao, Q., Ye, H., and Wang, Y . Mem- orybank: Enhancing large lan...

work page 2023
[6]

Either the user preference or the question is missing, so the retrieval target cannot be precisely defined

work page
[7]

Questions rarely induce preference conflicts, making violations unlikely and the retrieval task non-discriminative

work page
[8]

I avoid electric vehicles,

No gold labels tying (preference, question) pairs to documents that both answer the query and satisfy preferences. In light of these limitations of existing datasets, this study makes extensive use of the PrefEval benchmark (Zhao et al., 2025). A.5. PrefEval Benchmark The Explicit Preference subset of PrefEval dataset (Zhao et al., 2025) focuses on prefer...

work page 2025
[9]

a preference statement (clear like/dislike or constraint), and

work page
[10]

a query that can easily elicit a default answer which would violate that preference unless the model takes it into account (e.g., recommending the best compact cars for city driving, where the most top options are electric vehicles),

work page
[11]

This subset deliberately booby-traps the obvious answer: the quickest generic response is often preference-inconsistent

optionally, a short explanation/rationale highlighting why the query is risky with respect to the preference. This subset deliberately booby-traps the obvious answer: the quickest generic response is often preference-inconsistent. Strong performance therefore requires the model to (1) recognize the explicit constraint, (2) prioritize it alongside topical ...

work page
[12]

Preference-Unaware Violation: The LLM provides generic recommendations that contradict the user’s stated prefer- ence due to unawareness of user preference

work page
[13]

Preference Hallucination Violation: The response fabricates or misattributes preferences, diverging from the user’s true preference and violates the true preference

work page
[14]

Inconsistent Violation: The response acknowledges the correct preference but generates contradicting response

work page
[15]

role: content

Unhelpful Response: The response lacks relevant recommendations or fails to address the query due to poor recall of the user’s preference. B. Experimental Details 15 From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG B.1. Corpus of Preference Benchmarks This section describes the retrieval corpora used for indexing and retrieva...

work page arXiv 2025
[16]

the question directly contradicts the user’s preference, such that any answer would inherently violate the preference

work page
[17]

the question is already perfectly aligned with the preference, such that no additional reasoning about the preference is required

work page
[18]

For PrefRQ, since the dataset is pre-filtered to contain highly subjective questions from the Researchy Questions corpus, only conditions (1) and (2) are checked

the question has a negligible probability of violating the preference under the PrefEval data generation prompt, i.e., whenP(answer|question)≪P(answer|preference,question), indicating that even without conditioning on the preference, natural answers rarely conflict with it For PrefELI5, all three conditions are applied. For PrefRQ, since the dataset is pr...

work page
[19]

I prefer vegetarian meals

Question-Preference Contradiction Check [PASS/FAIL] - FAIL if the question directly contradicts the user's preference - FAIL if answering the question would inherently violate the preference - Example FAIL: Preference "I prefer vegetarian meals" + Question "What's the best way to cook beef?"

work page
[20]

I love Italian food

Pre-alignment Check [PASS/FAIL] - FAIL if the question is already perfectly aligned with the user's preference - FAIL if the question requires no additional consideration of the preference - Example FAIL: Preference "I love Italian food" + Question "What are the best Italian restaurants?"

work page
[21]

I prefer companies that allow unlimited sick days

Low Violation Check [PASS/FAIL] - FAIL if the question has a low probability of violating the preference - FAIL if P(answer|question) << P(answer|preference, question), which means without knowing the preference, naturally answering the question rarely violates the user's preference - Example FAIL: Preference "I prefer companies that allow unlimited sick ...

work page
[22]

Understand all user preferences thoroughly

work page
[23]

Read the given document chunk

work page
[24]

If the chunk contains no content relevant to any of the preferences, decide: Discard

work page
[25]

If the chunk is relevant to any preference, decide: Keep

work page
[26]

Always explain the reason clearly

work page
[27]

If Keep, specify exactly which preferences the chunk aligns with

work page
[28]

</planning_steps> <guidelines> - Do not infer unstated preferences

Output must strictly follow the XML structure and include only XML. </planning_steps> <guidelines> - Do not infer unstated preferences. - When listing <relevant_preferences>, use the exact preference texts as provided by the user, do not paraphrase or modify. </guidelines> <response_requirements> - Every output must follow strict XML format. - The <reason...

work page
[29]

Read the user's stated preferences

work page
[30]

Read the document chunk

work page
[31]

Read the given reason for why this chunk was marked as relevant

work page
[32]

Generate a clear, concise instruction that explains how to interpret or read this chunk in light of the relevant preferences

work page
[33]

The instruction should guide readers on what aspects to focus on or what perspective to take when reading the chunk

work page
[34]

</planning_steps> <guidelines> - The instruction is NOT a rewrite of the chunk itself, but rather guidance on how to interpret it

Output must consist of a single <instruction> XML tag. </planning_steps> <guidelines> - The instruction is NOT a rewrite of the chunk itself, but rather guidance on how to interpret it. - Focus on directing attention to preference-relevant aspects of the content. - Keep instructions concise and actionable. - Do not add information not present in the chunk...

work page 1965

[1] [1]

The Llama 3 Herd of Models

URLhttps://aclanthology.org/2023. emnlp-main.398/. Grattafiori, A., Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Vaughan, A., et al. The llama 3 herd of models.arXiv preprint arXiv:2407.21783, 2024. Gutierrez, B. J., Shu, Y ., Gu, Y ., Yasunaga, M., and Su, Y . HippoRAG: Neurobiologically inspired long...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.18653/v1/2021.emnlp-main 2023

[2] [2]

emnlp-main.243/

URLhttps://aclanthology.org/2021. emnlp-main.243/. Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V ., Goyal, N., K ¨uttler, H., Lewis, M., Yih, W.-t., Rockt¨aschel, T., et al. Retrieval-augmented generation for knowledge-intensive nlp tasks.Advances in neural information processing systems, 33:9459–9474, 2020. Li, X., Wang, S., Zeng, S., Wu, Y...

work page doi:10.18653/v1/2021.acl-long 2021

[3] [3]

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

URLhttps://aclanthology.org/2021. acl-long.353/. Li, Y ., Wen, H., Wang, W., Li, X., Yuan, Y ., Liu, G., Liu, J., Xu, W., Wang, X., Sun, Y ., et al. Personal llm agents: Insights and survey about the capability, efficiency and security.arXiv preprint arXiv:2401.05459, 2024b. Mysore, S., Lu, Z., Wan, M., Yang, L., Sarrafzadeh, B., Menezes, S., Baghaee, T.,...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.18653/v1/2024.customnlp4u-1 2021

[4] [4]

gpt-oss-120b & gpt-oss-20b Model Card

URLhttps://aclanthology.org/2024. customnlp4u-1.16/. Neverova, N., Wolf, C., Lacey, G., Fridman, L., Chandra, D., Barbello, B., and Taylor, G. Learning human identity from motion patterns.IEEE Access, 4:1810–1820, 2016. OpenAI. gpt-oss-120b & gpt-oss-20b model card, 2025. URLhttps://arxiv.org/abs/2508.10925. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wai...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.18653/v1/2020 2024

[5] [5]

soft prompts

URLhttps://openreview.net/forum? id=QWunLKbBGF. Zheng, L., Chiang, W.-L., Sheng, Y ., Li, T., Zhuang, S., Wu, Z., Zhuang, Y ., Li, Z., Lin, Z., Xing, E. P., Gonzalez, J. E., Stoica, I., and Zhang, H. Lmsys-chat-1m: A large- scale real-world llm conversation dataset, 2023. Zhong, W., Guo, L., Gao, Q., Ye, H., and Wang, Y . Mem- orybank: Enhancing large lan...

work page 2023

[6] [6]

Either the user preference or the question is missing, so the retrieval target cannot be precisely defined

work page

[7] [7]

Questions rarely induce preference conflicts, making violations unlikely and the retrieval task non-discriminative

work page

[8] [8]

I avoid electric vehicles,

No gold labels tying (preference, question) pairs to documents that both answer the query and satisfy preferences. In light of these limitations of existing datasets, this study makes extensive use of the PrefEval benchmark (Zhao et al., 2025). A.5. PrefEval Benchmark The Explicit Preference subset of PrefEval dataset (Zhao et al., 2025) focuses on prefer...

work page 2025

[9] [9]

a preference statement (clear like/dislike or constraint), and

work page

[10] [10]

a query that can easily elicit a default answer which would violate that preference unless the model takes it into account (e.g., recommending the best compact cars for city driving, where the most top options are electric vehicles),

work page

[11] [11]

This subset deliberately booby-traps the obvious answer: the quickest generic response is often preference-inconsistent

optionally, a short explanation/rationale highlighting why the query is risky with respect to the preference. This subset deliberately booby-traps the obvious answer: the quickest generic response is often preference-inconsistent. Strong performance therefore requires the model to (1) recognize the explicit constraint, (2) prioritize it alongside topical ...

work page

[12] [12]

Preference-Unaware Violation: The LLM provides generic recommendations that contradict the user’s stated prefer- ence due to unawareness of user preference

work page

[13] [13]

Preference Hallucination Violation: The response fabricates or misattributes preferences, diverging from the user’s true preference and violates the true preference

work page

[14] [14]

Inconsistent Violation: The response acknowledges the correct preference but generates contradicting response

work page

[15] [15]

role: content

Unhelpful Response: The response lacks relevant recommendations or fails to address the query due to poor recall of the user’s preference. B. Experimental Details 15 From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG B.1. Corpus of Preference Benchmarks This section describes the retrieval corpora used for indexing and retrieva...

work page arXiv 2025

[16] [16]

the question directly contradicts the user’s preference, such that any answer would inherently violate the preference

work page

[17] [17]

the question is already perfectly aligned with the preference, such that no additional reasoning about the preference is required

work page

[18] [18]

For PrefRQ, since the dataset is pre-filtered to contain highly subjective questions from the Researchy Questions corpus, only conditions (1) and (2) are checked

the question has a negligible probability of violating the preference under the PrefEval data generation prompt, i.e., whenP(answer|question)≪P(answer|preference,question), indicating that even without conditioning on the preference, natural answers rarely conflict with it For PrefELI5, all three conditions are applied. For PrefRQ, since the dataset is pr...

work page

[19] [19]

I prefer vegetarian meals

Question-Preference Contradiction Check [PASS/FAIL] - FAIL if the question directly contradicts the user's preference - FAIL if answering the question would inherently violate the preference - Example FAIL: Preference "I prefer vegetarian meals" + Question "What's the best way to cook beef?"

work page

[20] [20]

I love Italian food

Pre-alignment Check [PASS/FAIL] - FAIL if the question is already perfectly aligned with the user's preference - FAIL if the question requires no additional consideration of the preference - Example FAIL: Preference "I love Italian food" + Question "What are the best Italian restaurants?"

work page

[21] [21]

I prefer companies that allow unlimited sick days

Low Violation Check [PASS/FAIL] - FAIL if the question has a low probability of violating the preference - FAIL if P(answer|question) << P(answer|preference, question), which means without knowing the preference, naturally answering the question rarely violates the user's preference - Example FAIL: Preference "I prefer companies that allow unlimited sick ...

work page

[22] [22]

Understand all user preferences thoroughly

work page

[23] [23]

Read the given document chunk

work page

[24] [24]

If the chunk contains no content relevant to any of the preferences, decide: Discard

work page

[25] [25]

If the chunk is relevant to any preference, decide: Keep

work page

[26] [26]

Always explain the reason clearly

work page

[27] [27]

If Keep, specify exactly which preferences the chunk aligns with

work page

[28] [28]

</planning_steps> <guidelines> - Do not infer unstated preferences

Output must strictly follow the XML structure and include only XML. </planning_steps> <guidelines> - Do not infer unstated preferences. - When listing <relevant_preferences>, use the exact preference texts as provided by the user, do not paraphrase or modify. </guidelines> <response_requirements> - Every output must follow strict XML format. - The <reason...

work page

[29] [29]

Read the user's stated preferences

work page

[30] [30]

Read the document chunk

work page

[31] [31]

Read the given reason for why this chunk was marked as relevant

work page

[32] [32]

Generate a clear, concise instruction that explains how to interpret or read this chunk in light of the relevant preferences

work page

[33] [33]

The instruction should guide readers on what aspects to focus on or what perspective to take when reading the chunk

work page

[34] [34]

</planning_steps> <guidelines> - The instruction is NOT a rewrite of the chunk itself, but rather guidance on how to interpret it

Output must consist of a single <instruction> XML tag. </planning_steps> <guidelines> - The instruction is NOT a rewrite of the chunk itself, but rather guidance on how to interpret it. - Focus on directing attention to preference-relevant aspects of the content. - Keep instructions concise and actionable. - Do not add information not present in the chunk...

work page 1965