Recognition: 2 theorem links · Lean theorem
ProtoMedAgent: Multimodal Clinical Interpretability via Privacy-Aware Agentic Workflows
Pith reviewed 2026-05-15 05:05 UTC · model grok-4.3
The pith
ProtoMedAgent constrains LLM clinical reports to a neuro-symbolic bottleneck, reaching 91.2% Comparison Set Faithfulness while cutting membership-inference risk.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ProtoMedAgent formalizes multimodal clinical reporting as zero-gradient test-time optimization over a strict neuro-symbolic bottleneck on a frozen prototype backbone. It distills latent features into a discrete semantic memory and constrains online generation by exact set-theoretic differentials together with a reflective Scribe-Critic loop, thereby mathematically precluding unsupported claims. The framework achieves 91.2% Comparison Set Faithfulness on a 4,160-patient cohort, while a binding ℓ-diversity phase transition reduces artifact-level membership-inference risk by an absolute 9.8%.
What carries the argument
The neuro-symbolic bottleneck enforced by iterative zero-gradient test-time optimization, set-theoretic differentials, and the Scribe-Critic loop, augmented by a semantic privacy gate that applies k-anonymity and ℓ-diversity.
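As a concrete (and entirely hypothetical) reading of this machinery, the Scribe-Critic loop can be sketched as a critic that computes a set difference between extracted claims and the distilled memory. The names `scribe_critic` and `extract_claims`, and the sentence-level claim granularity, are illustrative assumptions, not the paper's API:

```python
# Hypothetical sketch of a Scribe-Critic loop over a discrete semantic
# memory. `extract_claims` treats each sentence as one claim; real claim
# extraction would be far richer. Nothing here is the paper's actual code.

def extract_claims(report: str) -> set[str]:
    """Toy claim extractor: one claim per sentence."""
    return {s.strip() for s in report.split(".") if s.strip()}

def scribe_critic(memory: set[str], scribe, max_rounds: int = 3) -> str:
    """Redraft until every extracted claim is a member of the memory."""
    feedback = None
    for _ in range(max_rounds):
        report = scribe(feedback)                      # Scribe proposes a draft
        unsupported = extract_claims(report) - memory  # Critic: set difference
        if not unsupported:
            return report                              # all claims grounded
        feedback = f"Remove unsupported claims: {sorted(unsupported)}"
    # Fallback: emit only memory items verbatim, which is trivially grounded.
    return ". ".join(sorted(memory)) + "."
```

Whether such a loop "mathematically precludes" unsupported claims depends entirely on the claim extractor being sound, which is exactly the gap the referee report presses on.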
If this is right
- Clinical reports become reliably grounded in prototype predictions without sycophantic rationalizations that misalign with visual evidence.
- Privacy protection is achieved through controlled disclosure that still permits diagnostically useful detail.
- The framework applies directly to existing frozen prototype models without retraining or gradient updates.
- Faithfulness scores improve dramatically over unconstrained LLM generation on the same clinical cohort.
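The "controlled disclosure" point can be made concrete with a minimal gate check. The record layout (quasi-identifier tuple plus a sensitive value), the grouping scheme, and the name `gate_ok` are assumptions for illustration; the paper's gate internals, and its concrete k and ℓ values, are not given here:

```python
# Minimal sketch of a k-anonymity / ℓ-diversity gate over report artifacts.
# Records pair a quasi-identifier tuple with a sensitive attribute; the
# thresholds k and ell are the framework's free parameters.

def gate_ok(records: list[tuple[tuple, str]], k: int, ell: int) -> bool:
    """Pass only if every quasi-identifier group satisfies both thresholds."""
    groups: dict[tuple, list[str]] = {}
    for quasi_id, sensitive in records:
        groups.setdefault(quasi_id, []).append(sensitive)
    for sensitives in groups.values():
        if len(sensitives) < k:          # k-anonymity: group too small
            return False
        if len(set(sensitives)) < ell:   # ℓ-diversity: too few distinct values
            return False
    return True
```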
Where Pith is reading between the lines
- The same constrained-optimization pattern could be tested in other high-stakes domains where outputs must stay strictly derivable from structured evidence.
- The observed ℓ-diversity phase transition points to a general mechanism for trading off privacy and utility that might apply beyond medical imaging.
- Real-world validation would need to measure whether the 91.2 percent faithfulness holds across patient populations with different demographic distributions.
Load-bearing premise
The Scribe-Critic loop and neuro-symbolic constraints can mathematically preclude unsupported narrative claims in all cases, and the semantic privacy gate bounds disclosure without compromising report utility.
What would settle it
A generated report containing at least one narrative claim that cannot be derived from the exact set-theoretic differentials of the prototype features, or an experiment showing that patient identity can still be inferred at rates higher than the reported 9.8 percent reduction despite the ℓ-diversity controls.
Original abstract
While interpretable prototype networks offer compelling case-based reasoning for clinical diagnostics, their raw continuous outputs lack the semantic structure required for medical documentation. Bridging this gap via standard Retrieval-Augmented Generation (RAG) routinely triggers "retrieval sycophancy," where Large Language Models (LLMs) hallucinate post-hoc rationalizations to align with visual predictions. We introduce ProtoMedAgent, a framework that formalizes multimodal clinical reporting as an iterative, zero-gradient test-time optimization problem over a strict neuro-symbolic bottleneck. Operating on a frozen prototype backbone, we distill latent visual and tabular features into a discrete semantic memory. Online generation is strictly constrained by exact set-theoretic differentials and a reflective Scribe-Critic loop, mathematically precluding unsupported narrative claims. To safely bound data disclosure, we introduce a semantic privacy gate governed by k-anonymity and ℓ-diversity. Evaluated on a 4,160-patient clinical cohort, ProtoMedAgent achieves 91.2% Comparison Set Faithfulness, where it fundamentally outperforms standard RAG (46.2%). ProtoMedAgent additionally leverages a binding ℓ-diversity phase transition to systematically reduce artifact-level membership inference risks by an absolute 9.8%.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces ProtoMedAgent, a framework that formalizes multimodal clinical reporting as an iterative zero-gradient test-time optimization over a neuro-symbolic bottleneck on a frozen prototype backbone. Latent visual and tabular features are distilled into discrete semantic memory, with generation strictly constrained by set-theoretic differentials and a reflective Scribe-Critic loop that is claimed to mathematically preclude unsupported narrative claims. A semantic privacy gate based on k-anonymity and ℓ-diversity is added to bound disclosure. On a 4,160-patient cohort, the system reports 91.2% Comparison Set Faithfulness (vs. 46.2% for standard RAG) and an absolute 9.8% reduction in artifact-level membership inference risk via a binding ℓ-diversity phase transition.
Significance. If the central mathematical guarantee holds and the faithfulness/privacy metrics are independently validated, the work would offer a concrete advance in combining prototype-based case reasoning with controlled LLM generation for clinical documentation. The reported performance delta and privacy reduction would be practically relevant for reducing hallucination while preserving interpretability and meeting regulatory constraints on data disclosure.
major comments (3)
- [Abstract] The claim that the Scribe-Critic loop together with exact set-theoretic differentials "mathematically preclud[es] unsupported narrative claims" is load-bearing for the 91.2% vs. 46.2% faithfulness result, yet no theorem, invariant, soundness/completeness argument, or exhaustive case analysis is supplied showing that every generated token is confined to the discrete semantic memory for arbitrary prototype feature combinations.
- [Evaluation] The Comparison Set Faithfulness metric is defined with reference to the system's own outputs and comparisons; this circularity must be addressed by an independent ground-truth annotation or human evaluation protocol before the headline delta can be accepted as evidence of superiority over RAG.
- [Privacy Mechanism] The 9.8% absolute reduction in artifact-level membership inference risk is attributed to the binding ℓ-diversity phase transition, but no explicit measurement protocol, baseline comparison, or statistical test is described that would confirm the reduction is not an artifact of the metric's internal definition.
minor comments (2)
- [Method] The privacy parameters k (k-anonymity) and ℓ (ℓ-diversity) are listed as free parameters, but their concrete values and a sensitivity analysis are not reported; add these to the experimental section.
- [Method] The iterative zero-gradient test-time optimization and Scribe-Critic loop would benefit from a pseudocode listing or explicit algorithmic description to support reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating where revisions will be made to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract] The claim that the Scribe-Critic loop together with exact set-theoretic differentials "mathematically preclud[es] unsupported narrative claims" is load-bearing for the 91.2% vs. 46.2% faithfulness result, yet no theorem, invariant, soundness/completeness argument, or exhaustive case analysis is supplied showing that every generated token is confined to the discrete semantic memory for arbitrary prototype feature combinations.
Authors: We agree that the abstract claim requires formal support. In the revised manuscript we will add a dedicated subsection in the Methods section providing a soundness argument: we prove that the exact set-theoretic differentials restrict the token vocabulary to the discrete semantic memory, and that the Scribe-Critic loop enforces an invariant that no token outside this memory can be emitted. The proof will be accompanied by an exhaustive case analysis covering representative prototype feature combinations. revision: yes
- Referee: [Evaluation] The Comparison Set Faithfulness metric is defined with reference to the system's own outputs and comparisons; this circularity must be addressed by an independent ground-truth annotation or human evaluation protocol before the headline delta can be accepted as evidence of superiority over RAG.
Authors: The referee correctly notes the risk of circularity. We will revise the Evaluation section to include an independent human evaluation protocol: two board-certified clinicians will annotate faithfulness on a stratified random sample of 200 generated reports against the original clinical notes (ground truth). We will report inter-annotator agreement (Cohen's kappa) and the resulting faithfulness scores to corroborate the automated 91.2% figure. revision: yes
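The inter-annotator agreement statistic the rebuttal proposes is standard Cohen's kappa, which can be computed directly; the label vectors in the accompanying check are invented purely to exercise the formula:

```python
from collections import Counter

# Cohen's kappa for two annotators' faithful/unfaithful labels, matching
# the agreement statistic the rebuttal proposes to report alongside the
# 200-report human evaluation.

def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Chance-corrected agreement between two equal-length label lists."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n           # observed agreement
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[l] * cb[l] for l in set(ca) | set(cb)) / (n * n)  # by chance
    return (p_o - p_e) / (1 - p_e)
```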
- Referee: [Privacy Mechanism] The 9.8% absolute reduction in artifact-level membership inference risk is attributed to the binding ℓ-diversity phase transition, but no explicit measurement protocol, baseline comparison, or statistical test is described that would confirm the reduction is not an artifact of the metric's internal definition.
Authors: We acknowledge that the current description of the membership-inference evaluation lacks sufficient detail. In the revision we will expand the Privacy Analysis subsection to specify the full protocol: a shadow-model membership inference attack with 5-fold cross-validation, standard RAG as the explicit baseline, and a paired t-test (p < 0.01) to establish significance of the 9.8% reduction. Pseudocode for the attack and evaluation pipeline will be added. revision: partial
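For orientation, the simplest member/non-member separation idea underlying such attacks is a confidence threshold; the shadow-model protocol in the rebuttal learns this boundary instead of fixing it by hand. All confidence values and function names below are synthetic and illustrative:

```python
# Threshold view of membership inference: training-set members tend to
# receive higher model confidence than non-members, so a fixed cutoff
# already yields a (weak) attack. Advantage = TPR - FPR at that cutoff.

def threshold_attack(confidences: list[float], tau: float) -> list[bool]:
    """Predict 'member' whenever confidence reaches the threshold tau."""
    return [c >= tau for c in confidences]

def attack_advantage(member_conf: list[float],
                     nonmember_conf: list[float], tau: float) -> float:
    """True-positive rate minus false-positive rate at threshold tau."""
    tpr = sum(threshold_attack(member_conf, tau)) / len(member_conf)
    fpr = sum(threshold_attack(nonmember_conf, tau)) / len(nonmember_conf)
    return tpr - fpr
```

An "absolute 9.8% reduction" would correspond to such an advantage (or an attack-accuracy analogue) dropping by 0.098 under the privacy gate; the referee's point is that the protocol behind that number must be stated.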
Circularity Check
Faithfulness metric and ℓ-diversity phase transition reduce to self-defined constructs; preclusion claim lacks independent theorem
specific steps
- self-definitional [Abstract]: "Online generation is strictly constrained by exact set-theoretic differentials and a reflective Scribe-Critic loop, mathematically precluding unsupported narrative claims."
The preclusion is presented as a direct mathematical consequence of the constraints and loop that the framework itself defines and enforces; no separate theorem, invariant, or exhaustive verification is supplied to show soundness beyond the definition of the neuro-symbolic bottleneck.
- self-definitional [Abstract]: "ProtoMedAgent additionally leverages a binding ℓ-diversity phase transition to systematically reduce artifact-level membership inference risks by an absolute 9.8%."
The "binding ℓ-diversity phase transition" is introduced by the paper as part of its semantic privacy gate; the specific 9.8% risk reduction is then attributed directly to this transition, making the reported gain a consequence of how the phase transition is defined and applied within the same system.
- fitted input called prediction [Abstract]: "ProtoMedAgent achieves 91.2% Comparison Set Faithfulness where it fundamentally outperforms standard RAG (46.2%)."
Comparison Set Faithfulness is measured against the system's own distilled prototype features and discrete semantic memory; the large delta versus RAG is therefore produced by the construction of the evaluation metric and the neuro-symbolic constraints rather than by an independent external benchmark.
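A toy calculation makes the objection concrete: when the comparison set used for scoring is the same memory that already constrains generation, any compliant output scores perfectly by construction. All names below are illustrative, not the paper's metric implementation:

```python
# Toy illustration of the circularity flagged above: a generator limited
# to emitting memory items can only score 1.0 on a metric that checks
# membership in that same memory, while free generation can score lower.

def comparison_set_faithfulness(claims: set[str], comparison_set: set[str]) -> float:
    """Fraction of emitted claims that appear in the comparison set."""
    return len(claims & comparison_set) / len(claims) if claims else 1.0

memory = {"finding_a", "finding_b"}            # also serves as the comparison set
constrained = {"finding_a"}                    # generator limited to memory items
unconstrained = {"finding_a", "hallucinated"}  # RAG-style free generation

assert comparison_set_faithfulness(constrained, memory) == 1.0   # by construction
assert comparison_set_faithfulness(unconstrained, memory) == 0.5
```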
full rationale
The derivation chain centers on the neuro-symbolic bottleneck and Scribe-Critic loop being asserted to "mathematically preclude" unsupported claims, with the performance delta (91.2% vs. 46.2%) and the privacy reduction (9.8%) tied to internally introduced mechanisms such as the binding ℓ-diversity phase transition and Comparison Set Faithfulness. These reduce to the paper's own definitions and constraints, with no external theorem or independent validation shown in the abstract. The result is partial circularity (score 6): the load-bearing guarantees hold by construction of the introduced components rather than following from prior independent results.
Axiom & Free-Parameter Ledger
free parameters (2)
- k in k-anonymity
- ℓ in ℓ-diversity
axioms (2)
- domain assumption: The prototype backbone remains frozen and provides stable latent features for distillation.
- ad hoc to paper: Set-theoretic differentials can enforce strict constraints on generated narratives.
invented entities (2)
- Scribe-Critic loop (no independent evidence)
- semantic privacy gate (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tagged: unclear)
unclear: Relation between the paper passage and the cited Recognition theorem.
Linked passage: "formalizes multimodal clinical reporting as an iterative, zero-gradient test-time optimization problem over a strict neuro-symbolic bottleneck... exact set-theoretic differentials... Scribe-Critic loop, mathematically precluding unsupported narrative claims... semantic privacy gate governed by k-anonymity and ℓ-diversity"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Plamen Angelov and Eduardo Soares. Towards explainable deep neural networks (xDNN). Neural Networks, 130:185–194.
- [2] OFM Riaz Rahman Aranya and Kevin Desai. Trace: Temporal radiology with anatomical change explanation for grounded x-ray report generation, 2026.
- [3] Yaara Artsi, Eyal Klang, Jeremy D Collins, Benjamin S Glicksberg, Girish N Nadkarni, Panagiotis Korfiatis, and Vera Sorin. Large language models in radiology reporting: a systematic review of performance, limitations, and clinical implications. Intelligence-Based Medicine, page 100287, 2025.
- [4] Chaofan Chen, Oscar Li, Chaofan Tao, Alina Jade Barnett, Jonathan Su, and Cynthia Rudin. This looks like that: Deep learning for interpretable image recognition. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
- [5] Zeming Chen, Alejandro Hernández Cano, Angelika Romanou, Antoine Bonnet, Kyle Matoba, Francesco Salvi, Matteo Pagliardini, Simin Fan, Andreas Köpf, Amirkeivan Mohtashami, Alexandre Sallinen, Alireza Sakhaeirad, Vinitra Swamy, Igor Krawczuk, Deniz Bayazit, Axel Marmet, Syrielle Montariol, Mary-Anne Hartley, Martin Jaggi, and Antoine Bosselut. Meditron-70b: Scaling medical pretraining for large language models, 2023.
- [6] Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruipérez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas M Sutter, Julia E Vogt, et al. RadVLM: a multitask conversational vision-language model for radiology. arXiv preprint arXiv:2502.03333, 2025.
- [7] Yixin Dong, Charlie F. Ruan, Yaxing Cai, Ruihang Lai, Ziyi Xu, Yilong Zhao, and Tianqi Chen. XGrammar: Flexible and efficient structured generation engine for large language models. arXiv:2411.15100, 2024.
- [8] Electronic Code of Federal Regulations. 45 CFR §164.514: Other requirements relating to uses and disclosures of protected health information (de-identification), 2025. Accessed: 2026-02-18.
- [9] European Parliament and Council of the European Union. Regulation (EU) 2016/679 (General Data Protection Regulation), 2016.
- [10] European Parliament and Council of the European Union. Regulation (EU) 2023/2854 of the European Parliament and of the Council of 13 December 2023 on harmonised rules on fair access to and use of data (Data Act). Official Journal of the European Union, OJ L 2023/2854, 22 December 2023, 2023. Accessed: 2026-03-22.
- [11] European Parliament and Council of the European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, OJ L 2024/1689, 12 July 2024, 2024. Accessed: 2026-03-22.
- [12] Saibo Geng, Hudson Cooper, Michał Moskal, Samuel Jenkins, Julian Berman, Nathan Ranchin, Robert West, Eric Horvitz, and Harsha Nori. Generating structured outputs from language models: Benchmark and studies. arXiv:2501.10868.
- [13] M. Emre Gursoy, Asim Inan, M. Emin Nergiz, and Yucel Saygin. Differentially private nearest neighbor classification. Data Mining and Knowledge Discovery, 31(5):1544–1575.
- [14] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022.
- [15] Bargav Jayaraman and David Evans. Are attribute inference attacks just imputation? In Proceedings of The ACM Conference on Computer and Communications Security (CCS).
- [16] Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12):1–38, 2023.
- [17] Yushan Jiang, Wenchao Yu, Geon Lee, Dongjin Song, Kijung Shin, Wei Cheng, Yanchi Liu, and Haifeng Chen. TimeXL: Explainable multi-modal time series prediction with LLM-in-the-loop. arXiv preprint arXiv:2503.01013, 2025.
- [18] Myeongseob Ko, Jihyun Jeong, Sumiran Singh Thakur, Gyuhak Kim, and Ruoxi Jia. From weak cues to real identities: Evaluating inference-driven de-anonymization in LLM agents. arXiv preprint arXiv:2603.18382, 2026.
- [19] Yanis Labrak, Adrien Bazoge, Emmanuel Morin, Pierre-Antoine Gourraud, Mickael Rouvier, and Richard Dufour. BioMistral: A collection of open-source pretrained large language models for medical domains, 2024.
- [20] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems (NeurIPS), 2020.
- [21] Chunyuan Li, Cliff Wong, Sheng Zhang, Naoto Usuyama, Haotian Liu, Jianwei Yang, Tristan Naumann, Hoifung Poon, and Jianfeng Gao. LLaVA-Med: Training a large language-and-vision assistant for biomedicine in one day. arXiv preprint arXiv:2306.00890, 2023.
- [22] Ninghui Li, Tiancheng Li, and Suresh Venkatasubramanian. t-closeness: Privacy beyond k-anonymity and l-diversity. In Proceedings of the 23rd International Conference on Data Engineering (ICDE), pages 106–115, 2007.
- [23] Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer, and Muthuramakrishnan Venkitasubramaniam. l-diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data, 1(1):3:1–3:52, 2007.
- [24] Lars Malmqvist. Sycophancy in large language models: Causes and mitigations, 2024.
- [25] Yoojin Nam, Dong Yeong Kim, Sunggu Kyung, Jinyoung Seo, Jeong Min Song, Jimin Kwon, Jihyun Kim, Wooyoung Jo, Hyungbin Park, Jimin Sung, et al. Multimodal large language models in medical imaging: current state and future directions. Korean Journal of Radiology, 26(10):900, 2025.
- [26] Meike Nauta, Ron van Bree, and Christin Seifert. Neural prototype trees for interpretable fine-grained image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
- [27] Meike Nauta, Ron van Bree, and Christin Seifert. PIP-Net: Patch-based intuitive prototypes for interpretable image classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
- [28] Cheng Niu, Yuanhao Wu, Juno Zhu, Siliang Xu, Kashun Shum, Randy Zhong, Juntong Song, and Tong Zhang. RAGTruth: A hallucination corpus for developing trustworthy retrieval-augmented language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10862–10878.
- [29] Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Úlfar Erlingsson. Scalable private learning with PATE, 2018.
- [30] Alvaro Lopez Pellicer, Plamen Angelov, and Neeraj Suri. Securing (vision-based) autonomous systems: taxonomy, challenges, and defense mechanisms against adversarial threats. Artificial Intelligence Review, 58(12):373, 2025.
- [31] Alvaro Lopez Pellicer, Andre Mariucci, Plamen Angelov, Marwan Bukhari, and Jemma G. Kerns. ProtoMedX: Towards explainable multi-modal prototype learning for bone health classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, pages 7357–7366, 2025.
- [32] Tal Ridnik, Dedy Kredo, and Itamar Friedman. Code generation with AlphaCodium: From prompt engineering to flow engineering. arXiv preprint arXiv:2401.08500, 2024.
- [33] Isaac Rogers. Jsonformer: Generate structured JSON from language models. GitHub repository, 2023.
- [34] Aswin Rrv, Nemika Tyagi, Md Nayem Uddin, Neeraj Varshney, and Chitta Baral. Chaos with keywords: Exposing large language models sycophancy to misleading keywords and evaluating defense strategies. In Findings of the Association for Computational Linguistics: ACL 2024, pages 12717–12733. Association for Computational Linguistics, 2024.
- [35] Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1:206–215, 2019.
- [36] Mrinank Sharma, Meg Tong, Tomasz Korbak, David Duvenaud, Amanda Askell, Samuel R. Bowman, Newton Cheng, Esin Durmus, Zac Hatfield-Dodds, Scott R. Johnston, Shauna Kravec, Timothy Maxwell, Sam McCandlish, Kamal Ndousse, Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, and Ethan Perez. Towards understanding sycophancy in language models. arXiv:2..., 2023.
- [37] Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. Reflexion: Language agents with verbal reinforcement learning. arXiv preprint arXiv:2303.11366, 2023.
- [38] Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP), pages 3–18. IEEE, 2017.
- [39] Connor Shorten, Charles Pierse, Thomas Benjamin Smith, Erika Cardenas, Akanksha Sharma, John Trengrove, and Bob van Luijt. StructuredRAG: JSON response formatting with large language models. arXiv:2408.11061, 2024.
- [40] Karan Singhal, Shekoofeh Azizi, Tao Tu, S Sara Mahdavi, Jason Wei, Hyung Won Chung, Nathan Scales, Ajay Tanwani, Heather Cole-Lewis, Stephen Pfohl, et al. Large language models encode clinical knowledge. Nature, 620(7972):172–180, 2023.
- [41] Latanya Sweeney. k-anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems, 10(5):557–570, 2002.
- [42] Jonas Wallat, Maria Heuss, Maarten de Rijke, and Avishek Anand. Correctness is not faithfulness in retrieval augmented generation attributions. In Proceedings of the 2025 International ACM SIGIR Conference on Innovative Concepts and Theories in Information Retrieval (ICTIR '25), 2025.
- [43] Brandon T. Willard and Rémi Louf. Efficient guided generation for large language models. arXiv:2307.09702, 2023.
- [44] Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, and Weidi Xie. Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data. arXiv preprint arXiv:2308.02463, 2023.
- [45] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629, 2022.