A Compound AI Agent for Conversational Grant Discovery

Mayank Kejriwal; Zhisheng Tang

arxiv: 2605.02366 · v1 · submitted 2026-05-04 · 💻 cs.AI


Pith reviewed 2026-05-08 19:25 UTC · model grok-4.3

classification 💻 cs.AI

keywords: compound AI, grant discovery, conversational agents, ReAct, LLM agents, research funding, information retrieval, hybrid search


The pith

A compound AI system aggregates nearly 12,000 grants from scattered portals and answers researcher queries through a single conversational interface.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Research funding discovery is fragmented across dozens of agency portals with incompatible search tools and data formats. The paper describes a compound AI setup that deploys LLM-equipped browser agents to autonomously gather, normalize, and index opportunities into a unified, biweekly-updated database. A ReAct-based conversational layer then interprets a researcher's description or uploaded PDF, combines structured search with selective web lookup to limit hallucination, and streams results with visible reasoning steps so users can judge how far to trust them. Users can refine constraints across multiple turns without rewriting their core query. The approach is presented as cutting manual search time from 30-45 minutes down to under 10 minutes, and the authors report about 3,000 users.

Core claim

The authors claim that a compound AI agent unifies fragmented grant discovery by coupling an aggregation layer of LLM browser agents that collect and normalize almost 12,000 federal and nonprofit opportunities with a ReAct-based query layer that interprets research context, performs hybrid search, and delivers transparent conversational results, thereby reducing discovery time from 30-45 minutes of manual portal navigation to under 10 minutes.

What carries the argument

The compound AI agent formed by an LLM-equipped browser aggregation layer that maintains a unified opportunity database and a ReAct-based conversational query layer that performs context-aware hybrid retrieval.
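To make the shape of the hybrid retrieval step concrete, here is a minimal sketch: the structured index is searched first, and a web lookup runs only when the index returns too few matches. Everything in it is an assumption for illustration. The `Grant` record, the keyword-overlap `score`, the `min_hits` threshold, and the caller-supplied `web_search` function are hypothetical and are not taken from the paper, which does not describe its implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Grant:
    # Hypothetical minimal record; the real schema is not described.
    title: str
    agency: str
    description: str


def score(query: str, grant: Grant) -> int:
    """Count distinct query words that appear in the title or description."""
    words = set(f"{grant.title} {grant.description}".lower().split())
    return sum(1 for w in set(query.lower().split()) if w in words)


def hybrid_search(
    query: str,
    index: List[Grant],
    web_search: Callable[[str], List[Grant]],
    min_hits: int = 2,
) -> List[Grant]:
    """Search the structured index first; fall back to the web only if needed."""
    ranked = sorted(index, key=lambda g: score(query, g), reverse=True)
    hits = [g for g in ranked if score(query, g) > 0]
    if len(hits) < min_hits:
        # Selective web lookup: runs only when the index is too sparse.
        seen = {g.title for g in hits}
        hits += [g for g in web_search(query) if g.title not in seen]
    return hits
```

Keeping the web call behind a threshold means most answers come from records that already exist in the database, which is one plausible way to read the paper's "selective web search" phrasing.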

Load-bearing premise

LLM browser agents can reliably scrape and normalize data from many different grant portals without errors or omissions, and the ReAct layer can interpret researcher context and retrieve matching opportunities without hallucinating or overlooking relevant grants.

What would settle it

A controlled test in which the same set of researchers conduct identical grant searches both manually across the original portals and through the system, then compare the time taken and the completeness of the returned opportunities.
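The analysis for such a test could be as simple as the sketch below, which averages paired differences in search time and in completeness (recall against a hand-built list of relevant grants). The trial dictionary keys and the idea of a pre-labeled `relevant` set are assumptions made here for illustration; the paper reports no such protocol or data.

```python
from statistics import mean
from typing import Dict, List, Set


def recall(found: Set[str], relevant: Set[str]) -> float:
    """Fraction of the known-relevant grants that a search surfaced."""
    if not relevant:
        return 0.0
    return len(found & relevant) / len(relevant)


def summarize(trials: List[Dict]) -> Dict[str, float]:
    """Average paired differences across researchers.

    Each trial holds one researcher's manual and system results for the
    same search task, plus a hand-built set of relevant grant IDs.
    """
    time_saved = [t["manual_minutes"] - t["system_minutes"] for t in trials]
    recall_gain = [
        recall(t["system_found"], t["relevant"])
        - recall(t["manual_found"], t["relevant"])
        for t in trials
    ]
    return {
        "mean_minutes_saved": mean(time_saved),
        "mean_recall_gain": mean(recall_gain),
    }
```

Pairing each researcher's two runs keeps individual differences in search skill from swamping the comparison, though a real study would also need to counterbalance which condition comes first.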

Figures

Figures reproduced from arXiv: 2605.02366 by Mayank Kejriwal, Zhisheng Tang.

**Figure 1.** Distribution of funding opportunities across U.S. …

**Figure 3.** Comparison of our conversational grant discovery system (Pane A) with traditional web search (Pane B). Our …

Original abstract

Research funding discovery remains fundamentally fragmented: researchers navigate disparate agency portals (e.g., in the United States, NSF, NIH, DARPA, Grants.gov, and many others) with heterogeneous interfaces, search capabilities, and data schemas. We present a compound AI system that unifies this landscape through two tightly coupled components: (1) an aggregation layer that autonomously collects, normalizes, and indexes almost 12,000 federal and nonprofit opportunities from fragmented sources via LLM-equipped browser agents, maintaining a biweekly-updated unified database; and (2) an agentic ReAct-based query processing layer that interprets research context (including from PDF documents) and employs hybrid search combining a structured index with selective web search to retrieve relevant opportunities while avoiding LLM hallucination. The conversational interface supports iterative refinement through multi-turn interactions, allowing researchers to progressively apply constraints without reformulating their core research description. Results stream in real time with full transparency of intermediate reasoning, enabling appropriate calibration of user trust. Currently used by more than 3,000 users, our approach demonstrates the feasibility of compound AI in reducing grant discovery time from 30-45 minutes (manual, fragmented portal searches) to under 10 minutes (unified, conversational search).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper describes a deployed LLM-agent system for grant discovery but provides no data to support its time-saving claims.


The main takeaway is a working compound AI tool that pulls grant opportunities from scattered portals using browser agents, normalizes them into a single index of nearly 12,000 items updated biweekly, and then runs a ReAct layer for conversational search that takes research context from text or PDFs and shows its steps to the user. It already has over 3,000 users.

That is the concrete thing the paper contributes: a specific, live implementation for this administrative task rather than new algorithms or theory. The design choices around hybrid search and visible reasoning make sense for keeping users in the loop and reducing blind reliance on the model. The scale of aggregation and the multi-turn refinement interface are practical steps forward for this narrow domain. The techniques themselves are established, but putting them together at this level for grant discovery is new.

The central problem is the missing evaluation. The claim that the system cuts discovery time from 30-45 minutes to under 10 minutes stands without any user timing data, accuracy metrics, error logs from the browser agents, or checks on whether relevant grants are missed or data is normalized incorrectly. Adoption numbers show interest but do not substitute for those measurements. Without them it is difficult to judge how reliable the system actually is across heterogeneous sources.

This work is for researchers who hunt for funding and for people building applied agent systems. A reader looking for an example of chaining LLM components to solve a real retrieval problem could extract useful architecture details. It deserves peer review because the system is built and running, so referees can require the authors to add the necessary quantitative tests and make the contribution clearer. I would send it rather than desk reject.

Referee Report

3 major / 1 minor

Summary. The manuscript describes a compound AI system for conversational grant discovery. It consists of an aggregation layer using LLM-equipped browser agents to autonomously collect, normalize, and index nearly 12,000 federal and nonprofit grant opportunities from heterogeneous sources, with biweekly updates to a unified database. A ReAct-based query processing layer interprets research context (including from PDF documents) and employs hybrid search (structured index plus selective web search) to retrieve relevant opportunities while avoiding hallucinations. The conversational interface supports multi-turn iterative refinement with real-time streaming results and transparency of reasoning. The authors claim this reduces grant discovery time from 30-45 minutes (manual searches) to under 10 minutes and report current use by over 3,000 users, demonstrating feasibility of compound AI for this task.

Significance. If the performance and reliability claims are substantiated through rigorous evaluation, the work would illustrate a practical, deployed application of compound AI agents that combines browser automation for data aggregation with agentic reasoning for query handling in a fragmented real-world domain. The unified database and conversational interface directly address researcher pain points in funding discovery, and the reported user adoption indicates practical interest. However, the absence of any quantitative validation limits assessment of the claimed efficiency gains and system robustness.

major comments (3)

[Abstract] The central claim that the system reduces grant discovery time from 30-45 minutes to under 10 minutes is load-bearing for the feasibility demonstration but is unsupported by any user study, timed comparisons, session logs, A/B testing, or quantitative metrics.
[Abstract] No evaluation metrics, error analysis, precision/recall figures, or validation procedures are described for the aggregation layer's LLM-equipped browser agents in collecting and normalizing data from heterogeneous portals, nor for the ReAct layer's context interpretation, retrieval completeness, or hallucination avoidance.
[Abstract] The manuscript provides no methodology section, experimental results, or details on the hybrid search implementation, database maintenance process, or how transparency in reasoning is achieved to support user trust calibration.

minor comments (1)

[Abstract] The phrase 'compound AI' is invoked without a concise definition or reference to prior literature on the term, which may hinder readers new to the concept.
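The metrics the second major comment asks for are cheap to compute once relevance judgments exist. The sketch below shows two of them: precision at k for ranked results, and a crude hallucination proxy that counts returned grants with no backing database record. The function names and the ID-based matching are assumptions made for illustration, not procedures from the manuscript.

```python
from typing import List, Set


def precision_at_k(ranked_ids: List[str], relevant: Set[str], k: int) -> float:
    """Share of the top-k returned grants that a human judged relevant."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for g in top if g in relevant) / len(top)


def unsupported_rate(returned_ids: List[str], database_ids: Set[str]) -> float:
    """Share of returned grants with no matching database record.

    A crude hallucination proxy: it flags IDs the agent produced that the
    index cannot back up. Grants found via web lookup need a separate check.
    """
    if not returned_ids:
        return 0.0
    missing = [g for g in returned_ids if g not in database_ids]
    return len(missing) / len(returned_ids)
```

Neither number says anything about grants the system failed to return, so a recall estimate against a hand-curated list would still be needed for the completeness question.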

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We recognize that the manuscript is primarily a system description of a deployed compound AI application and lacks the quantitative evaluations and methodological details expected in empirical work. We will revise the manuscript to qualify unsupported claims, add a methodology section, and discuss limitations and observed performance.


Referee: [Abstract] The central claim that the system reduces grant discovery time from 30-45 minutes to under 10 minutes is load-bearing for the feasibility demonstration but is unsupported by any user study, timed comparisons, session logs, A/B testing, or quantitative metrics.

Authors: We agree that the specific time-reduction figures lack rigorous quantitative support and are based on informal internal testing and anecdotal user feedback rather than controlled studies. In the revised manuscript, we will qualify this statement in the abstract to indicate it reflects observed benefits from deployment and early user reports, remove the precise numerical claims, and add a forward-looking note on planned user studies to validate efficiency gains.

revision: partial

Referee: [Abstract] No evaluation metrics, error analysis, precision/recall figures, or validation procedures are described for the aggregation layer's LLM-equipped browser agents in collecting and normalizing data from heterogeneous portals, nor for the ReAct layer's context interpretation, retrieval completeness, or hallucination avoidance.

Authors: The manuscript focuses on architectural description and real-world deployment rather than formal benchmarking. We will add a dedicated 'Validation and Limitations' section describing internal checks for data normalization, the hybrid search strategy for reducing hallucinations (structured index plus selective web retrieval), and observed error patterns from usage logs. We will explicitly note the absence of precision/recall metrics as a current limitation and outline future evaluation plans.

revision: partial

Referee: [Abstract] The manuscript provides no methodology section, experimental results, or details on the hybrid search implementation, database maintenance process, or how transparency in reasoning is achieved to support user trust calibration.

Authors: We will expand the manuscript with a new Methodology section that details the aggregation pipeline (LLM browser agents, normalization rules, and biweekly update process), the hybrid search implementation (structured index construction, ReAct-based query interpretation from text/PDFs, and triggers for web search), and the transparency mechanisms (real-time streaming of reasoning steps). This will support reproducibility and explain how users can calibrate trust.

revision: yes
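The transparency mechanism mentioned in the last response, streaming reasoning steps as they happen, fits naturally with the ReAct pattern of interleaved thoughts, actions, and observations. A minimal sketch follows, assuming a caller-supplied `policy` function in place of the LLM call and a hypothetical `name:argument` action format. None of these names come from the paper.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# (kind, text), where kind is "thought", "action", "observation", or "answer".
Step = Tuple[str, str]


def react_stream(
    question: str,
    policy: Callable[[str, List[Step]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 8,
) -> Iterator[Step]:
    """Yield each reasoning step as soon as it is produced."""
    history: List[Step] = []
    for _ in range(max_steps):
        step = policy(question, history)  # stand-in for an LLM call
        history.append(step)
        yield step
        kind, text = step
        if kind == "answer":
            return
        if kind == "action":
            # Hypothetical action format: "tool_name:argument".
            name, _, arg = text.partition(":")
            result = tools[name](arg) if name in tools else f"unknown tool {name}"
            observation = ("observation", result)
            history.append(observation)
            yield observation
```

Because the function is a generator, a front end can render each thought or tool result the moment it arrives instead of waiting for the final answer, and the `max_steps` cap keeps a confused policy from looping forever.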

Circularity Check

0 steps flagged

No circularity: descriptive system paper with no derivations or fitted predictions


The paper presents a built compound AI system for grant discovery, consisting of an LLM-browser aggregation layer and a ReAct query layer. No mathematical models, equations, first-principles derivations, or statistical predictions appear in the provided text. Claims such as time reduction from 30-45 to under 10 minutes are stated as feasibility demonstrations based on system existence and user adoption (3,000+ users), without any fitted parameters, self-referential definitions, or load-bearing self-citations that reduce to inputs by construction. None of the enumerated circularity patterns apply, as there is no derivation chain to inspect.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on assumptions about the capabilities of current LLM agents in web navigation and reasoning tasks, which are domain assumptions not independently verified in the abstract.

axioms (2)

Domain assumption: LLM-based browser agents can autonomously and accurately extract and normalize grant data from diverse web sources.
This is required for the aggregation layer to function as described.
Domain assumption: The ReAct-based agent can interpret research context and perform hybrid search without introducing hallucinations.
Central to the query processing layer's reliability.
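The first assumption is easier to test when normalization is explicit about what it cannot parse. The sketch below maps raw portal fields onto one shared record and returns `None` for an unreadable deadline rather than guessing. The field names (`title`, `opportunity_name`, `deadline`, `close_date`), the date formats, and the `GrantRecord` schema are all hypothetical; the paper does not publish its schema or normalization rules.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Dict, Optional

# Hypothetical date formats; real portals may use others.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


@dataclass(frozen=True)
class GrantRecord:
    title: str
    agency: str
    deadline: Optional[date]


def parse_deadline(raw: str) -> Optional[date]:
    """Try each known format; return None rather than guess."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    return None


def normalize(raw: Dict[str, str], agency: str) -> GrantRecord:
    """Map one portal's raw fields onto the shared schema."""
    title = (raw.get("title") or raw.get("opportunity_name") or "").strip()
    if not title:
        raise ValueError("record has no title; flag for manual review")
    deadline = parse_deadline(raw.get("deadline") or raw.get("close_date") or "")
    return GrantRecord(title=title, agency=agency, deadline=deadline)
```

Counting how often `deadline` comes back as `None`, or how often `normalize` raises, per source portal would give a first, rough error log of the kind the referee report asks for.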


