Financial report chunking for effective retrieval augmented generation
10 Pith papers cite this work.
representative citing papers
Tree reasoning outperforms vector search on complex document queries, but a hybrid approach balances results across tiers, with validation showing an 11.7-point gap on real finance documents.
MultiFinRAG is a multimodal RAG framework that improves accuracy on financial QA tasks involving text, tables, and images by 19 percentage points over ChatGPT-4o while running on commodity hardware.
citing papers explorer
- IntrAgent: An LLM Agent for Content-Grounded Information Retrieval through Literature Review
  IntrAgent uses a two-stage pipeline of section ranking and iterative reading to perform content-grounded literature information retrieval, achieving 13.2% higher accuracy than RAG and agent baselines on the new IntraBench benchmark.
- SHM-Agents: A Generalist-Specialist Integrated Agent System for Structural Health Monitoring
  SHM-Agents is an LLM-plus-specialist-agent framework that claims to execute a wide range of SHM tasks end-to-end via natural language on data from a long-span cable-stayed bridge.
- Empirical Evaluation of PDF Parsing and Chunking for Financial Question Answering with RAG
  Systematic tests show that specific PDF parsers combined with overlapping chunking strategies better preserve structure and improve RAG answer correctness on financial QA benchmarks including the new TableQuest dataset.
- RefineRAG: Word-Level Poisoning Attacks via Retriever-Guided Text Refinement
  RefineRAG achieves 90% attack success on NQ by generating toxic seeds then optimizing them via retriever-in-the-loop word refinement, outperforming prior methods on effectiveness and naturalness.
- MimirRAG: A Multi-Agent RAG Framework for Financial Data Retrieval with Metadata Integration
  MimirRAG, a multi-agent RAG framework with metadata integration and table-aware chunking, reaches 89.3% accuracy on FinanceBench and outperforms prior baselines for financial document retrieval.
- Architecture Matters More Than Scale: A Comparative Study of Retrieval and Memory Augmentation for Financial QA Under SME Compute Constraints
  Structured memory improves precision on deterministic financial calculations while retrieval-augmented generation outperforms in conversational settings, supporting a hybrid deployment framework for resource-constrained SMEs.
- Evaluating Chunking Strategies for Retrieval-Augmented Generation on Academic Texts
  Cluster-based semantic chunking does not outperform fixed-size or recursive chunking for RAG on academic theses, and RAGAs faithfulness shows limited reliability in this setup.