In Proceedings of the 5th ACM International Conference on AI in Finance, pages 266–273
4 Pith papers cite this work. Polarity classification is still indexing.
Citation role and polarity summary
Verdicts: UNVERDICTED (4)
Roles: background (2)
Polarities: background (2)
Representative citing papers
CoRM-RAG uses a cognitive perturbation protocol to simulate biases and trains an Evidence Critic to retrieve documents that support correct decisions even under adversarial query changes.
CoE applies vision-language models directly to document screenshots to deliver pixel-level bounding-box attribution for evidence in iterative retrieval-augmented generation, outperforming text baselines on visual-layout tasks.
SCOUT uses token saliency analysis to detect both standard and contextually-plausible backdoor attacks in language models while maintaining clean accuracy.
citing papers explorer
- EVGeoQA: Benchmarking LLMs on Dynamic, Multi-Objective Geo-Spatial Exploration
  EVGeoQA benchmark and GeoRover framework show LLMs can use tools for sub-tasks in dynamic geo-spatial exploration but struggle with long-range planning, with an emergent ability to improve via historical trajectory summaries.
- Beyond Semantic Relevance: Counterfactual Risk Minimization for Robust Retrieval-Augmented Generation
  CoRM-RAG uses a cognitive perturbation protocol to simulate biases and trains an Evidence Critic to retrieve documents that support correct decisions even under adversarial query changes.
- Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation
  CoE applies vision-language models directly to document screenshots to deliver pixel-level bounding-box attribution for evidence in iterative retrieval-augmented generation, outperforming text baselines on visual-layout tasks.
- SCOUT: A Defense Against Data Poisoning Attacks in Fine-Tuned Language Models
  SCOUT uses token saliency analysis to detect both standard and contextually-plausible backdoor attacks in language models while maintaining clean accuracy.