Large language models in healthcare and medical domain: A review
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 2 unverdicted
citing papers explorer
- CacheClip: Accelerating RAG with Effective KV Cache Reuse
  CacheClip accelerates RAG prefill by up to 3.33x via auxiliary-model-guided selective KV recomputation while retaining 85-91% of full-attention quality on NIAH and LongBench.
- First, Do No Harm (With LLMs): Mitigating Racial Bias via Agentic Workflows
  All five tested LLMs deviated from US race-stratified disease distributions in synthetic case generation, while retrieval-based agentic workflows improved mean p-value by 0.0348, median p-value by 0.1166, and mean difference by 0.0949 for DeepSeek V3 in diagnosis ranking.