ReSequel uses LLMs guided by metadata-derived templates and sampling-based verification to rewrite SQL queries, delivering up to 16x workload speedups over native DBMSs and 22x over prior LLM baselines across eight benchmarks and three systems.
Chang and Longling Geng
6 Pith papers cite this work. Polarity classification is still indexing.
years
2026 6representative citing papers
ClawVM introduces a harness-managed virtual memory system for LLM agents that ensures deterministic residency and durability of state under token budgets by using typed pages and validated writeback.
AsymCache combines Multi-Segment Attention, position-aware eviction, and adaptive chunking to cut TTFT by up to 2.03x and TPOT by up to 1.71x versus recent baselines in LLM serving.
SPIN enforces DAG-valid plans and prefix-based stopping for LLM agents, cutting executed tasks from 1061 to 623 and tool calls from 11.81 to 6.82 per run on AssetOpsBench while raising success from 0.638 to 0.706.
ProfiLLM deploys tool-augmented LLM agents to generate reusable global knowledge and utility-selected user profiles, delivering up to 6.14% AUC lift and measurable GMV gains in DiDi's live dispatcher.
Relational engines achieve faster SQL+vector-search queries on GPU than CPU when using compact vector indexes and fast interconnects, reversing the CPU-only design in current systems.
citing papers explorer
-
ReSequel: Robust LLM-assisted Query Rewriting and Optimization using Templatization and Sampling
ReSequel uses LLMs guided by metadata-derived templates and sampling-based verification to rewrite SQL queries, delivering up to 16x workload speedups over native DBMSs and 22x over prior LLM baselines across eight benchmarks and three systems.
-
ClawVM: Harness-Managed Virtual Memory for Stateful Tool-Using LLM Agents
ClawVM introduces a harness-managed virtual memory system for LLM agents that ensures deterministic residency and durability of state under token budgets by using typed pages and validated writeback.
-
Multi-Segment Attention: Enabling Efficient KV-Cache Management for Faster Large Language Model Serving
AsymCache combines Multi-Segment Attention, position-aware eviction, and adaptive chunking to cut TTFT by up to 2.03x and TPOT by up to 1.71x versus recent baselines in LLM serving.
-
SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks
SPIN enforces DAG-valid plans and prefix-based stopping for LLM agents, cutting executed tasks from 1061 to 623 and tool calls from 11.81 to 6.82 per run on AssetOpsBench while raising success from 0.638 to 0.706.
-
ProfiLLM: Utility-Aligned Agentic User Profiling for Industrial Ride-Hailing Dispatch
ProfiLLM deploys tool-augmented LLM agents to generate reusable global knowledge and utility-selected user profiles, delivering up to 6.14% AUC lift and measurable GMV gains in DiDi's live dispatcher.
-
To GPU or Not to GPU: Vector Search in Relational Engines
Relational engines achieve faster SQL+vector-search queries on GPU than CPU when using compact vector indexes and fast interconnects, reversing the CPU-only design in current systems.