LiveBench is a contamination-limited LLM benchmark with auto-scored challenging tasks from recent sources across math, coding, reasoning and more, where top models score below 70%.
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
representative citing papers
ConStruM improves LLM-based schema matching by using a context tree and global similarity hypergraph to assemble query-specific evidence packs from available schema metadata.
LLM agents enable universal interoperability by serving as automatic translators and adapters between proprietary digital services.
citing papers explorer
- LiveBench: A Challenging, Contamination-Limited LLM Benchmark
  LiveBench is a contamination-limited LLM benchmark with auto-scored challenging tasks from recent sources across math, coding, reasoning and more, where top models score below 70%.
- ConStruM: A Structure-Guided LLM Framework for Context-Aware Schema Matching
  ConStruM improves LLM-based schema matching by using a context tree and global similarity hypergraph to assemble query-specific evidence packs from available schema metadata.
- LLM Agents Are the Antidote to Walled Gardens
  LLM agents enable universal interoperability by serving as automatic translators and adapters between proprietary digital services.