Embedding Temporal Logic (ETL) performs runtime monitoring directly in learned embedding spaces using distance-based predicates composed with temporal operators, supported by conformal calibration for reliable predicate evaluation.
Title resolution pending
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 9roles
background 1polarities
background 1representative citing papers
VLMs as judges exhibit informativeness bias by favoring detailed but image-inconsistent answers; BIRCH mitigates it by first correcting answers against the image, reducing bias up to 17% and improving performance up to 9.8%.
OProver-32B achieves top Pass@32 scores on MiniF2F, ProverBench, and PutnamBench by combining continued pretraining with iterative agentic proving, retrieval, SFT on repairs, and RL on unresolved cases using a 6.86M-proof dataset.
PA-BDM adapts block diffusion by switching to causal intra-block denoising and dynamically committing reliable prefixes to KV cache, yielding higher accuracy and 71.6% higher throughput than a comparable baseline on document benchmarks.
GOMA refines frozen multimodal embeddings via modality-aware graph signal smoothing on attributed graphs to improve retrieval while avoiding over-smoothing.
Deep Pre-Alignment uses a small VLM perceiver instead of ViT to pre-align visual features with LLM text space, yielding 1.9-3.0 point gains on multimodal benchmarks and 32.9% less language forgetting.
High OCR accuracy on standard metrics does not guarantee strong downstream RAG performance because structural and semantic errors cause retrieval and generation failures on challenging industrial documents.
citing papers explorer
-
Runtime Monitoring of Perception-Based Autonomous Systems via Embedding Temporal Logic
Embedding Temporal Logic (ETL) performs runtime monitoring directly in learned embedding spaces using distance-based predicates composed with temporal operators, supported by conformal calibration for reliable predicate evaluation.
-
When Vision-Language Models Judge Without Seeing: Exposing Informativeness Bias
VLMs as judges exhibit informativeness bias by favoring detailed but image-inconsistent answers; BIRCH mitigates it by first correcting answers against the image, reducing bias up to 17% and improving performance up to 9.8%.
-
OProver: A Unified Framework for Agentic Formal Theorem Proving
OProver-32B achieves top Pass@32 scores on MiniF2F, ProverBench, and PutnamBench by combining continued pretraining with iterative agentic proving, retrieval, SFT on repairs, and RL on unresolved cases using a 6.86M-proof dataset.
-
Prefix-Adaptive Block Diffusion for Efficient Document Recognition
PA-BDM adapts block diffusion by switching to causal intra-block denoising and dynamically committing reliable prefixes to KV cache, yielding higher accuracy and 71.6% higher throughput than a comparable baseline on document benchmarks.
-
GOMA: Toward Structure-Driven Multimodal Alignment from a Graph Signal Smoothing Perspective
GOMA refines frozen multimodal embeddings via modality-aware graph signal smoothing on attributed graphs to improve retrieval while avoiding over-smoothing.
-
Deep Pre-Alignment for VLMs
Deep Pre-Alignment uses a small VLM perceiver instead of ViT to pre-align visual features with LLM text space, yielding 1.9-3.0 point gains on multimodal benchmarks and 32.9% less language forgetting.
-
When Good OCR Is Not Enough: Benchmarking OCR Robustness for Retrieval-Augmented Generation
High OCR accuracy on standard metrics does not guarantee strong downstream RAG performance because structural and semantic errors cause retrieval and generation failures on challenging industrial documents.
- Stability Implies Redundancy: Delta Attention Selective Halting for Efficient Long-Context Prefilling
- MACS: Modality-Aware Capacity Scaling for Efficient Multimodal MoE Inference