Canonical reference

Dataflow: An llm-driven framework for unified data preparation and workflow automation in the era of data-centric ai

Hao Liang, Xiaochen Ma, Zhou Liu, Zhen Hao Wong, Zhengyang Zhao, Zimo Meng, Runming He, Chengyu Shen, Qifeng Cai, Zhaoyang Han, et al · 2025 · arXiv 2512.16676

Canonical reference. 80% of citing Pith papers cite this work as background.

8 Pith papers citing it

Background 80% of classified citations

read on arXiv browse 8 citing papers

citation-role summary

background 4 baseline 1

citation-polarity summary

background 4 baseline 1

representative citing papers

TraceAV-Bench: Benchmarking Multi-Hop Trajectory Reasoning over Long Audio-Visual Videos

cs.CV · 2026-05-08 · unverdicted · novelty 8.0

TraceAV-Bench is the first benchmark for multi-hop trajectory reasoning over long audio-visual videos, showing top models reach only 51-68% accuracy with substantial room for improvement.

PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications

cs.DC · 2026-05-18 · unverdicted · novelty 7.0

PopPy combines an ahead-of-time compiler and runtime to extract parallelism from Python compound AI applications, delivering up to 6.4x end-to-end speedups while preserving sequential semantics.

K12-KGraph: A Curriculum-Aligned Knowledge Graph for Benchmarking and Training Educational LLMs

cs.CL · 2026-05-10 · conditional · novelty 7.0

K12-KGraph is a textbook-derived knowledge graph that powers a new benchmark revealing LLMs' poor curriculum cognition and a small training corpus that outperforms general instruction data on educational tasks.

Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora

cs.SE · 2026-04-27 · unverdicted · novelty 6.0

Structured knowledge extracted from corpora enables test-driven data engineering for LLMs by mapping training data to source code, model training to compilation, benchmarking to unit testing, and failures to targeted data repairs, demonstrated across 16 disciplines.

TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

cs.AI · 2026-04-15 · unverdicted · novelty 6.0

TREX automates the LLM training lifecycle via collaborative agents and tree-based exploration, delivering consistent performance gains across 10 real-world fine-tuning tasks in FT-Bench.

AAFLOW: Scalable Patterns for Agentic AI Workflows

cs.DC · 2026-05-04 · unverdicted · novelty 5.0

AAFLOW is a unified distributed runtime that models agentic workflows as operators with a zero-copy data plane using Apache Arrow and Cylon, achieving up to 4.64x pipeline speedup through improved data flow and batching.

DataArc-SynData-Toolkit: A Unified Closed-Loop Framework for Multi-Path, Multimodal, and Multilingual Data Synthesis

cs.LG · 2026-05-02 · unverdicted · novelty 3.0

DataArc-SynData-Toolkit is an open-source, configuration-driven framework that unifies synthetic data generation for multimodal, multilingual, and multi-task LLM training with improved usability and quality control.

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

cs.CV · 2026-04-06

citing papers explorer

Showing 8 of 8 citing papers.

TraceAV-Bench: Benchmarking Multi-Hop Trajectory Reasoning over Long Audio-Visual Videos cs.CV · 2026-05-08 · unverdicted · none · ref 6
TraceAV-Bench is the first benchmark for multi-hop trajectory reasoning over long audio-visual videos, showing top models reach only 51-68% accuracy with substantial room for improvement.
PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications cs.DC · 2026-05-18 · unverdicted · none · ref 45
PopPy combines an ahead-of-time compiler and runtime to extract parallelism from Python compound AI applications, delivering up to 6.4x end-to-end speedups while preserving sequential semantics.
K12-KGraph: A Curriculum-Aligned Knowledge Graph for Benchmarking and Training Educational LLMs cs.CL · 2026-05-10 · conditional · none · ref 14
K12-KGraph is a textbook-derived knowledge graph that powers a new benchmark revealing LLMs' poor curriculum cognition and a small training corpus that outperforms general instruction data on educational tasks.
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora cs.SE · 2026-04-27 · unverdicted · none · ref 28
Structured knowledge extracted from corpora enables test-driven data engineering for LLMs by mapping training data to source code, model training to compilation, benchmarking to unit testing, and failures to targeted data repairs, demonstrated across 16 disciplines.
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration cs.AI · 2026-04-15 · unverdicted · none · ref 25
TREX automates the LLM training lifecycle via collaborative agents and tree-based exploration, delivering consistent performance gains across 10 real-world fine-tuning tasks in FT-Bench.
AAFLOW: Scalable Patterns for Agentic AI Workflows cs.DC · 2026-05-04 · unverdicted · none · ref 7
AAFLOW is a unified distributed runtime that models agentic workflows as operators with a zero-copy data plane using Apache Arrow and Cylon, achieving up to 4.64x pipeline speedup through improved data flow and batching.
DataArc-SynData-Toolkit: A Unified Closed-Loop Framework for Multi-Path, Multimodal, and Multilingual Data Synthesis cs.LG · 2026-05-02 · unverdicted · none · ref 6
DataArc-SynData-Toolkit is an open-source, configuration-driven framework that unifies synthetic data generation for multimodal, multilingual, and multi-task LLM training with improved usability and quality control.
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models cs.CV · 2026-04-06 · unreviewed · ref 75

Dataflow: An llm-driven framework for unified data preparation and workflow automation in the era of data-centric ai

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer