PopPy: Opportunistically Exploiting Parallelism in Python Compound AI Applications

David Mell; Konstantinos Kallas; Osbert Bastani; Stephen Mell; Steve Zdancewic

arXiv: 2605.18697 · v1 · pith:CVTDHRF2new · submitted 2026-05-18 · 💻 cs.DC · cs.AI · cs.PL

Stephen Mell, David Mell, Konstantinos Kallas, Steve Zdancewic, Osbert Bastani

Pith reviewed 2026-05-20 07:59 UTC · model grok-4.3

classification 💻 cs.DC · cs.AI · cs.PL

keywords compound AI · Python parallelism · dynamic dispatch · runtime optimization · external model calls · end-to-end latency · opportunistic parallelization

The pith

PopPy finds safe ways to run Python compound AI calls in parallel for up to 6.4 times faster end-to-end execution while keeping sequential behavior unchanged.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops PopPy to locate parallel execution opportunities inside Python programs that string together calls to machine learning models and other slow external services. These compound applications spend most of their time waiting on the external parts, so ordinary language compilers offer little help. PopPy pairs an ahead-of-time compiler with a runtime that tracks dynamic method selection and variable changes to decide when independent external calls can safely overlap. It requires only minimal extra annotations from the programmer. A reader would care because many practical AI tools are built this way and shorter total run times would make them more responsive without forcing developers to rewrite their code or accept different results.

Core claim

PopPy uncovers parallelization opportunities in Python applications that invoke heavy external components by using an ahead-of-time compiler paired with a runtime. The system manages language complexity, dynamic dispatch, and variable mutation to extract parallelism while maintaining the original sequential semantics. Experiments on real-world compound AI applications demonstrate end-to-end speedups reaching 6.4 times over standard Python execution with only minimal developer input.

What carries the argument

An ahead-of-time compiler combined with a runtime that tracks dependencies to safely parallelize external calls in dynamic Python code.
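
As a rough illustration of the kind of overlap being targeted (this is not PopPy's API, which this summary does not show), the sketch below issues independent slow calls from a thread pool. `fake_llm` is a hypothetical stand-in for an external model call, and the 50 ms sleep is an assumed latency. Results come back in program order, so the caller sees the same values as under sequential execution.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a slow external model call."""
    time.sleep(0.05)  # simulated network/model latency (assumed value)
    return prompt.upper()


def run_sequential(prompts):
    # One call at a time: total latency is the sum of the call latencies.
    return [fake_llm(p) for p in prompts]


def run_overlapped(prompts):
    # Independent calls are in flight together; map() yields results in
    # input order, so the caller observes the same list as above.
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(fake_llm, prompts))


prompts = ["a", "a", "b", "b"]
assert run_sequential(prompts) == run_overlapped(prompts)
```

Here the programmer picked the calls to overlap by hand. The paper's claim is that a compiler and runtime can find such opportunities without that manual restructuring.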

If this is right

End-to-end latency of compound AI applications drops because independent external model calls can execute at the same time.
Programs that use dynamic dispatch and mutate variables still receive speedups without losing correctness.
Developers obtain the gains after adding only small annotations rather than restructuring their code.
The original sequential behavior of the program remains exactly the same.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same opportunistic tracking of dependencies could be applied to other dynamic languages that orchestrate external AI services.
Integration into standard Python runtimes might eventually make this style of parallelism available without any code changes at all.
Testing the approach on larger or more varied workloads could expose additional safe parallel patterns not yet considered.

Load-bearing premise

The runtime can accurately detect all data dependencies and mutation effects across the supported Python fragment so that parallel schedules never alter observable results.
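
To make the premise concrete, here is a minimal sketch of dependency-ordered execution under a strong simplifying assumption: every call declares by hand which state keys it reads and writes. The `reads`/`writes` sets, `conflicts`, and `run_tracked` are hypothetical names for illustration; the paper describes a runtime that tracks such effects itself, and its mechanism may differ substantially.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def conflicts(a, b):
    """Two calls conflict if one writes a key the other reads or writes."""
    return bool(a["writes"] & (b["reads"] | b["writes"])
                or b["writes"] & a["reads"])


def run_tracked(calls, state):
    """Run calls concurrently, but keep program order between conflicts."""
    futures = []
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        for i, call in enumerate(calls):
            deps = [futures[j] for j in range(i) if conflicts(calls[j], call)]

            def task(call=call, deps=deps):
                for dep in deps:
                    dep.result()  # wait for earlier conflicting calls
                call["fn"](state)

            futures.append(pool.submit(task))
        for future in futures:
            future.result()  # surface any exception
    return state


def run_in_order(calls, state):
    for call in calls:
        call["fn"](state)
    return state


def set_x(s):
    time.sleep(0.02)  # slow external call that mutates x
    s["x"] = "foo"


def read_x(s):
    s["y"] = s.get("x", "unset") + "!"  # must observe the write to x


def set_z(s):
    time.sleep(0.02)  # independent of x and y, free to overlap
    s["z"] = 1


CALLS = [
    {"fn": set_x, "reads": set(), "writes": {"x"}},
    {"fn": read_x, "reads": {"x"}, "writes": {"y"}},
    {"fn": set_z, "reads": set(), "writes": {"z"}},
]
assert run_tracked(CALLS, {}) == run_in_order(CALLS, {})
```

If the `reads` set of `read_x` were wrongly left empty, `read_x` could run before `set_x` finishes and store `"unset!"`. That is the failure mode the premise rules out: a missed dependency changes results rather than merely costing speed.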

What would settle it

Running PopPy on the evaluated compound AI applications and observing either no improvement over ordinary Python or any change in final outputs relative to sequential execution would refute the claim.
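
A check of this kind could be set up roughly as follows. The harness is a sketch under assumptions: `observe` and `settle` are hypothetical helpers, "observable result" is narrowed to the return value plus captured standard output, and the two toy programs merely stand in for a sequential run and a candidate parallel run.

```python
import io
import time
from contextlib import redirect_stdout


def observe(program, *args):
    """Run program, returning (return value, captured stdout, seconds)."""
    buf = io.StringIO()
    start = time.perf_counter()
    with redirect_stdout(buf):
        value = program(*args)
    return value, buf.getvalue(), time.perf_counter() - start


def settle(baseline, candidate, *args):
    """Compare a candidate executor against the sequential baseline."""
    base_val, base_out, base_s = observe(baseline, *args)
    cand_val, cand_out, cand_s = observe(candidate, *args)
    return {
        "same_output": (base_val, base_out) == (cand_val, cand_out),
        "speedup": base_s / cand_s if cand_s > 0 else float("inf"),
    }


# Hypothetical programs: same return value, different print order.
def baseline(n):
    for i in range(n):
        print(i)
    return n


def reordered(n):
    for i in reversed(range(n)):  # deliberately changes observable output
        print(i)
    return n
```

A `same_output` of `False` on any evaluated application, or a `speedup` that stays at or below 1 across all of them, would count against the claim.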

Figures

Figures reproduced from arXiv: 2605.18697 by David Mell, Konstantinos Kallas, Osbert Bastani, Stephen Mell, Steve Zdancewic.

**Figure 2.** Illustration of the execution of get_values with states = ("a", "a", "b", "b"), after queueing all external calls but before any have resolved. Internal code execution tree (left): Each code block that is executed at runtime is shown, including duplicates from different loop iterations; block borders are colored to correspond to source blocks in …

**Figure 1.** Tree-of-Thoughts [74] implementation in Python. The @_ lines are the annotations that need to be added to a program to be supported by PopPy. L1-L29 are provided by the application developer, while L32-L35 are provided by library developers. The colored bars indicate specific code blocks referenced in …

**Figure 3.** The PopPy system architecture. Code (static) and processes (dynamic) are shown in boxes; transformations are shown in bubbles.

**Figure 4.** An example of the second compiler phase. Mutation, sequencing, and control-flow are converted to function calls. In $\lambda^O$, load, store, and ite are external functions.

**Figure 5.** Median speedup of PopPy execution over Python across 10 trials. From CaMeL (C-𝑛) we only include applications that make at least one LLM call. Error bars show minimum to maximum speedup across trials.

**Figure 6.** A single execution trace of ToT (with 2 steps of search and beam size 3), showing selected external calls. Dashed lines indicate the time between queueing and dispatch; solid lines indicate the time between dispatch and resolution. (LLM calls dispatch immediately; print calls resolve immediately). Calls are sorted from top to bottom by the order in which they would be executed under sequential execution.…

**Figure 7.** Absolute execution time overhead of PopPy vs the Python execution time, for each benchmark (median over 10 trials). Overhead is the time spent inside the $\lambda^O$ interpreter, with all external calls annotated as sequential.
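
The Figure 4 caption says mutation, sequencing, and control flow are converted to function calls, with load, store, and ite as external functions. One way to picture that, offered as a hedged sketch rather than the paper's actual encoding, is to thread an immutable memory state through ordinary calls. The Python signatures below are assumptions made for illustration.

```python
from types import MappingProxyType


def store(mem, key, value):
    """Return a new memory state in which key maps to value."""
    updated = dict(mem)
    updated[key] = value
    return MappingProxyType(updated)  # read-only view: old states stay intact


def load(mem, key):
    """Read key from a memory state."""
    return mem[key]


def ite(cond, then_fn, else_fn):
    """If-then-else as a function; only the chosen branch is evaluated."""
    return then_fn() if cond else else_fn()


# Source-level program being modelled (assumed for illustration):
#     x = "foo"
#     if x == "foo": y = 1
#     else:          y = 2
M0 = MappingProxyType({})
M1 = store(M0, "x", "foo")
M2 = ite(load(M1, "x") == "foo",
         lambda: store(M1, "y", 1),
         lambda: store(M1, "y", 2))
```

Because each memory state is a value, the fact that the load of `x` depends on the preceding store is visible in the data flow (it takes `M1` as an argument) instead of being implied by statement order.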

Original abstract

Compound AI applications, which compose calls to ML models using a general-purpose programming language like Python, are widely used for a variety of user-facing tasks, from software engineering to enterprise automation, making their end-to-end latency a critical bottleneck. In contrast to traditional applications, execution time is dominated by the external components, which cannot be handled by traditional language optimization systems, like optimizing compilers. To address this problem, we develop PopPy, a system that can uncover parallelization opportunities in Python applications that invoke these heavy external components, including those used in compound AI applications. PopPy supports a very expressive fragment of Python and requires minimal developer input to uncover parallelism. It combines an ahead-of-time compiler with a runtime, addressing three key challenges in extracting parallelism from Python applications: language complexity, dynamic dispatch, and variable mutation. On a set of real-world compound AI applications, PopPy achieves up to $6.4\times$ speedups in end-to-end execution time compared to standard Python execution while preserving the sequential program semantics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PopPy combines AOT compilation and runtime checks to parallelize Python code around external ML calls, but the soundness of its dependency analysis for mutations and dispatch is the key open question.

PopPy targets a real pain point: Python programs that spend most of their time calling out to ML models or other heavy external services. The system tries to find safe parallelization opportunities with only minimal developer hints, using an ahead-of-time compiler plus runtime support to cope with Python's dynamic features. On a handful of real compound AI applications the authors report up to 6.4× end-to-end speedups while claiming to preserve sequential semantics. That combination of focus and reported numbers is the main takeaway worth passing along.

Referee Report

2 major / 2 minor

Summary. The paper presents PopPy, a system designed to uncover and exploit parallelism in Python compound AI applications that invoke external ML models and other heavy components. By combining an ahead-of-time compiler with a runtime system, PopPy addresses challenges of language complexity, dynamic dispatch, and variable mutation with minimal developer input. The main result is that on real-world compound AI applications, it achieves up to 6.4× speedups in end-to-end execution time compared to standard Python while preserving the sequential program semantics.

Significance. If the soundness of the dependency tracking holds, this work has the potential to significantly impact the performance of latency-sensitive compound AI systems by automatically parallelizing independent external calls without requiring changes to the program's observable behavior. The support for an expressive fragment of Python and the evaluation on real-world applications are positive aspects. The opportunistic nature of the parallelization could make it practical for developers working on software engineering and enterprise automation tasks.

major comments (2)

§4.2: The description of the combined AOT and runtime analysis for tracking variable mutations does not include a formal argument or sufficient test cases demonstrating that all possible mutation effects under dynamic dispatch are captured; this is load-bearing for the semantic preservation claim since a missed dependence would lead to incorrect results rather than just reduced performance.
§5.1: The evaluation on real-world applications reports speedups but provides limited details on the methodology, such as the specific benchmarks used, number of runs, or error bars, which are necessary to substantiate the 6.4× claim.

minor comments (2)

Abstract: Consider adding the number of applications evaluated to give context to the 'up to 6.4×' speedup.
§3: The notation used for describing the supported Python fragment could benefit from additional examples to improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and positive assessment of PopPy's potential impact. We address each major comment below and describe the changes planned for the revised manuscript.

Referee: §4.2: The description of the combined AOT and runtime analysis for tracking variable mutations does not include a formal argument or sufficient test cases demonstrating that all possible mutation effects under dynamic dispatch are captured; this is load-bearing for the semantic preservation claim since a missed dependence would lead to incorrect results rather than just reduced performance.

Authors: We agree that soundness of the mutation tracking is critical to the semantic-preservation guarantee. The current manuscript explains the AOT conservative analysis for identifying mutation sites together with runtime checks that resolve dynamic dispatch and actual mutations at execution time. We did not supply a formal proof or an exhaustive test suite in the submission. In the revision we will expand §4.2 with a new subsection containing additional concrete test cases that exercise a range of dynamic-dispatch and mutation patterns; we will also articulate the key invariants maintained by the combined analysis. A complete mechanized proof for the full supported Python fragment is beyond the scope of this systems paper and will be noted as future work.

revision: partial

Referee: §5.1: The evaluation on real-world applications reports speedups but provides limited details on the methodology, such as the specific benchmarks used, number of runs, or error bars, which are necessary to substantiate the 6.4× claim.

Authors: We thank the referee for highlighting this presentational gap. The revised §5.1 will explicitly list each benchmark (including source repositories or application names), state that each measurement is the median of ten runs, and include error bars showing the min/max or standard deviation across those runs. These additions will directly support the reported speedups.

revision: yes
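
The statistic described here (a median over ten runs with min/max error bars) can be computed along the following lines. This is a sketch under assumptions: `summarize_speedup` is a hypothetical helper, trials are paired one-to-one, and the timings are made up. The paper may instead take a ratio of medians or pair trials differently.

```python
from statistics import median


def summarize_speedup(python_s, poppy_s):
    """Median speedup with min/max bounds from paired per-trial times."""
    if len(python_s) != len(poppy_s) or not python_s:
        raise ValueError("need equal, non-empty trial lists")
    ratios = [p / q for p, q in zip(python_s, poppy_s)]
    return {"median": median(ratios), "min": min(ratios), "max": max(ratios)}


# Made-up timings in seconds, for illustration only (not from the paper).
python_s = [12.0, 12.4, 11.8, 12.1, 12.3, 12.0, 11.9, 12.2, 12.5, 12.0]
poppy_s = [3.0, 3.1, 2.95, 3.0, 3.2, 3.0, 3.05, 2.9, 3.1, 3.0]
summary = summarize_speedup(python_s, poppy_s)
```

Reporting the min and max alongside the median shows how much a headline figure depends on a favorable trial, which matters when external call latency varies from run to run.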

Circularity Check

0 steps flagged

No circularity detected in system architecture or evaluation claims

The paper presents PopPy as an implemented engineering system that combines ahead-of-time compilation with runtime analysis to identify parallelization opportunities in an expressive Python fragment. Claims of up to 6.4× speedups rest on empirical measurements against standard Python execution on real-world compound AI applications, not on any mathematical derivation, fitted parameter renamed as prediction, or self-citation chain. No equations, uniqueness theorems, or ansatzes are invoked that reduce to the paper's own inputs by construction. The soundness of dependency tracking for mutation and dynamic dispatch is an engineering precondition evaluated externally via application benchmarks rather than internally derived.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

The central claim rests on the correctness of the PopPy implementation for handling Python language features and external calls; no free parameters, axioms, or invented entities beyond the system itself are described in the abstract.

invented entities (1)

PopPy system (no independent evidence)
purpose: Uncover and exploit parallelization opportunities in Python applications with external components
The paper introduces PopPy as the core new artifact that addresses language complexity, dynamic dispatch, and variable mutation.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/ArrowOfTime.lean arrow_from_z unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

PopPy combines an ahead-of-time compiler with a runtime, addressing three key challenges in extracting parallelism from Python applications: language complexity, dynamic dispatch, and variable mutation.

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean embed_injective unclear

unclear
Relation between the paper passage and the cited Recognition theorem.

We define the equivalence relation over traces ≡A such that t1 ≡A t2 iff t1 can be transformed into t2 via external call reorderings allowed by A.
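
To make the quoted definition concrete, the sketch below decides equivalence for a simplified reading of it, in which A is modelled as a set of label pairs that may be swapped when adjacent. That modelling is an assumption; the paper's A may be richer. The check relies on the standard projection characterization of trace equivalence: two traces are equivalent exactly when they contain the same calls and agree on the relative order of every pair that is not allowed to commute.

```python
from itertools import combinations_with_replacement


def project(trace, labels):
    """Keep only the calls whose label is in labels, preserving order."""
    return [c for c in trace if c in labels]


def equivalent(t1, t2, commutes):
    """True iff t1 can be turned into t2 by swapping adjacent commuting calls.

    commutes is a set of frozensets {a, b} of labels that may be reordered;
    it stands in for the paper's A, whose real form is not given here.
    """
    if sorted(t1) != sorted(t2):
        return False  # different multisets of calls
    labels = sorted(set(t1))
    for a, b in combinations_with_replacement(labels, 2):
        if a != b and frozenset((a, b)) in commutes:
            continue  # order between commuting calls is unconstrained
        if project(t1, {a, b}) != project(t2, {a, b}):
            return False  # a dependent pair appears in a different order
    return True


# Hypothetical labels: two LLM calls may be reordered, prints may not.
commutes = {frozenset(("llm1", "llm2"))}
```

With these labels, `["llm1", "llm2", "print"]` and `["llm2", "llm1", "print"]` are equivalent, while moving `print` ahead of either LLM call is not.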

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

84 extracted references · 84 canonical work pages · 6 internal anchors

[1] 2023. Official Repo of Tree of Thoughts. https://github.com/princeton-nlp/tree-of-thought-llm
[2] Duane A. Adams. 1968. A Computation Model with Data-Sequenced Control. Technical Report CGTM 45. Stanford University.
[3] Duane A. Adams. 1969. A Computation Model with Data Flow Sequencing. Ph.D. Dissertation.
[4] Jason Ansel, Edward Yang, Horace He, Natalia Gimelshein, Animesh Jain, Michael Voznesensky, Bin Bao, Peter Bell, David Berard, Evgeni Burovski, et al. 2024. Pytorch 2: Faster machine learning through dynamic python bytecode transformation and graph compilation. In Proceedings of the 29th ACM international conference on architectural support for programmin...
[5] Anthropic. 2024. The Claude 3 Model Family: Opus, Sonnet, Haiku. https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf
[6] Anthropic. 2025. How we built our multi-agent research system. https://www.anthropic.com/engineering/multi-agent-research-system. Accessed: 2026-04-01
[7] Sotiris Apostolakis, Ziyang Xu, Greg Chan, Simone Campanoni, and David I August. 2020. Perspective: A sensible approach to speculative automatic parallelization. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. 351–367.
[8] Andrew W. Appel. 1991. Compiling with Continuations. Cambridge University Press.
[9] Rishiyur S Nikhil Arvind. 1992. Id: a language with implicit parallelism. In A Comparative Study of Parallel Programming Languages. Elsevier, 169–215.
[10] Stefanos Baziotis, Daniel Kang, and Charith Mendis. 2024. Dias: Dynamic rewriting of Pandas code. Proceedings of the ACM on Management of Data 2, 1 (2024), 1–27.
[11] Luca Beurer-Kellner, Marc Fischer, and Martin T. Vechev. 2023. Prompting Is Programming: A Query Language for Large Language Models. In Proceedings of the 44th ACM SIGPLAN International Conference on Programming Language Design and Implementation. ACM, 1946–1969. https://doi.org/10.1145/3591300
[12] Carl Friedrich Bolz, Antonio Cuni, Maciej Fijalkowski, and Armin Rigo. 2009. Tracing the Meta-Level: PyPy’s Tracing JIT Compiler. In Proceedings of the 4th Workshop on the Implementation, Compilation, Optimization of Object-Oriented Languages and Programming Systems (ICOOOLPS). ACM, 18–25. https://doi.org/10.1145/1565824.1565827
[13] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, et al. 2021. Jax: Autograd and xla. Astrophysics Source Code Library (2021), ascl–2111.
[14] Brandt Bucher and Savannah Ostrowski. 2024. PEP 744: JIT Compilation. https://peps.python.org/pep-0744/. Python Enhancement Proposal, Draft status.
[15] Harrison Chase. 2023. LangChain. https://github.com/langchain-ai/langchain
[16] Gohar Irfan Chaudhry, Esha Choukse, Íñigo Goiri, Rodrigo Fonseca, Adam Belay, and Ricardo Bianchini. 2025. Towards Resource-Efficient Compound AI Systems. In Proceedings of the 2025 Workshop on Hot Topics in Operating Systems (Banff, AB, Canada) (HotOS ’25). Association for Computing Machinery, New York, NY, USA, 218–224. https://doi.org/10.1145/3713082.3730377
[17] Alonzo Church. 1941. The Calculi of Lambda-Conversion. Annals of Mathematics Studies, Vol. 6. Princeton University Press.
[18–19] Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, and Christopher Ré. 2022. FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. In Advances in Neural Information Processing Systems, Vol. 35. 16344–16359.
[20] Alan L. Davis and Robert M. Keller. 1982. Data Flow Program Graphs. Computer 15, 02 (2 1982), 26–41.
[21] Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, and Florian Tramèr. 2026. Defeating Prompt Injections by Design. arXiv preprint arXiv:2503.18813. In IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). https://arxiv.org/abs/2503.18813
[22] Edoardo Debenedetti, Jie Zhang, Mislav Balunovic, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents. Advances in Neural Information Processing Systems 37 (2024), 82895–82920.
[23] Jack B. Dennis. 1974. First version of a data flow procedure language. In Programming Symposium, B. Robinet (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 362–376.
[24–25] Yu Feng, Phu Mon Htut, Zheng Qi, Wei Xiao, Manuel Mager, Nikolaos Pappas, Kishaloy Halder, Yang Li, Yassine Benajiba, and Dan Roth. 2025. Rethinking LLM Uncertainty: A Multi-Agent Approach to Estimating Black-Box Model Uncertainty. In Findings of the Association for Computational Linguistics: EMNLP 2025, Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, and Violet Peng (Eds.). Association for Computational Linguistics, Suzhou, China, 12349–12375. https://doi.org/10.18653/v1...
[26] Yu Feng, Ben Zhou, Weidong Lin, and Dan Roth. 2025. BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models. In The Thirteenth International Conference on Learning Representations. https://openreview.net/forum?id=fAAaT826Vv
[27] John T. Feo, David C. Cann, and Rodney R. Oldehoeft. 1990. A report on the sisal language project. J. Parallel and Distrib. Comput. 10, 4 (1990), 349–366. Data-flow Processing. https://doi.org/10.1016/0743-7315(90)90035-N
[28–29] Cormac Flanagan, Amr Sabry, Bruce F. Duba, and Matthias Felleisen. 1993. The essence of compiling with continuations. In Proceedings of the ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation (Albuquerque, New Mexico, USA) (PLDI ’93). Association for Computing Machinery, New York, NY, USA, 237–247. https://doi.org/10.1145/155090.155113
[30] Gemini Team, Google. 2023. Gemini: A Family of Highly Capable Multimodal Models. arXiv preprint arXiv:2312.11805 (2023).
[31] GitHub Staff. 2025. Octoverse: A new developer joins GitHub every second as AI leads TypeScript to #1. https://github.blog/news-insights/octoverse/. Accessed: 2026-03-30
[32] Robert H. Halstead. 1985. MULTILISP: a language for concurrent symbolic computation. ACM Trans. Program. Lang. Syst. 7, 4 (Oct. 1985), 501–538. https://doi.org/10.1145/4472.4478
[33] Kang He and Kaushik Roy. 2025. LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models. arXiv preprint arXiv:2504.14089 (2025).
[34] Carlos E. Jimenez, John Yang, S. Friedman, et al. 2024. SWE-bench: Can Language Models Resolve Real-World GitHub Issues? arXiv preprint arXiv:2310.06770 (2024).
[35] Tian Jin, Ellie Y Cheng, Zachary Ankner, Nikunj Saunshi, Blake M Elias, Amir Yazdanbakhsh, Jonathan Ragan-Kelley, Suvinay Subramanian, and Michael Carbin. 2025. Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding. In Forty-second International Conference on Machine Learning. https://openreview.net...
[36] Michael Jungmair, Alexis Engelke, and Jana Giceva. 2024. HiPy: Extracting High-Level Semantics from Python Code for Data Processing. Proc. ACM Program. Lang. 8, OOPSLA2, Article 297 (Oct. 2024), 27 pages. https://doi.org/10.1145/3689737
[37] Konstantinos Kallas, Tammam Mustafa, Jan Bielak, Dimitris Karnikis, Thurston HY Dang, Michael Greenberg, and Nikos Vasilakis. 2022. Practically correct, Just-in-Time shell script parallelization. In 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22). 769–785.
[38] Richard M. Karp and Raymond Miller. 1966. Properties of a Model for Parallel Computation: Determinacy, Termination, Queueing. SIAM J. of Applied Mathematics 14, 6 (11 1966), 1390–1411.
[39] Richard M. Karp and Raymond Miller. 1969. Parallel Program Schemata. J. Comput. System Sci. 3 (1969), 147–195.
[40] Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, and Christopher Potts. 2024. DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines. The Twelfth International Conference on Learning Representations.
[41] Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. 2015. Numba: A llvm-based python jit compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC. 1–6.
[42] LangChain Inc. 2024. LangGraph: Build Resilient Language Agents as Graphs. https://github.com/langchain-ai/langgraph
[43] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. 2020. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in neural information processing systems 33 (2020), 9459–9474.
[44] Shuo Li, Sangdon Park, Insup Lee, and Osbert Bastani. 2024. TRAQ: Trustworthy Retrieval Augmented Question Answering via Conformal Prediction. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Kevin Duh, Helena Gomez, and Steven Betha...
[45] Hao Liang, Xiaochen Ma, Zhou Liu, Zhen Hao Wong, Zhengyang Zhao, Zimo Meng, Runming He, Chengyu Shen, Qifeng Cai, Zhaoyang Han, et al. 2025. DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI. arXiv preprint arXiv:2512.16676 (2025).
[46] Mingdao Liu, Aohan Zeng, Bowen Wang, Peng Zhang, Jie Tang, and Yuxiao Dong. 2024. APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding. arXiv:2401.06761 [cs.CL] https://arxiv.org/abs/2401.06761
[47] Shail Aditya Arvind Jan-Willem Maessen, Lennart Augustsson, and Rishiyur S Nikhil. 1995. Semantics of pH: A parallel dialect of Haskell. In Proceedings from the Haskell Workshop (at FPCA 95). 35–49.
[48] James R McGraw. 1982. The VAL language: Description and analysis. ACM Transactions on Programming Languages and Systems (TOPLAS) 4, 1 (1982), 44–82.
[49] Stephen Mell, Konstantinos Kallas, Steve Zdancewic, and Osbert Bastani. 2025. Opportunistically Parallel Lambda Calculus. Proc. ACM Program. Lang. 9, OOPSLA2, Article 365 (Oct. 2025), 27 pages. https://doi.org/10.1145/3763143
[50] Tammam Mustafa, Konstantinos Kallas, Pratyush Das, and Nikos Vasilakis. 2023. DiSh: Dynamic Shell-Script Distribution. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23). 341–356.
[51] n8n.io. 2026. n8n: Fair-code workflow automation platform with native AI capabilities. https://github.com/n8n-io/n8n
[52–53] Ziyi Ni, Yifan Li, Ning Yang, Dou Shen, Pin Lyu, and Daxiang Dong. 2025. Tree-of-code: A self-growing tree framework for end-to-end code generation and execution in complex tasks. In Findings of the Association for Computational Linguistics: ACL 2025. 9804–9819.
[54] Xuefei Ning, Zinan Lin, Zixuan Zhou, Zifu Wang, Huazhong Yang, and Yu Wang. 2024. Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation. In The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=mqVgBbNCm9
[55] OpenAI. 2023. GPT-4 Technical Report. Technical Report. OpenAI. arXiv preprint arXiv:2303.08774
[56] OpenAI. 2025. OpenAI Agents SDK. https://github.com/openai/openai-agents-python
[57] Shoumik Palkar, James J Thomas, Anil Shanbhag, Deepak Narayanan, Holger Pirk, Malte Schwarzkopf, Saman Amarasinghe, and Matei Zaharia. 2017. Weld: A common runtime for high performance data analytics. (2017).
[58] Shoumik Palkar and Matei Zaharia. 2019. Optimizing Data-Intensive Computations in Existing Libraries with Split Annotations. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (Huntsville, ON, Canada) (SOSP ’19). ACM, 291–305. https://doi.org/10.1145/3341301.3359652
[59] Joe Gibbs Politz, Alejandro Martinez, Mae Milano, Sumner Warren, Daniel Patterson, Junsong Li, Anand Chitipothu, and Shriram Krishnamurthi. 2013. Python: the full monty. SIGPLAN Not. 48, 10 (Oct. 2013), 217–232. https://doi.org/10.1145/2544173.2509536
[60–61] Deepti Raghavan, Sadjad Fouladi, Philip Levis, and Matei Zaharia. 2020. POSH: A Data-Aware Shell. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). 617–631.
[62] Jorge E. Rodriguez. 1969. A Graph Model for Parallel Computations. Ph.D. Dissertation. MIT-LCS-TR64.
[63] Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M. Pawan Kumar, Emilien Dupont, Francisco J. R. Ruiz, Jordan S. Ellenberg, Pengming Wang, Omar Fawzi, Pushmeet Kohli, and Alhussein Fawzi. 2024. Mathematical discoveries from program search with large language models. Nature 625, 7995 (2024), 468–475. https://doi.org/10.10...
[64]

Keshav Santhanam, Deepti Raghavan, Muhammad Shahir Rahman, Thejas Venkatesh, Neha Kunjal, Pratiksha Thaker, Philip Levis, and Matei Zaharia. 2024. ALTO: An Efficient Network Orchestrator for Compound AI Systems. In Proceedings of the 4th Workshop on Machine Learning and Systems (Athens, Greece) (EuroMLSys ’24). Association for Computing Machinery, New Yor...

doi:10.1145/3642970.3655844
[65]

A.V.S. Sastry and Roy D.C. Ju. 1998. A New Algorithm for Scalar Register Promotion Based on SSA Form. PLDI ’98: Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation (1998)

[66]

Ariya Shajii, Gabriel Ramirez, Haris Smajlović, Jessica Ray, Bonnie Berger, Saman Amarasinghe, and Ibrahim Numanagić. 2023. Codon: A Compiler for High-Performance Pythonic Applications and DSLs. In Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction (Montréal, QC, Canada) (CC 2023). Association for Computing Machinery, Ne...

doi:10.1145/3578360.3580275
[67]

Jonathan Silva, Qin Ma, Jordi Cabot, Pierre Kelsen, and Henderik A. Proper. 2024. Application of the Tree-of-Thoughts Framework to LLM-Enabled Domain Modeling. In Conceptual Modeling: 43rd International Conference, ER 2024, Pittsburgh, PA, USA, October 28–31, 2024, Proceedings (Pittsburgh, PA, USA). Springer-Verlag, Berlin, Heidelberg, 94–111. https://do...

doi:10.1007/978-3-031-75872-0_6
[68]

Leonhard Spiegelberg, Rahul Yesantharao, Malte Schwarzkopf, and Tim Kraska. 2021. Tuplex: Data Science in Python at Native Code Speed. In Proceedings of the 2021 International Conference on Management of Data (Virtual Event, China) (SIGMOD ’21). Association for Computing Machinery, New York, NY, USA, 1718–1731. https://doi.org/10.1145/3448016.3457244

[69]

David Suris et al. 2023. ViperGPT: Visual Inference via Python Execution for Reasoning. arXiv preprint arXiv:2303.08128 (2023)

[70]

Trieu Trinh, Yuhuai Wu, Quoc V. Le, He He, and Minh-Thang Luong. 2024. Solving Olympiad Geometry without Human Demonstrations. Nature 625 (2024), 476–482. https://doi.org/10.1038/s41586-023-06747-5

[72]

Nikos Vasilakis, Konstantinos Kallas, Konstantinos Mamouras, Achilles Benetopoulos, and Lazar Cvetković. 2021. PaSh: Light-Touch Data-Parallel Shell Processing. In Proceedings of the Sixteenth European Conference on Computer Systems (Online Event, United Kingdom) (EuroSys ’21). Association for Computing Machinery, New York, NY, USA, 49–66. https://doi.o...

doi:10.1145/3447786.3456228
[73]

Nikos Vasilakis, Ben Karel, Yash Palkhiwala, John Sonchack, André DeHon, and Jonathan M. Smith. 2019. Ignis: Scaling Distribution-Oblivious Systems with Light-Touch Distribution. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation (Phoenix, AZ, USA) (PLDI 2019). ACM, 1010–1026. https://doi.org/10.1145/33142...

doi:10.1145/3314221.3314586
[74]

Philip Wadler. 1990. Comprehending monads. In Proceedings of the 1990 ACM Conference on LISP and Functional Programming (Nice, France) (LFP ’90). Association for Computing Machinery, New York, NY, USA, 61–78. https://doi.org/10.1145/91556.91592

[75]

Paul G. Whiting and Robert S. V. Pascoe. 1994. A history of data-flow languages. IEEE Annals of the History of Computing 16 (1994), 38–59. https://api.semanticscholar.org/CorpusID:7384421

[76]

Brandon T. Willard et al. 2023. Guidance: A Guidance Language for Controlling Large Language Models. https://github.com/guidance-ai/guidance

[77]

Mengdi Wu, Xinhao Cheng, Shengyu Liu, Chunan Shi, Jianan Ji, Kit Ao, Praveen Velliengiri, Xupeng Miao, Oded Padon, and Zhihao Jia. 2025. Mirage: A Multi-Level Superoptimizer for Tensor Programs. In 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25). USENIX Association, 1–18

[78]

Wenjiang Xu, Cindy Wang, Rui Fang, Mingkang Zhang, Lusong Li, Jing Xu, Jiayuan Gu, Zecui Zeng, and Rui Chen. 2025. Embodied Tree of Thoughts: Deliberate Manipulation Planning with Embodied World Model. arXiv:2512.08188 [cs.RO] https://arxiv.org/abs/2512.08188

[79]

John Yang, Carlos E Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, and Ofir Press. 2024. Swe-agent: Agent-computer interfaces enable automated software engineering. Advances in Neural Information Processing Systems 37 (2024), 50528–50652

[80]

Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan. 2023. Tree of thoughts: Deliberate problem solving with large language models. Advances in neural information processing systems 36 (2023), 11809–11822

