JEDI: Java Evaluation of Declarative and Imperative Queries

Filippo Schiavio; Walter Binder

arxiv: 2605.23543 · v1 · pith:LSLNFYVDnew · submitted 2026-05-22 · 💻 cs.PL · cs.SE

Pith reviewed 2026-05-25 02:30 UTC · model grok-4.3


keywords: Java Stream API, benchmark suite, SQL conversion, parallelization strategies, performance comparison, declarative vs imperative, best practices, code generation


The pith

JEDI automatically converts SQL benchmarks into Java, generating several Stream API and imperative implementations of each query so that the most efficient parallelization strategy can be identified from the characteristics of the data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents JEDI as a benchmark suite created by translating existing SQL benchmarks into Java code that exercises the Stream API. Multiple target implementations are generated for each query, covering declarative stream expressions with varied parallelization approaches alongside imperative baselines. The central aim is to measure and compare their runtimes so that inefficient patterns can be spotted and concrete recommendations offered to developers writing stream-based code. A reader would care because the Stream API is promoted for simplifying parallel work yet lacks focused benchmarks, leaving both library optimizers and everyday programmers without clear guidance on when and how to parallelize.

Core claim

JEDI is built by automatically converting SQL benchmarks into Java benchmarks that support both stream-based and imperative implementations for the same query. Performance measurements across these variants, with emphasis on different parallelization strategies, reveal the most efficient approaches as a function of the characteristics of the processed data. The generated imperative code supplies a baseline that researchers and Java implementers can use when optimizing the Stream API itself.
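To make the pairing of implementations concrete, the sketch below shows what a stream-based and an imperative version of one query might look like. The query (`SELECT SUM(price) FROM orders WHERE quantity > 10`), the `Order` record, and every method name are hypothetical illustrations chosen for this example; they are not output of the JEDI generator.

```java
import java.util.List;

final class QueryVariants {
    // Hypothetical row type; prices are kept in cents to avoid floating-point noise.
    record Order(int quantity, long priceCents) {}

    // Declarative variant: the query expressed as a stream pipeline.
    static long sumWithStream(List<Order> orders) {
        return orders.stream()
                .filter(o -> o.quantity() > 10)
                .mapToLong(Order::priceCents)
                .sum();
    }

    // Imperative variant: the same query written as a plain loop.
    static long sumWithLoop(List<Order> orders) {
        long total = 0;
        for (Order o : orders) {
            if (o.quantity() > 10) {
                total += o.priceCents();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(5, 100), new Order(20, 250), new Order(11, 50));
        System.out.println(sumWithStream(orders)); // prints 300
        System.out.println(sumWithLoop(orders));   // prints 300
    }
}
```

Because both methods compute the same result, any runtime difference between them can be attributed to the programming style rather than to the query.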

What carries the argument

The code generator that produces, from each SQL benchmark, a family of Java implementations including stream-based variants with different parallelization choices and corresponding imperative versions.
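As an illustration of what "different parallelization choices" can mean for a single query, the sketch below implements a group-by count three ways using standard `java.util.stream.Collectors` methods. The class and method names are hypothetical, and this summary does not say which strategies JEDI actually generates, so the selection here is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class GroupByVariants {
    // Sequential baseline: one thread, one result map.
    static Map<String, Long> sequential(List<String> keys) {
        return keys.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Parallel with merging: each worker fills its own map, and the maps are merged.
    static Map<String, Long> parallelMerging(List<String> keys) {
        return keys.parallelStream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Parallel with a shared map: all workers insert into one concurrent map.
    static Map<String, Long> parallelConcurrent(List<String> keys) {
        return keys.parallelStream()
                .collect(Collectors.groupingByConcurrent(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "b", "a", "c", "a");
        System.out.println(sequential(keys).equals(parallelMerging(keys)));    // prints true
        System.out.println(sequential(keys).equals(parallelConcurrent(keys))); // prints true
    }
}
```

All three variants return equal maps, so a benchmark can compare them on runtime alone. Which one is fastest plausibly depends on properties such as input size and the number of distinct keys, which is the kind of data characteristic the core claim refers to.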

If this is right

Developers obtain concrete rules for choosing parallelization tactics according to data size, distribution, and operation type.
Library maintainers receive an imperative baseline against which Stream API improvements can be measured.
Inefficient stream coding patterns become visible through systematic comparison rather than anecdotal observation.
The same conversion process can be reused to keep the benchmark suite current as new SQL workloads appear.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The generator could be pointed at other declarative query languages to produce comparable Java Stream benchmarks.
Performance differences observed on synthetic SQL-derived data might shift when the same queries run over real application data sets.
Results could inform static analysis tools that warn developers about likely inefficient stream usage before execution.

Load-bearing premise

That code produced by automatically translating SQL benchmarks yields Java performance behavior that matches how developers actually use the Stream API in practice.

What would settle it

A side-by-side run of the generated benchmarks against a collection of hand-written Stream API code taken from real open-source Java projects that shows substantially different bottleneck locations or ranking of parallel strategies.

Figures

Figures reproduced from arXiv: 2605.23543 by Filippo Schiavio, Walter Binder.

Figure 2: Code generated for the example query (Figure 1).

Figure 3: Query with 7 conjuncted predicates.

Figure 4: Execution time [ms] (y-axis) of the Distinct query varying the number of distinct elements (x-axis).

Figure 7: Execution time [ms] (y-axis) of the queries in Figure 5 (OneField) and Figure 6 (ManyFields) on JDK for different ...
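The Distinct query of Figure 4 varies the number of distinct elements in the input. A minimal sketch of how such a workload could be set up with the standard Stream API follows; the data generator, the class name, and the three variants are assumptions made for illustration and are not taken from the paper's generated code.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class DistinctVariants {
    // Builds n values that cycle through `distinct` different keys (hypothetical generator).
    static List<Integer> data(int n, int distinct) {
        return IntStream.range(0, n)
                .map(i -> i % distinct)
                .boxed()
                .collect(Collectors.toList());
    }

    // Sequential deduplication.
    static long sequential(List<Integer> xs) {
        return xs.stream().distinct().count();
    }

    // Parallel deduplication that still respects encounter order.
    static long parallelOrdered(List<Integer> xs) {
        return xs.parallelStream().distinct().count();
    }

    // Parallel deduplication with the ordering constraint dropped.
    static long parallelUnordered(List<Integer> xs) {
        return xs.parallelStream().unordered().distinct().count();
    }

    public static void main(String[] args) {
        List<Integer> xs = data(1_000, 7);
        System.out.println(sequential(xs));        // prints 7
        System.out.println(parallelOrdered(xs));   // prints 7
        System.out.println(parallelUnordered(xs)); // prints 7
    }
}
```

The `Stream.distinct()` documentation notes that removing the ordering constraint with `unordered()` may make parallel execution significantly more efficient, which is why the third variant is worth measuring separately.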

Original abstract

The Java Stream API aims at increasing developer productivity thanks to an easy-to-read declarative syntax to express computations. It also simplifies parallel computing, providing a high-level abstraction on top of common parallelization aspects. Unfortunately, there is a lack of benchmarks specifically targeting stream-based applications. Such a lack of benchmarks makes it difficult for researchers and developers of the Java class library to optimize the Stream API. Moreover, in the absence of dedicated benchmarks, it is difficult to analyze the performance of streams to suggest developers how to write efficient code using the API. In this work we present JEDI, a benchmark suite that targets the Stream API. JEDI is automatically generated by converting SQL benchmarks into Java benchmarks. Our code generator supports targets different implementations (both stream-based and imperative) for the same query. The ultimate goal of our benchmark suite -- and the main contribution of this work -- is to analyze the performance of the different implementations to spot inefficient code structures and better alternatives, suggesting best practices to Java developers. Among the multiple implementations we generate, we focus on different parallelization strategies and explain the most efficient parallelization strategies based on characteristics of the processed data. Finally, the code generation producing imperative code defines of a baseline that can guide researchers and Java implementers to optimize the Stream API.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

JEDI generates SQL-derived Java stream benchmarks with multiple parallel variants, but offers no validation that the code is representative or that results will yield reliable best practices.


The paper introduces JEDI, a benchmark suite for the Java Stream API built by converting existing SQL benchmarks into Java code. The generator produces both declarative stream versions and imperative equivalents, with variants that differ in parallelization strategy. The stated goal is to use these to identify inefficient patterns and recommend better approaches to developers and library implementers. This generation method from SQL is the concrete new piece, and it directly targets the acknowledged shortage of stream-specific benchmarks. The motivation section is clear about why such benchmarks matter for both research and practice. The idea of shipping multiple implementations per query for controlled comparison is reasonable on its face.

The soft spot is the missing link between generation and usable results. The abstract and description give no evidence that the translated queries avoid artifacts such as forced intermediate collections or missed short-circuiting that would distort measured speedups or scaling behavior. There is also no mention of checking the generated streams against hand-written idiomatic code or of running any actual performance measurements. Without those steps, the claim that the suite will reliably spot inefficient structures rests on an untested assumption about representativeness. The imperative baselines face the same issue: if they inherit translation quirks, they may not serve as a clean reference point.

This work is aimed at researchers building or tuning stream libraries and at developers who want data-driven guidance on parallel streams. Someone already working on benchmark generation or Java performance tooling could extract the conversion approach and try it themselves. The paper is coherent on its own terms and engages the right prior work on benchmarks, so it clears the bar for serious refereeing even though the current version is mostly a description of the generator rather than a completed evaluation.
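The "forced intermediate collections" artifact mentioned in the note can be pictured with a small sketch. It contrasts an operator-at-a-time translation, where every query operator materializes its output, with a fused pipeline. Both methods and the class name are hypothetical; whether JEDI's generator produces either shape is not stated in this summary.

```java
import java.util.List;
import java.util.stream.Collectors;

final class MaterializationDemo {
    // Operator-at-a-time translation: each step builds a full list before the next runs.
    static long materialized(List<Integer> data) {
        List<Integer> filtered = data.stream()
                .filter(x -> x % 2 == 0)
                .collect(Collectors.toList());
        List<Integer> mapped = filtered.stream()
                .map(x -> x * 3)
                .collect(Collectors.toList());
        return mapped.stream().mapToLong(Integer::longValue).sum();
    }

    // Fused translation: one pipeline, no intermediate lists.
    static long fused(List<Integer> data) {
        return data.stream()
                .filter(x -> x % 2 == 0)
                .mapToLong(x -> x * 3L)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4);
        System.out.println(materialized(data)); // prints 18
        System.out.println(fused(data));        // prints 18
    }
}
```

The results are identical, but the first version allocates two extra lists. A benchmark built only from the first shape would measure allocation cost that idiomatic stream code would not incur.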

Referee Report

2 major / 1 minor

Summary. The paper presents JEDI, a benchmark suite for the Java Stream API that is automatically generated by converting existing SQL benchmarks into Java. The generator produces multiple implementations of each query (stream-based declarative versions with different parallelization strategies, plus imperative baselines). The central claim is that this suite will enable performance analysis to identify inefficient code structures, recommend best practices to developers, and serve as a baseline for Stream API optimization.

Significance. A validated, representative benchmark suite targeting Stream API usage and parallelization could address a documented gap in Java performance evaluation and support both library improvements and developer guidance. The approach of deriving benchmarks from external SQL sources avoids ad-hoc invention, but the significance cannot be assessed until the generated code is shown to be free of translation artifacts and the promised performance analysis is executed.

major comments (2)

[Abstract] Abstract: The central claim that the generated suite 'will spot inefficient code structures and better alternatives' and 'explain the most efficient parallelization strategies' is not supported by any empirical data, validation of the generated code, or comparison to hand-written idiomatic Stream usage; the manuscript describes the generator but supplies no performance measurements, error bars, or fidelity checks.
[Abstract] Abstract (final paragraph): The assumption that automatically converted SQL queries produce Java Stream code whose runtime behavior and bottlenecks accurately mirror real-world usage (and that the generated imperative versions form a valid baseline) is load-bearing for all downstream claims, yet no conversion rules, validation against hand-written code, or checks for artifacts (e.g., unnatural intermediate collections or missed short-circuiting) are provided.
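The "missed short-circuiting" artifact named in the second comment can be demonstrated by counting predicate evaluations. In the sketch below, an existence check is translated once as filter-then-count and once with the short-circuiting `anyMatch`. The class, the `counting` wrapper, and the query are hypothetical and serve only to illustrate the concern.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

final class ShortCircuitDemo {
    // Wraps a predicate so that every evaluation increments the counter.
    static <T> Predicate<T> counting(Predicate<T> p, AtomicInteger counter) {
        return x -> {
            counter.incrementAndGet();
            return p.test(x);
        };
    }

    // Naive translation: tests every element, then checks the count.
    static boolean existsByCount(List<Integer> data, Predicate<Integer> p) {
        return data.stream().filter(p).count() > 0;
    }

    // Short-circuiting translation: stops at the first match.
    static boolean existsByAnyMatch(List<Integer> data, Predicate<Integer> p) {
        return data.stream().anyMatch(p);
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(5, 1, 2, 3, 4);
        Predicate<Integer> big = x -> x > 4;

        AtomicInteger naive = new AtomicInteger();
        AtomicInteger lazy = new AtomicInteger();
        existsByCount(data, counting(big, naive));
        existsByAnyMatch(data, counting(big, lazy));

        System.out.println(naive.get()); // prints 5
        System.out.println(lazy.get());  // prints 1
    }
}
```

On this sequential input the first element already matches, so `anyMatch` evaluates the predicate once while the count-based version evaluates it for all five elements. A generator that always emitted the first form would overstate the cost of such queries.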

minor comments (1)

[Abstract] Abstract, last sentence: 'defines of a baseline' is a grammatical error and should read 'defines a baseline'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract and the importance of empirical support. The manuscript primarily describes the JEDI generator and benchmark structure; we address the two major comments below and will revise the paper accordingly.


Referee: [Abstract] Abstract: The central claim that the generated suite 'will spot inefficient code structures and better alternatives' and 'explain the most efficient parallelization strategies' is not supported by any empirical data, validation of the generated code, or comparison to hand-written idiomatic Stream usage; the manuscript describes the generator but supplies no performance measurements, error bars, or fidelity checks.

Authors: We agree that the current manuscript supplies no performance measurements, error bars, or fidelity checks, and that the abstract's forward-looking claims about spotting inefficiencies and explaining strategies are not yet backed by data in this submission. The paper's contribution is the automated generation of multiple implementations (declarative streams with varying parallelization plus imperative baselines) from SQL sources. In revision we will rewrite the abstract to limit claims to what is demonstrated (the generator and suite construction) and add a short experimental section with initial runtime measurements on a representative subset of queries, including basic statistical reporting.

revision: yes

Referee: [Abstract] Abstract (final paragraph): The assumption that automatically converted SQL queries produce Java Stream code whose runtime behavior and bottlenecks accurately mirror real-world usage (and that the generated imperative versions form a valid baseline) is load-bearing for all downstream claims, yet no conversion rules, validation against hand-written code, or checks for artifacts (e.g., unnatural intermediate collections or missed short-circuiting) are provided.

Authors: The conversion logic resides in the open-source generator released with the paper, but the manuscript itself does not enumerate the rules or report explicit validation steps against hand-written code or artifact checks. This is a substantive gap for establishing representativeness. We will add a subsection detailing the principal SQL-to-Stream and SQL-to-imperative translation rules and describe any manual or automated checks already performed to avoid common pitfalls such as forced materialization or loss of short-circuiting semantics.

revision: yes
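The "basic statistical reporting" promised in the first response could be as simple as a mean and a sample standard deviation over repeated runs. The sketch below is one hedged way to do that with plain `System.nanoTime`; the class, its methods, and the warmup and run counts are assumptions for illustration, and a dedicated harness such as JMH would be the more rigorous choice for real measurements.

```java
import java.util.function.LongSupplier;
import java.util.stream.LongStream;

final class RunStats {
    // Arithmetic mean of the samples.
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Sample standard deviation (n - 1 in the denominator); 0.0 for fewer than two samples.
    static double sampleStdDev(double[] xs) {
        if (xs.length < 2) return 0.0;
        double m = mean(xs);
        double sq = 0;
        for (double x : xs) sq += (x - m) * (x - m);
        return Math.sqrt(sq / (xs.length - 1));
    }

    // Runs the task `warmup` times unmeasured, then returns `runs` timings in milliseconds.
    static double[] measureMillis(LongSupplier task, int warmup, int runs) {
        long sink = 0; // accumulates results so the work cannot be discarded as dead code
        for (int i = 0; i < warmup; i++) sink += task.getAsLong();
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            millis[i] = (System.nanoTime() - start) / 1e6;
        }
        if (sink == Long.MIN_VALUE) System.out.println(sink);
        return millis;
    }

    public static void main(String[] args) {
        double[] ms = measureMillis(
                () -> LongStream.range(0, 1_000_000).filter(x -> x % 3 == 0).count(),
                5, 10);
        System.out.printf("mean=%.3f ms, sd=%.3f ms%n", mean(ms), sampleStdDev(ms));
    }
}
```

Reporting the spread alongside the mean is what lets a reader judge whether two variants actually differ or merely fall within run-to-run noise.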

Circularity Check

0 steps flagged

No circularity: benchmark generation and comparison are independent of fitted results or self-referential definitions


The paper presents an engineering contribution: automatic conversion of external SQL benchmarks into multiple Java implementations (Stream API and imperative) to enable performance measurement and identification of best practices. No derivation chain, prediction, or uniqueness claim reduces by construction to its own inputs. The baseline is defined by the generation process itself but is not presented as a 'prediction' or result derived from fitted data. No self-citations are load-bearing for the central claims. The approach is self-contained against external SQL sources and direct runtime measurements.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are described in the abstract; the work relies on standard SQL benchmarks and the existing Java Stream API.




[1] [1]

Matteo Basso, Eduardo Rosales, Filippo Schiavio, Andrea Rosà, and Walter Binder

work page

[2] [2]

InEuro-Par 2022: Parallel Processing - 28th International Conference on Parallel and Distributed Computing

Accurate Fork-Join Profiling on the Java Virtual Machine. InEuro-Par 2022: Parallel Processing - 28th International Conference on Parallel and Distributed Computing. Springer, 35–50. doi:10.1007/978-3-031-12597-3_3

work page doi:10.1007/978-3-031-12597-3_3 2022

[3] [3]

Matteo Basso, Filippo Schiavio, Andrea Rosà, and Walter Binder. 2022. Optimizing Parallel Java Streams. In26th International Conference on Engineering of Complex Computer Systems, ICECCS 2022, Hiroshima, Japan, March 26-30, 2022. IEEE, 23–32. doi:10.1109/ICECCS54210.2022.00012

work page doi:10.1109/iceccs54210.2022.00012 2022

[4] [4]

Stephen M Blackburn, Zixian Cai, Rui Chen, Xi Yang, John Zhang, and John Zigman. 2025. Rethinking Java Performance Analysis. InProceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, ASPLOS 2025, Rotterdam, Netherlands, 30 March 2025 - 3 April 2025. ACM. doi:10.1145/3669940.3707217

work page doi:10.1145/3669940.3707217 2025

[5] [5]

S. M. Blackburn, R. Garner, C. Hoffman, A. M. Khan, K. S. McKinley, R. Bentzur, A. Diwan, D. Feinberg, D. Frampton, S. Z. Guyer, M. Hirzel, A. Hosking, M. Jump, H. Lee, J. E. B. Moss, A. Phansalkar, D. Stefanović, T. VanDrunen, D. von Dinck- lage, and B. Wiedermann. 2006. The DaCapo Benchmarks: Java Benchmarking Development and Analysis. InOOPSLA ’06: Pro...

work page doi:10.1145/1167473.1167488 2006

[6] [6]

Peter Boncz, Thomas Neumann, and Orri Erling. 2013. TPC-H analyzed: Hidden messages and lessons learned from an influential benchmark. InTechnology Conference on Performance Evaluation and Benchmarking. Springer, 61–76

work page 2013

[7] [7]

Ann Campbell

G. Ann Campbell. 2018. Cognitive complexity: an overview and evaluation. In Proceedings of the 2018 International Conference on Technical Debt(Gothenburg, Sweden)(TechDebt ’18). Association for Computing Machinery, New York, NY, USA, 57–58. doi:10.1145/3194164.3194186

work page doi:10.1145/3194164.3194186 2018

[8] [8]

Diego Costa, Cor-Paul Bezemer, Philipp Leitner, and Artur Andrzejak. 2019. What’s wrong with my benchmark results? Studying bad practices in JMH bench- marks.IEEE Transactions on Software Engineering47, 7 (2019), 1452–1467

work page 2019

[9] [9]

Markus Dreseler, Martin Boissier, Tilmann Rabl, and Matthias Uflacker. 2020. Quantifying TPC-H choke points and their optimizations.Proceedings of the VLDB Endowment13, 8 (2020), 1206–1220

work page 2020

[10] [10]

Filippo Schiavio. 2025. JEDI - Java Evaluation of Declarative vs Imperative queries. http://github.com/usi-dag/JEDI

work page 2025

[11] [11]

Filippo Schiavio. 2025. S2S - SQL To Stream. http://github.com/usi-dag/S2S

work page 2025

[12] [12]

2006.Java concurrency in practice

Brian Goetz. 2006.Java concurrency in practice. Pearson Education

work page 2006

[13] [13]

JetBrains. 2022. IntelliJ IDEA – the IDE for Pro Java and Kotlin Development. https://www.jetbrains.com/idea/

work page 2022

[14] [14]

Loveleen Kaur and Ashutosh Mishra. 2019. Cognitive complexity as a quantifier of version to version Java-based source code change: An empirical probe.Informa- tion and Software Technology106 (2019), 31–48. doi:10.1016/j.infsof.2018.09.002

work page doi:10.1016/j.infsof.2018.09.002 2019

[15] [15]

Timo Kersten, Viktor Leis, Alfons Kemper, Thomas Neumann, Andrew Pavlo, and Peter Boncz. 2018. Everything you always wanted to know about compiled and vectorized queries but were afraid to ask.Proceedings of the VLDB Endowment 11, 13 (2018), 2209–2222

work page 2018

[16] [16]

Raffi Khatchadourian, Yiming Tang, and Mehdi Bagherzadeh. 2020. Safe auto- mated refactoring for intelligent parallelization of Java 8 streams.Sci. Comput. Program.195 (2020), 102476. doi:10.1016/J.SCICO.2020.102476

work page doi:10.1016/j.scico.2020.102476 2020

[17] [17]

Raffi Khatchadourian, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed. 2018. [Engineering Paper] A Tool for Optimizing Java 8 Stream Software via Auto- mated Refactoring. In18th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2018, Madrid, Spain, September 23-24, 2018. IEEE Computer Society, 34–39. doi:10.1109/SCAM.2018.00011

work page doi:10.1109/scam.2018.00011 2018

[18] [18]

Raffi Khatchadourian, Yiming Tang, Mehdi Bagherzadeh, and Syed Ahmed. 2019. Safe automated refactoring for intelligent parallelization of Java 8 streams. In Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019, Joanne M. Atlee, Tevfik Bultan, and Jon Whittle (Eds.). IEEE / ACM, 619–630....

doi:10.1109/icse.2019.00072

[19]

Raffi Khatchadourian, Yiming Tang, Mehdi Bagherzadeh, and Baishakhi Ray. 2020. An Empirical Study on the Use and Misuse of Java 8 Streams. In Fundamental Approaches to Software Engineering - 23rd International Conference, FASE 2020 (Lecture Notes in Computer Science, Vol. 12076), Heike Wehrheim and Jordi Cabot (Eds.). Springer, 97–118. doi:10.1007/978-3-03...

doi:10.1007/978-3-030-45234-6_5

[20]

McCabe. 2022. McCabe IQ - Software Metrics Glossary. http://www.mccabe.com/iq_research_metrics.htm

[21]

Thomas J McCabe. 1976. A complexity measure. IEEE Transactions on Software Engineering 4 (1976), 308–320

[22]

Nils Mehlhorn and Stefan Hanenberg. 2022. Imperative versus Declarative Collection Processing: An RCT on the Understandability of Traditional Loops versus the Stream API in Java. In 44th IEEE/ACM 44th International Conference on Software Engineering, ICSE 2022, Pittsburgh, PA, USA, May 25-27, 2022. ACM, 1157–1168. doi:10.1145/3510003.3519016

[23]

Michael Duigou. 2022. Java Microbenchmarking Harness. http://openjdk.java.net/projects/code-tools/jmh/

[24]

Anders Møller and Oskar Haarklou Veileborg. 2020. Eliminating Abstraction Overhead of Java Stream Pipelines Using Ahead-of-Time Program Optimization. Proc. ACM Program. Lang. 4, OOPSLA (2020), 1–29

[25]

Thomas Neumann. 2011. Efficiently Compiling Efficient Query Plans for Modern Hardware. Proc. VLDB Endow. 4, 9 (2011), 539–550

[26]

Thomas Neumann and Michael J. Freitag. 2020. Umbra: A Disk-Based System with In-Memory Performance. In CIDR

[27]

Joshua Nostas, Juan Pablo Sandoval Alcocer, Diego Elias Costa, and Alexandre Bergel. 2021. How Do Developers Use the Java Stream API?. In Computational Science and Its Applications – ICCSA 2021, Osvaldo Gervasi, Beniamino Murgante, Sanjay Misra, Chiara Garau, Ivan Blečić, David Taniar, Bernady O. Apduhan, Ana Maria A. C. Rocha, Eufemia Tarantino, and Carme...

[28]

Oracle. 2022. Ergonomics. https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/ergonomics.html

[29]

Oracle. 2022. GraalVM. https://www.graalvm.org/

[30]

Oracle. 2022. Java Software | Oracle. https://www.oracle.com/java/

[31]

Oracle. 2022. Processing Data with Java SE 8 Streams, Part 1. https://www.oracle.com/technical-resources/articles/java/ma14-java-se-8-streams.html

[32]

Oracle. 2022. Stream (JDK 24) - distinct. https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/util/stream/Stream.html#distinct()

[33]

Oracle. 2024. BiConsumer (Java SE 23; JDK 23). https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/function/BiConsumer.html

[34]

Oracle. 2024. Consumer (Java SE 23; JDK 23). https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/function/Consumer.html

[35]

Oracle. 2024. java.util.stream (Java SE 23; JDK 23). https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/stream/package-summary.html#Ordering

[36]

Aleksandar Prokopec, Andrea Rosà, David Leopoldseder, Gilles Duboscq, Petr Tuma, Martin Studener, Lubomír Bulej, Yudi Zheng, Alex Villazón, Doug Simon, Thomas Würthinger, and Walter Binder. 2020. Renaissance: Benchmarking Suite for Parallel Applications on the JVM. In Software Engineering 2020, Fachtagung des GI-Fachbereichs Softwaretechnik (LNI, Vol. P-30...

doi:10.18420/se2020_44

[37]

Mark Raasveldt and Hannes Mühleisen. 2019. DuckDB: an embeddable analytical database. In Proceedings of the 2019 international conference on management of data. 1981–1984

[38]

Eduardo Rosales, Matteo Basso, Andrea Rosà, and Walter Binder. 2023. Large-scale characterization of Java streams. Softw. Pract. Exp. 53, 9 (2023), 1763–1792. doi:10.1002/SPE.3213

[39]

Eduardo Rosales, Matteo Basso, Andrea Rosà, and Walter Binder. 2023. Profiling and Optimizing Java Streams. Art Sci. Eng. Program. 7, 3 (2023). doi:10.22152/PROGRAMMING-JOURNAL.ORG/2023/7/10

[40]

Filippo Schiavio, Daniele Bonetta, and Walter Binder. 2021. Language-Agnostic Integrated Queries in a Managed Polyglot Runtime. Proc. VLDB Endow. 14, 8 (2021), 1414–1426. doi:10.14778/3457390.3457405

[41]

Filippo Schiavio, Daniele Bonetta, and Walter Binder. 2023. DynQ: a dynamic query engine with query-reuse capabilities embedded in a polyglot runtime. VLDB J. 32, 5 (2023), 1111–1135. doi:10.1007/S00778-023-00784-2

[42]

Filippo Schiavio, Andrea Rosà, and Walter Binder. 2022. SQL to Stream with S2S: An Automatic Benchmark Generator for the Java Stream API. In Proceedings of the 21st ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, GPCE 2022, Auckland, New Zealand, December 6-7, 2022, Bernhard Scholz and Yukiyoshi Kameyama (Eds.). AC...

[43]

Andreas Sewe, Mira Mezini, Aibek Sarimbekov, and Walter Binder. 2011. Da capo con scala: Design and analysis of a scala benchmark suite for the java virtual machine. In Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications. 657–676

[44]

SPEC. 2008. SpecJVM2008. https://www.spec.org/jvm2008/

[45]

SPEC. 1998. SpecJVM98. https://www.spec.org/jvm98/

[46]

Ruby Y Tahboub, Grégory M Essertel, and Tiark Rompf. 2018. How to architect a query compiler, revisited. In Proceedings of the 2018 International Conference on Management of Data. 307–322

[47]

Petar Tahchiev, Felipe Leme, Vincent Massol, and Gary Gregory. 2011. JUnit in Action, 2nd Edition. Manning Publications Company. doi:10.21019/9781582121994.ch9

[48]

Kian-Lee Tan, Qingchao Cai, Beng Chin Ooi, Weng-Fai Wong, Chang Yao, and Hao Zhang. 2015. In-memory databases: Challenges and opportunities from software and hardware perspectives. ACM Sigmod Record 44, 2 (2015), 35–40

[49]

Hiroto Tanaka, Shinsuke Matsumoto, and Shinji Kusumoto. 2019. A study on the current status of functional idioms in Java. IEICE Transactions on Information and Systems 102, 12 (2019), 2414–2422

[50]

Yiming Tang, Raffi Khatchadourian, Mehdi Bagherzadeh, and Syed Ahmed. 2018. Towards safe refactoring for intelligent parallelization of Java 8 streams. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings, ICSE 2018, Gothenburg, Sweden, May 27 - June 03, 2018, Michel Chaudron, Ivica Crnkovic, Marsha Chechi...

doi:10.1145/3183440.3195098

[51]

TPC. 2024. TPC-H - Homepage. http://www.tpc.org/tpch/

[52]

Raoul-Gabriel Urma, Mario Fusco, and Alan Mycroft. 2014. Java 8 in Action: Lambdas, Streams, and functional-style programming. Manning Publications Co

[53]

Thomas Würthinger, Christian Wimmer, Andreas Wöß, Lukas Stadler, Gilles Duboscq, Christian Humer, Gregor Richards, Doug Simon, and Mario Wolczko. 2013. One VM to rule them all. In Proceedings of the 2013 ACM international symposium on New ideas, new paradigms, and reflections on programming & software. 187–204

[55]

Hao Zhang, Gang Chen, Beng Chin Ooi, Kian-Lee Tan, and Meihui Zhang. 2015. In-memory big data management and processing: A survey. IEEE Transactions on Knowledge and Data Engineering 27, 7 (2015), 1920–1948