Symetra: Visual Analytics for the Parameter Tuning Process of Symbolic Execution Engines

Donghee Hong; Jaemin Jo; Minjong Kim; Sooyoung Cha

arxiv: 2604.05349 · v1 · submitted 2026-04-07 · 💻 cs.HC · cs.SE

Symetra: Visual Analytics for the Parameter Tuning Process of Symbolic Execution Engines

Donghee Hong , Minjong Kim , Sooyoung Cha , Jaemin Jo This is my paper

Pith reviewed 2026-05-10 20:09 UTC · model grok-4.3

classification 💻 cs.HC cs.SE

keywords visual analyticssymbolic executionparameter tuninghuman-in-the-loopbranch coveragetest case generationKLEEconfiguration analysis

0 comments

The pith

Symetra visual analytics lets experts tune symbolic execution parameters to outperform automated methods in branch coverage.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Symetra as a visual analytics system for tuning the many parameters of symbolic execution engines such as KLEE. It supplies two overviews of how parameters shape branch coverage values and patterns, then adds collective analysis so users can compare groups of configurations and spot meaningful differences. Case studies show experts using the system interpret parameter effects, locate complementary settings, and reach higher branch coverage while spending fewer tuning steps than fully automated approaches alone.

Core claim

Symetra supports Human-in-the-Loop parameter tuning by offering two complementary overviews of parameter impacts on branch coverage and patterns, along with collective analysis to contrast groups of configurations, leading experts to identify better configurations than fully automated approaches in terms of branch coverage and tuning efficiency.

What carries the argument

Two complementary overviews of parameter impacts on branch coverage values and patterns, combined with collective analysis for contrasting groups of configurations.

Load-bearing premise

The visual analytics features and collective analysis will reliably let users discover better configurations than automated tuners.

What would settle it

A controlled study in which experts using Symetra achieve no statistically higher branch coverage or fewer tuning iterations than an automated tuner on the same set of programs.

Figures

Figures reproduced from arXiv: 2604.05349 by Donghee Hong, Jaemin Jo, Minjong Kim, Sooyoung Cha.

**Figure 1.** Figure 1: Human-in-the-Loop Parameter Tuning. The gray box represents an experiment where a tuner iteratively optimizes configurations to maximize the branch coverage of the given program. The results of an experiment are trials, pairs of a configuration and its branch coverage, represented as branch coverage vectors (Vi) and branch coverage values (Ci), which are visualized in Symetra. The user then analyzes the re… view at source ↗

**Figure 2.** Figure 2: The interface of Symetra. The header (A) shows statistics and the legend for branch coverage values. Two overviews, the Parameter View (B) and the Coverage View (C), provide overviews of parameters (i.e., the input to the engine) and branch coverage (i.e., the output from the engine). The user is making a trial group in the Trial View (D), and the Trial Group View (E) provides a summary of created trial gr… view at source ↗

**Figure 3.** Figure 3: Branch coverage values are shown through point coloring (A) or density map contours (B) using the global colormap (Figure 2-a1). Points can be color-coded by a selected parameter to analyze coverage-parameter correlations, with color schemes adapting to parameter types: UIS (C) and SF (D). (grep, NB = 8,225, NT = 2,200) tify clusters of trials that cover similar branches ( [PITH_FULL_IMAGE:figures/ful… view at source ↗

**Figure 4.** Figure 4: Using the Parameter View (A), P1 examined the effects of the MF and SA parameters. In the Coverage View (B), he identified two groups, Good1 and Good2, that exhibited high overall coverage and similar coverage vectors. The Comparison View (C) revealed that these groups had complementary values for S. By hovering over S=random, he observed that most trials were located in low-coverage regions, though some o… view at source ↗

**Figure 5.** Figure 5: P2 examined the effects of SF and ST in the Parameter View (A). He chose a trial group of interest, Top 10%, in the Trial Group View (B) and compared it with a group of failed trials, Failed, in the Comparison View (C), to investigate parameter values that possibly caused failures. In the Coverage View (D), he color-coded the trials (points) by SF and hovered over Top 10% to highlight only the points in th… view at source ↗

read the original abstract

Symbolic execution engines such as KLEE automatically generate test cases to maximize branch coverage, but their numerous parameters make it difficult to understand the parameters' impact, leading the user to rely on suboptimal default configurations. While automated tuners have shown promising results, they provide limited insights into why certain configurations work well, motivating the need for Human-in-the-Loop approaches. In this work, we present a visual analytics system, Symetra, designed to support Human-in-the-Loop parameter tuning of symbolic execution engines. To handle a large number of parameters and their configurations, we provide two complementary overviews of their impact on branch coverage values and patterns. Building on these overviews, our system enables collective analysis, allowing the user to contrast groups of configurations and identify differences that may affect branch coverage. We also report on case studies and a Human-in-the-Loop tuning process, demonstrating that experts not only interpreted parameter impacts and identified complementary configurations, but also improved upon fully automated approaches in both branch coverage and tuning efficiency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Symetra introduces targeted visual overviews and collective analysis for tuning symbolic execution parameters, but the case studies do not isolate the tool's contribution from expert knowledge or extra time.

read the letter

The paper's main contribution is a visual analytics system called Symetra that gives two complementary overviews of parameter impacts on branch coverage, then lets users contrast groups of configurations to spot differences. This addresses a real pain point: symbolic execution engines like KLEE have many parameters, defaults are often suboptimal, and pure automated tuners give little insight into why a config works. The system design is straightforward and domain-specific, which is useful for practitioners who need to understand rather than just optimize blindly. The case studies show experts using the tool to interpret impacts, find complementary setups, and beat automated baselines on coverage and efficiency, which is a positive signal that human-in-the-loop can add value here. That part is worth noting because it moves beyond pure automation claims common in the testing literature. The evaluation is the clear weak point. The reported gains rest on case studies without enough detail on controls: no specific automated tuner baseline with its search budget, no independent runs, no participant count, and no comparison where experts get only the automated output versus the full visual interface. Without those, the improvements could easily come from domain knowledge or simply spending more time rather than the overviews and collective analysis features. The abstract and description do not provide quantitative metrics or falsifiable details that would let a reader verify the attribution. This work is aimed at researchers in software testing and visual analytics who care about usability of complex tools. A reader looking for concrete system ideas or examples of human-in-the-loop parameter tuning would find the descriptions and features helpful to build on. It is coherent on its own terms and shows honest engagement with the limitations of automated approaches. I would send it to peer review so the authors can add the missing controls and baselines; the core idea is practical enough to justify referee time even if the current evidence needs tightening.

Referee Report

1 major / 1 minor

Summary. The paper presents Symetra, a visual analytics system for human-in-the-loop parameter tuning of symbolic execution engines such as KLEE. It provides two complementary overviews of parameter impacts on branch coverage values and patterns, plus collective analysis features to contrast groups of configurations. Case studies are reported in which experts interpret parameter effects, identify complementary configurations, and achieve improvements over fully automated tuning in both branch coverage and tuning efficiency.

Significance. If the claimed improvements can be attributed to the visual analytics and collective analysis features rather than expert knowledge or uncontrolled factors, the work would advance human-computer interaction approaches to software testing by making parameter tuning more interpretable and effective than black-box automated methods alone.

major comments (1)

[Case studies] Case studies section: the central claim that experts 'improved upon fully automated approaches in both branch coverage and tuning efficiency' using Symetra is not supported by a controlled comparison. No quantitative baseline is provided for the automated tuner (specific algorithm, search budget, or number of independent runs), no participant count or expertise details are given, and there is no condition isolating Symetra's overviews and collective-analysis features (e.g., experts given only automated outputs versus the full interface). This prevents crediting the observed gains specifically to the proposed Human-in-the-Loop mechanisms.

minor comments (1)

[Abstract and System Overview] The abstract and system description would benefit from explicit statements of the number of parameters handled, the exact visual encodings used in the two overviews, and any quantitative metrics collected during the case studies.

Simulated Author's Rebuttal

1 responses · 1 unresolved

We thank the referee for the constructive feedback and the opportunity to address the concerns about the evaluation in our case studies. We provide point-by-point responses below and indicate where revisions will be made to strengthen the manuscript.

read point-by-point responses

Referee: Case studies section: the central claim that experts 'improved upon fully automated approaches in both branch coverage and tuning efficiency' using Symetra is not supported by a controlled comparison. No quantitative baseline is provided for the automated tuner (specific algorithm, search budget, or number of independent runs), no participant count or expertise details are given, and there is no condition isolating Symetra's overviews and collective-analysis features (e.g., experts given only automated outputs versus the full interface). This prevents crediting the observed gains specifically to the proposed Human-in-the-Loop mechanisms.

Authors: We agree that the case studies do not include a controlled experimental design that isolates the effects of Symetra's visual overviews and collective analysis features from expert knowledge or other factors. The studies were designed as qualitative demonstrations of the human-in-the-loop tuning process in practice, showing how experts interpreted parameter impacts and identified complementary configurations to improve upon an automated baseline. We will revise the manuscript to add the missing details on the automated tuner (specific algorithm, search budget, and number of independent runs) and on the participants (count and expertise). We will also tone down the central claim to state that improvements were observed during expert use of the full Symetra system rather than attributing them solely to the visual analytics components. A full controlled user study with isolated conditions remains outside the scope of the current work and is noted as future research. revision: partial

standing simulated objections not resolved

The absence of a controlled comparison condition that isolates Symetra's interface features from expert knowledge alone, as this would require new experimental data not collected in the reported case studies.

Circularity Check

0 steps flagged

No circularity: system description and case studies contain no derivations, equations, or fitted predictions that reduce to inputs by construction.

full rationale

The paper presents a visual analytics tool (Symetra) for exploring parameter impacts on symbolic execution engines, supported by overviews, collective analysis features, and case-study outcomes. No mathematical derivation chain, parameter fitting, or predictive model is claimed or executed; the central claims rest on qualitative expert interpretations and observed improvements in branch coverage and efficiency during human-in-the-loop sessions. These are empirical reports, not reductions of a result to its own fitted inputs or self-citations. The absence of any load-bearing self-citation chain, ansatz smuggling, or renaming of known results keeps the work self-contained against external benchmarks such as automated tuners.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

This is a system-description paper in HCI/SE; the central claim rests on the assumption that visual interfaces meaningfully aid human interpretation of parameter effects and that case-study participants are representative of target users. No free parameters or invented scientific entities are introduced.

axioms (1)

domain assumption Users can effectively interpret visual overviews of parameter impacts to guide configuration choices
Foundational premise for the human-in-the-loop design described in the abstract.

pith-pipeline@v0.9.0 · 5479 in / 1129 out tokens · 27397 ms · 2026-05-10T20:09:05.806650+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

4 extracted references · 4 canonical work pages

[1]

C., DEMETRESCUC., LENTIS., NICCHIS., SAN- TUCCIG.: Symnav: Visually assisting symbolic execution

[ABB∗19] ANGELINIM., BLASILLIG., BORZACCHIELLOL., COPPA E., D’ELIAD. C., DEMETRESCUC., LENTIS., NICCHIS., SAN- TUCCIG.: Symnav: Visually assisting symbolic execution. In2019 IEEE Symposium on Visualization for Cyber Security (VizSec)(2019), pp. 1–11.doi:10.1109/VizSec48167.2019.9161524. 3 [air24]AIRBNB: visx - a collection of expressive, low-level visuali...

work page doi:10.1109/vizsec48167.2019.9161524 2019
[2]

Akiba, S

7 [ASY∗19] AKIBAT., SANOS., YANASET., OHTAT., KOYAMAM.: Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining(New York, NY , USA, 2019), KDD ’19, Association for Computing Machinery, p. 2623–2631.doi: 10.1145/3292500.3330701. 2 [BCD∗18] BALDONIR....

work page doi:10.1145/3292500.3330701 2019
[3]

2, 6 [Now19] NOWACKM.: Fine-grain memory object representation in symbolic execution

URL: https://github.com/microsoft/nni. 2, 6 [Now19] NOWACKM.: Fine-grain memory object representation in symbolic execution. In2019 34th IEEE/ACM International Confer- ence on Automated Software Engineering (ASE)(2019), pp. 912–923. doi:10.1109/ASE.2019.00089. 2 [PKR19] PANDEYA., KOTCHARLAKOTAP. R. G., ROYS.: Deferred concretization in symbolic execution ...

work page doi:10.1109/ase.2019.00089 2019
[4]

J., VEERAMACHANENIK., QUH.: Atmseer: Increasing trans- parency and controllability in automated machine learning

7 [WMJ∗19] WANGQ., MINGY., JINZ., SHENQ., LIUD., SMITH M. J., VEERAMACHANENIK., QUH.: Atmseer: Increasing trans- parency and controllability in automated machine learning. InProceed- ings of the 2019 CHI Conference on Human Factors in Computing Sys- tems(New York, NY , USA, 2019), CHI ’19, Association for Computing Machinery, p. 1–12.doi:10.1145/3290605.3...

work page doi:10.1145/3290605.3300911 2019

[1] [1]

C., DEMETRESCUC., LENTIS., NICCHIS., SAN- TUCCIG.: Symnav: Visually assisting symbolic execution

[ABB∗19] ANGELINIM., BLASILLIG., BORZACCHIELLOL., COPPA E., D’ELIAD. C., DEMETRESCUC., LENTIS., NICCHIS., SAN- TUCCIG.: Symnav: Visually assisting symbolic execution. In2019 IEEE Symposium on Visualization for Cyber Security (VizSec)(2019), pp. 1–11.doi:10.1109/VizSec48167.2019.9161524. 3 [air24]AIRBNB: visx - a collection of expressive, low-level visuali...

work page doi:10.1109/vizsec48167.2019.9161524 2019

[2] [2]

Akiba, S

7 [ASY∗19] AKIBAT., SANOS., YANASET., OHTAT., KOYAMAM.: Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining(New York, NY , USA, 2019), KDD ’19, Association for Computing Machinery, p. 2623–2631.doi: 10.1145/3292500.3330701. 2 [BCD∗18] BALDONIR....

work page doi:10.1145/3292500.3330701 2019

[3] [3]

2, 6 [Now19] NOWACKM.: Fine-grain memory object representation in symbolic execution

URL: https://github.com/microsoft/nni. 2, 6 [Now19] NOWACKM.: Fine-grain memory object representation in symbolic execution. In2019 34th IEEE/ACM International Confer- ence on Automated Software Engineering (ASE)(2019), pp. 912–923. doi:10.1109/ASE.2019.00089. 2 [PKR19] PANDEYA., KOTCHARLAKOTAP. R. G., ROYS.: Deferred concretization in symbolic execution ...

work page doi:10.1109/ase.2019.00089 2019

[4] [4]

J., VEERAMACHANENIK., QUH.: Atmseer: Increasing trans- parency and controllability in automated machine learning

7 [WMJ∗19] WANGQ., MINGY., JINZ., SHENQ., LIUD., SMITH M. J., VEERAMACHANENIK., QUH.: Atmseer: Increasing trans- parency and controllability in automated machine learning. InProceed- ings of the 2019 CHI Conference on Human Factors in Computing Sys- tems(New York, NY , USA, 2019), CHI ’19, Association for Computing Machinery, p. 1–12.doi:10.1145/3290605.3...

work page doi:10.1145/3290605.3300911 2019