pith. sign in

arxiv: 1907.05637 · v1 · pith:7A7Z5DFKnew · submitted 2019-07-12 · 💻 cs.PL

Concolic Testing Heap-Manipulating Programs

Pith reviewed 2026-05-24 22:17 UTC · model grok-4.3

classification 💻 cs.PL
keywords concolic testingseparation logicheap-manipulating programstest input generationsymbolic executionprogram testing
0
0 comments X

The pith

CSF is the first concolic testing engine for heap-manipulating programs that uses separation logic to generate valid test inputs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a technique to automatically create test inputs for programs that build and modify complex structures such as linked lists and trees stored in the heap. Traditional concolic testing works well for numeric code but struggles when inputs must satisfy precise shape and pointer constraints. CSF addresses this by encoding those constraints in separation logic and combining it with both specification-driven generation and symbolic execution. Evaluation on difficult examples shows the engine produces valid inputs and reaches high coverage while remaining efficient. The work also indicates that CSF can pair with tools that infer preconditions, lowering the need for manual specifications.

Core claim

We propose the first concolic testing engine called CSF for heap-manipulating programs based on separation logic. CSF effectively combines specification-based testing and concolic execution for test input generation. It is evaluated on a set of challenging heap-manipulating programs. The results show that CSF generates valid test inputs with high coverage efficiently. Furthermore, we show that CSF can be potentially used in combination with precondition inference tools to reduce the user effort.

What carries the argument

The CSF engine, which encodes path conditions over heap objects using separation logic to drive concolic execution and input generation.

If this is right

  • Valid test inputs are produced for programs whose correctness depends on heap shapes.
  • High statement and branch coverage is reached without excessive runtime cost.
  • User effort drops when CSF is paired with tools that automatically infer input preconditions.
  • The same engine can be applied to other pointer-based data structures beyond the evaluated examples.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may extend naturally to languages whose memory model differs from the one assumed in the evaluation.
  • Combining CSF with static analyses could further prune infeasible paths before concolic search begins.

Load-bearing premise

Separation logic can encode and solve path conditions over heap objects with both precision and practical computational cost.

What would settle it

An evaluation in which CSF is run on the same set of heap-manipulating programs yet produces mostly invalid inputs or fails to reach the reported coverage levels.

Figures

Figures reproduced from arXiv: 1907.05637 by Jun Sun, Long H. Pham, Quang Loc Le, Quoc-Sang Phan.

Figure 1
Figure 1. Figure 1: Sample program satisfy strict requirements. Firstly, all BinaryNode objects must be structured in a binary tree shape. Secondly, for any BinaryNode object in the tree, its element value must be greater than all the element values of its left sub-tree and less than those of the right sub-tree. One way to define valid binary search tree objects is through programming a repOK method [9,45], as shown in App. 1… view at source ↗
Figure 2
Figure 2. Figure 2: Specification language, where k is a 32-bit integer constant, v¯ is a sequence of variables tion, the exploration is pruned when the heap configuration violates the specification. However, HEX has limited expressiveness, e.g., HEX cannot capture the property that the nodes in the binary search tree are sorted due to the lack of arithmetic constraints. In comparison, our approach works as follows. We use se… view at source ↗
Figure 3
Figure 3. Figure 3: Unfoldings that the definition of Pi(t¯i) is a disjunction of multiple base cases and inductive cases. During unfolding, κ is split into a set of formulae, one for each disjunct in the definition of every inductive predicate Pi(t¯i) in κ. The process ends when n reaches 0. Procedure unfold is formalized as follows. Given an inductively predicate defini￾tion pred Pi (¯vi) ≡ Φi and a formula constituted with… view at source ↗
Figure 4
Figure 4. Figure 4: Two test inputs datat ::= data c { (type v; )∗ } type ::= bool | int | c prog ::= stmt∗ stmt ::= v := e | v.fi := e | goto e | assert e | if e0 then goto e1 else goto e2 | v := new c(v1, ..., vn) | free v e ::= k | v | v.fi | e1 opb e2 | opu e | null [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: A core intermediate language The correctness of the algorithm, i.e., each generated test input is a valid one, is straightforward as each symbolic model obtained from the unfolding satisfies the orig￾inal precondition, since each one is an under-approximation of a ∆ in Γ. 4 Concolic Execution Specification-based testing allows us to generate test inputs which cover some parts of the program. Some program p… view at source ↗
Figure 6
Figure 6. Figure 6: Execution rules: Σ[x ← k] updates the mapping Σ by setting x to be k; fresh is used as an overloading function to return a new variable/address; s ` e ⇓ k denotes the evaluation of expression e to a concrete value k in the current context s Intuitively, if the conditions above the line is satisfied, a node matching the current state generates multiple children nodes. In the following, we explain some of th… view at source ↗
Figure 7
Figure 7. Figure 7: Constraint trees construction: a question mark represents an unexplored path and OK denotes the execution terminates without error keeps the constraints with the field-access form (i.e., v.fi) and field-assign form (i.e., v.fi := e) and will eliminate them before sending these formulae to the solver. In the rule [C−TCOND], two new nodes denoting the then branch and the else branch of the condition are adde… view at source ↗
Figure 8
Figure 8. Figure 8: An example in the second experiment specification-based testing approach does not suffer this limitation because it generates test inputs independently of programs under test. In this experiment, we aim to show the usefulness of specification-based testing in CSF, especially for programs with complex numerical conditions. To do that, we systematically compose a set of programs which travel a singly-linked … view at source ↗
Figure 9
Figure 9. Figure 9: A test input which leads to RuntimeException shows the number of exceptions in the category. Lastly, two columns #Calls and T ime(s) show the number of solver calls and the time needed to generate the test inputs respectively. In summary, CSF generates 292 test inputs in 344 seconds which achieved 58.36% branch coverage in average. Our investigation shows that all of these test inputs are valid according t… view at source ↗
read the original abstract

Concolic testing is a test generation technique which works effectively by integrating random testing generation and symbolic execution. Existing concolic testing engines focus on numeric programs. Heap-manipulating programs make extensive use of complex heap objects like trees and lists. Testing such programs is challenging due to multiple reasons. Firstly, test inputs for such program are required to satisfy non-trivial constraints which must be specified precisely. Secondly, precisely encoding and solving path conditions in such programs are challenging and often expensive. In this work, we propose the first concolic testing engine called CSF for heap-manipulating programs based on separation logic. CSF effectively combines specification-based testing and concolic execution for test input generation. It is evaluated on a set of challenging heap-manipulating programs. The results show that CSF generates valid test inputs with high coverage efficiently. Furthermore, we show that CSF can be potentially used in combination with precondition inference tools to reduce the user effort.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes CSF, the first concolic testing engine for heap-manipulating programs based on separation logic. CSF combines specification-based testing and concolic execution to generate test inputs for programs using complex heap objects such as trees and lists. It is evaluated on a set of challenging heap-manipulating programs, with results claimed to show that CSF generates valid test inputs with high coverage efficiently. The work also suggests CSF can be combined with precondition inference tools to reduce user effort.

Significance. If the implementation details, algorithms, and evaluation metrics support the claims, the work would represent a meaningful extension of concolic testing beyond numeric programs into the heap-manipulating domain. The use of separation logic to encode path conditions involving heap structures addresses a recognized challenge, and the hybrid approach with specification-based testing could improve practicality.

major comments (1)
  1. [Abstract] Abstract: the claims of 'high coverage efficiently' and 'generates valid test inputs' are stated without any metrics, implementation details, derivation steps, or evaluation data. This makes it impossible to assess whether the separation-logic encoding is both precise and tractable as asserted in the weakest assumption.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive review and positive assessment of the work's significance. We address the single major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the claims of 'high coverage efficiently' and 'generates valid test inputs' are stated without any metrics, implementation details, derivation steps, or evaluation data. This makes it impossible to assess whether the separation-logic encoding is both precise and tractable as asserted in the weakest assumption.

    Authors: We agree that the abstract, being a high-level summary, does not contain the quantitative metrics, implementation details or derivation steps that appear in Sections 4--6 of the manuscript. The evaluation section does report concrete coverage figures, input validity rates, and runtime measurements on the benchmark programs, together with the separation-logic encoding and its solver integration. To address the concern directly, we will revise the abstract to incorporate a small number of key quantitative results drawn from that evaluation, thereby making the claims more concrete while remaining within the usual length constraints of an abstract. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper describes the design and evaluation of a concolic testing tool CSF that encodes heap path conditions using separation logic and combines it with specification-based testing. The central claims concern the tool's ability to generate valid inputs with high coverage on evaluated programs. No equations, derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided abstract or described claims. The contribution is an engineering artifact whose correctness is assessed empirically rather than through any self-referential mathematical reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; no free parameters, detailed axioms, or invented scientific entities are identifiable from the given text.

axioms (1)
  • domain assumption Separation logic can be applied to encode path conditions for heap objects in concolic execution.
    This is the core technical premise stated in the abstract for making the engine work.

pith-pipeline@v0.9.0 · 5693 in / 1188 out tokens · 25986 ms · 2026-05-24T22:17:45.323435+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages

  1. [1]

    https://blog.trailofbits.com/2016/08/02/ engineering-solutions-to-hard-program-analysis-problems/

    A fuzzer and a symbolic executor walk into a cloud. https://blog.trailofbits.com/2016/08/02/ engineering-solutions-to-hard-program-analysis-problems/

  2. [2]

    https://fbinfer.com/

    Facebook Infer. https://fbinfer.com/

  3. [3]

    https://github.com/bardsoftware/ganttproject

    GanttProject. https://github.com/bardsoftware/ganttproject

  4. [4]

    https://www.eclemma.org/jacoco/

    JaCoCo. https://www.eclemma.org/jacoco/

  5. [5]

    http://plexil.sourceforge.net

    PLEXIL. http://plexil.sourceforge.net

  6. [6]

    https://github.com/nasa/PLEXIL5

    PLEXIL5. https://github.com/nasa/PLEXIL5

  7. [7]

    http://sir.unl.edu/portal/index.php

    SIR. http://sir.unl.edu/portal/index.php

  8. [8]

    https://code.google.com/archive/p/sireum/downloads

    Sireum. https://code.google.com/archive/p/sireum/downloads

  9. [9]

    In: Frankl, P.G

    Boyapati, C., Khurshid, S., Marinov, D.: Korat: Automated Testing Based on Java Predicates. In: Frankl, P.G. (ed.) ISSTA 2002, pp. 123–133. ACM (2002). https://doi.org/10.1145/566172.566191

  10. [10]

    In: Bultan, T., Sen, K

    Braione, P., Denaro, G., Mattavelli, A., Pezz `e, M.: Combining Symbolic Execution and Search-based Testing for Programs with Complex Heap Inputs. In: Bultan, T., Sen, K. (eds.) ISSTA 2017, pp. 90–101. ACM (2017). https://doi.org/10.1145/3092703.3092715

  11. [11]

    In: Nitto, E.D., Harman, M., Heymans, P

    Braione, P., Denaro, G., Pezz `e, M.: Symbolic Execution of Programs with Heap Inputs. In: Nitto, E.D., Harman, M., Heymans, P. (eds.) FSE 2015, pp. 602–613. ACM (2015). https://doi.org/10.1145/2786805.2786842

  12. [12]

    In: Zimmermann, T., Cleland-Huang, J., Su, Z

    Braione, P., Denaro, G., Pezz `e, M.: JBSE: A Symbolic Executor for Java Programs with Complex Heap Inputs. In: Zimmermann, T., Cleland-Huang, J., Su, Z. (eds.) FSE 2016, pp. 1018–1022. ACM (2016). https://doi.org/10.1145/2950290.2983940

  13. [13]

    O’Hearn, and Hongseok Yang

    Calcagno, C., Distefano, D., O’Hearn, P.W., Yang, H.: Compositional Shape Analysis by Means of Bi-Abduction. JACM 58(6), 26:1–26:66 (2011). https://doi.org/10.1145/2049697.2049700

  14. [14]

    Chin, W.N., David, C., Nguyen, H.H., Qin, S.: Automated Verification of Shape, Size and Bag Properties via User-defined Predicates in Separation Logic. Sci. Comput. Program. 77(9), 1006–1036 (2012). https://doi.org/10.1016/j.scico.2010.07.004

  15. [15]

    In: ASE 2006, pp

    Deng, X., Lee, J., Robby: Bogor/Kiasan: A K-bounded Symbolic Execution for Checking Strong Heap Properties of Open Systems. In: ASE 2006, pp. 157–166. IEEE Computer So- ciety (2006). https://doi.org/10.1109/ASE.2006.26

  16. [16]

    In: SEFM 2007

    Deng, X., Robby, Hatcliff, J.: Towards A Case-Optimal Symbolic Execution Algorithm for Analyzing Strong Properties of Object-Oriented Programs. In: SEFM 2007. IEEE Computer Society (2007). https://doi.org/10.1109/SEFM.2007.43

  17. [17]

    Master’s thesis, Massachusetts Institute of Technology, USA (2003)

    Dennis, G.D.: TSAFE : Building a Trusted Computing Base for Air Traffic Control Software. Master’s thesis, Massachusetts Institute of Technology, USA (2003)

  18. [18]

    In: Sarkar, V ., Hall, M.W

    Godefroid, P., Klarlund, N., Sen, K.: DART: Directed Automated Random Test- ing. In: Sarkar, V ., Hall, M.W. (eds.) PLDI 2005, pp. 213–223. ACM (2005). https://doi.org/10.1145/1065010.1065036

  19. [19]

    Queue 10(1), 20:20–20:27 (2012)

    Godefroid, P., Levin, M.Y ., Molnar, D.: SAGE: Whitebox Fuzzing for Security Testing. Queue 10(1), 20:20–20:27 (2012). https://doi.org/10.1145/2090147.2094081

  20. [20]

    In: Petrenko, A., Ulrich, A

    Heimdahl, M.P.E., Rayadurgam, S., Visser, W., Devaraj, G., Gao, J.: Auto-generating Test Sequences Using Model Checkers: A Case Study. In: Petrenko, A., Ulrich, A. (eds.) FATES 2003, pp. 42–59. Springer (2003). https://doi.org/10.1007/978-3-540-24617-6 4

  21. [21]

    In: Jobstmann, B., Leino, K.R.M

    Hillery, B., Mercer, E., Rungta, N., Person, S.: Exact Heap Summaries for Symbolic Execu- tion. In: Jobstmann, B., Leino, K.R.M. (eds.) VMCAI 2016, pp. 206–225. Springer (2016). https://doi.org/10.1007/978-3-662-49122-5 10

  22. [22]

    In: Katoen, J.P., Stevens, P

    Hong, H.S., Lee, I., Sokolsky, O., Ural, H.: A Temporal Logic Based Theory of Test Cover- age and Generation. In: Katoen, J.P., Stevens, P. (eds.) TACAS 2002, pp. 327–341. Springer (2002). https://doi.org/10.1007/3-540-46002-0 23

  23. [23]

    In: Hankin, C., Schmidt, D

    Ishtiaq, S.S., O’Hearn, P.W.: BI as an Assertion Language for Mutable Data Struc- tures. In: Hankin, C., Schmidt, D. (eds.) POPL 2001, pp. 14–26. ACM (2001). https://doi.org/10.1145/360204.375719

  24. [24]

    In: Denney, E., Giannakopoulou, D., Pasareanu, C.S

    Jayaraman, K., Harvison, D., Ganesh, V ., Kiezun, A.: jFuzz: A Concolic Whitebox Fuzzer for Java. In: Denney, E., Giannakopoulou, D., Pasareanu, C.S. (eds.) NFM 2009, pp. 121–125 (2009)

  25. [25]

    In: Garavel, H., Hatcliff, J

    Khurshid, S., P ˘as˘areanu, C.S., Visser, W.: Generalized Symbolic Execution for Model Checking and Testing. In: Garavel, H., Hatcliff, J. (eds.) TACAS 2003, pp. 553–568. Springer (2003). https://doi.org/10.1007/3-540-36577-X 40

  26. [26]

    King, J.C.: Symbolic Execution and Program Testing. Commun. ACM 19(7), 385–394 (1976). https://doi.org/10.1145/360248.360252

  27. [27]

    In: Biere, A., Bloem, R

    Le, Q.L., Gherghina, C., Qin, S., Chin, W.N.: Shape Analysis via Second-Order Bi- Abduction. In: Biere, A., Bloem, R. (eds.) CA V 2014, pp. 52–68. Springer (2014). https://doi.org/10.1007/978-3-319-08867-9 4

  28. [29]

    In: Beyer, D., Huisman, M

    Le, Q.L., Sun, J., Qin, S.: Frame Inference for Inductive Entailment Proofs in Separa- tion Logic. In: Beyer, D., Huisman, M. (eds.) TACAS 2018, pp. 41–60. Springer (2018). https://doi.org/10.1007/978-3-319-89960-2 3

  29. [30]

    In: Majumdar, R., Kuncak, V

    Le, Q.L., Tatsuta, M., Sun, J., Chin, W.: A Decidable Fragment in Separation Logic with Inductive Predicates and Arithmetic. In: Majumdar, R., Kuncak, V . (eds.) CA V 2017, pp. 495–517. Springer (2017). https://doi.org/10.1007/978-3-319-63390-9 26

  30. [31]

    In: ICSME 2016, pp

    Le, X.D., Le, Q.L., Lo, D., Le Goues, C.: Enhancing Automated Program Repair with Deductive Verification. In: ICSME 2016, pp. 428–432. IEEE Computer Society (2016). https://doi.org/10.1109/ICSME.2016.66

  31. [32]

    In: Proc

    Luckow, K., Dimja ˇsevi´c, M., Giannakopoulou, D., Howar, F., Isberner, M., Kahsai, T., Raka- mari´c, Z., Raman, V .: JDart: A Dynamic Symbolic Analysis Framework. In: Chechik, M., Raskin, J.F. (eds.) TACAS 2016, pp. 442–459. Springer (2016). https://doi.org/10.1007/978- 3-662-49674-9 26

  32. [33]

    In: Glinz, M., Murphy, G.C., Pezz`e, M

    Marinescu, P.D., Cadar, C.: Make Test-zesti: A Symbolic Execution Solution for Improving Regression Testing. In: Glinz, M., Murphy, G.C., Pezz`e, M. (eds.) ICSE 2012, pp. 716–726. IEEE Computer Society (2012). https://doi.org/10.1109/ICSE.2012.6227146

  33. [34]

    In: ASE 2001, pp

    Marinov, D., Khurshid, S.: TestEra: A Novel Framework for Automated Test- ing of Java Programs. In: ASE 2001, pp. 22–. IEEE Computer Society (2001). https://doi.org/10.1109/ASE.2001.989787

  34. [35]

    In: ATV A 2019

    Pham, L.H., Le, Q.L., Phan, Q.S., Sun, J., Qin, S.: Enhancing Symbolic Execution of Heap- based Programs with Separation Logic for Test Input Generation. In: ATV A 2019. To appear

  35. [36]

    In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M

    Pham, L.H., Le, Q.L., Phan, Q.S., Sun, J., Qin, S.: Testing Heap-based Programs with Java StarFinder. In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M. (eds.) ICSE 2018, pp. 268–269. ACM (2018). https://doi.org/10.1145/3183440.3194964

  36. [38]

    In: LICS 2002, pp

    Reynolds, J.: Separation Logic: A Logic for Shared Mutable Data Structures. In: LICS 2002, pp. 55–74. IEEE Computer Society (2002). https://doi.org/10.1109/LICS.2002.1029817

  37. [39]

    PACMPL 3(POPL), 66:1–66:31 (2019)

    Santos, J.F., Maksimovi, P., Sampaio, G., Gardner, P.: JaVerT 2.0: Composi- tional Symbolic Execution for JavaScript. PACMPL 3(POPL), 66:1–66:31 (2019). https://doi.org/10.1145/3290379

  38. [40]

    In: Wermelinger, M., Gall, H.C

    Sen, K., Marinov, D., Agha, G.: CUTE: A Concolic Unit Testing Engine for C. In: Wermelinger, M., Gall, H.C. (eds.) FSE 2005, pp. 263–272. ACM (2005). https://doi.org/10.1145/1081706.1081750

  39. [41]

    In: NDSS 2016

    Stephens, N., Grosen, J., Salls, C., Dutcher, A., Wang, R., Corbetta, J., Shoshitaishvili, Y ., Kruegel, C., Vigna, G.: Driller: Augmenting Fuzzing Through Selective Symbolic Execu- tion. In: NDSS 2016. The Internet Society (2016)

  40. [42]

    In: Bertolino, A., Canfora, G., Elbaum, S.G

    Tanno, H., Zhang, X., Hoshino, T., Sen, K.: TesMa and CATG: Automated Test Generation Tools for Models of Enterprise Applications. In: Bertolino, A., Canfora, G., Elbaum, S.G. (eds.) ICSE 2015, pp. 717–720. IEEE Computer Society (2015). https://doi.org/10.1109/ICSE.2015.231

  41. [43]

    Scalable Funding of Bitcoin Micropayment Channel Networks

    Tillmann, N., De Halleux, J.: Pex-White Box Test Generation for .NET. In: Beckert, B., H¨ahnle, R. (eds.) TAP 2008, pp. 134–153. Springer (2008). https://doi.org/10.1007/978-3- 540-79124-9 10

  42. [44]

    In: Kowalewski, S., Philippou, A

    Vanoverberghe, D., Tillmann, N., Piessens, F.: Test Input Generation for Programs with Pointers. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009, pp. 277–291. Springer (2009). https://doi.org/10.1007/978-3-642-00768-2 25

  43. [45]

    In: Avrunin, G.S., Rothermel, G

    Visser, W., P ˇasˇareanu, C.S., Khurshid, S.: Test Input Generation with Java PathFinder. In: Avrunin, G.S., Rothermel, G. (eds.) ISSTA 2004, pp. 97–107. ACM (2004). https://doi.org/10.1145/1007512.1007526

  44. [46]

    In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M

    Wang, X., Sun, J., Chen, Z., Zhang, P., Wang, J., Lin, Y .: Towards Optimal Concolic Testing. In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M. (eds.) ICSE 2018, pp. 291–302 (2018). https://doi.org/10.1145/3180155.3180177

  45. [47]

    In: Enck, W., Felt, A.P

    Yun, I., Lee, S., Xu, M., Jang, Y ., Kim, T.: QSYM : A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing. In: Enck, W., Felt, A.P. (eds.) USENIX Security 2018, pp. 745–761. USENIX Association (2018)