Concolic Testing Heap-Manipulating Programs
Pith reviewed 2026-05-24 22:17 UTC · model grok-4.3
The pith
CSF is the first concolic testing engine for heap-manipulating programs that uses separation logic to generate valid test inputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose the first concolic testing engine called CSF for heap-manipulating programs based on separation logic. CSF effectively combines specification-based testing and concolic execution for test input generation. It is evaluated on a set of challenging heap-manipulating programs. The results show that CSF generates valid test inputs with high coverage efficiently. Furthermore, we show that CSF can be potentially used in combination with precondition inference tools to reduce the user effort.
What carries the argument
The CSF engine, which encodes path conditions over heap objects using separation logic to drive concolic execution and input generation.
If this is right
- Valid test inputs are produced for programs whose correctness depends on heap shapes.
- High statement and branch coverage is reached without excessive runtime cost.
- User effort drops when CSF is paired with tools that automatically infer input preconditions.
- The same engine can be applied to other pointer-based data structures beyond the evaluated examples.
Where Pith is reading between the lines
- The approach may extend naturally to languages whose memory model differs from the one assumed in the evaluation.
- Combining CSF with static analyses could further prune infeasible paths before concolic search begins.
Load-bearing premise
Separation logic can encode and solve path conditions over heap objects with both precision and practical computational cost.
What would settle it
An evaluation in which CSF is run on the same set of heap-manipulating programs yet produces mostly invalid inputs or fails to reach the reported coverage levels.
Figures
read the original abstract
Concolic testing is a test generation technique which works effectively by integrating random testing generation and symbolic execution. Existing concolic testing engines focus on numeric programs. Heap-manipulating programs make extensive use of complex heap objects like trees and lists. Testing such programs is challenging due to multiple reasons. Firstly, test inputs for such program are required to satisfy non-trivial constraints which must be specified precisely. Secondly, precisely encoding and solving path conditions in such programs are challenging and often expensive. In this work, we propose the first concolic testing engine called CSF for heap-manipulating programs based on separation logic. CSF effectively combines specification-based testing and concolic execution for test input generation. It is evaluated on a set of challenging heap-manipulating programs. The results show that CSF generates valid test inputs with high coverage efficiently. Furthermore, we show that CSF can be potentially used in combination with precondition inference tools to reduce the user effort.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes CSF, the first concolic testing engine for heap-manipulating programs based on separation logic. CSF combines specification-based testing and concolic execution to generate test inputs for programs using complex heap objects such as trees and lists. It is evaluated on a set of challenging heap-manipulating programs, with results claimed to show that CSF generates valid test inputs with high coverage efficiently. The work also suggests CSF can be combined with precondition inference tools to reduce user effort.
Significance. If the implementation details, algorithms, and evaluation metrics support the claims, the work would represent a meaningful extension of concolic testing beyond numeric programs into the heap-manipulating domain. The use of separation logic to encode path conditions involving heap structures addresses a recognized challenge, and the hybrid approach with specification-based testing could improve practicality.
major comments (1)
- [Abstract] Abstract: the claims of 'high coverage efficiently' and 'generates valid test inputs' are stated without any metrics, implementation details, derivation steps, or evaluation data. This makes it impossible to assess whether the separation-logic encoding is both precise and tractable as asserted in the weakest assumption.
Simulated Author's Rebuttal
We thank the referee for the constructive review and positive assessment of the work's significance. We address the single major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the claims of 'high coverage efficiently' and 'generates valid test inputs' are stated without any metrics, implementation details, derivation steps, or evaluation data. This makes it impossible to assess whether the separation-logic encoding is both precise and tractable as asserted in the weakest assumption.
Authors: We agree that the abstract, being a high-level summary, does not contain the quantitative metrics, implementation details or derivation steps that appear in Sections 4--6 of the manuscript. The evaluation section does report concrete coverage figures, input validity rates, and runtime measurements on the benchmark programs, together with the separation-logic encoding and its solver integration. To address the concern directly, we will revise the abstract to incorporate a small number of key quantitative results drawn from that evaluation, thereby making the claims more concrete while remaining within the usual length constraints of an abstract. revision: yes
Circularity Check
No significant circularity identified
full rationale
The paper describes the design and evaluation of a concolic testing tool CSF that encodes heap path conditions using separation logic and combines it with specification-based testing. The central claims concern the tool's ability to generate valid inputs with high coverage on evaluated programs. No equations, derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided abstract or described claims. The contribution is an engineering artifact whose correctness is assessed empirically rather than through any self-referential mathematical reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Separation logic can be applied to encode path conditions for heap objects in concolic execution.
Reference graph
Works this paper leans on
-
[1]
https://blog.trailofbits.com/2016/08/02/ engineering-solutions-to-hard-program-analysis-problems/
A fuzzer and a symbolic executor walk into a cloud. https://blog.trailofbits.com/2016/08/02/ engineering-solutions-to-hard-program-analysis-problems/
work page 2016
- [2]
-
[3]
https://github.com/bardsoftware/ganttproject
GanttProject. https://github.com/bardsoftware/ganttproject
- [4]
- [5]
- [6]
- [7]
-
[8]
https://code.google.com/archive/p/sireum/downloads
Sireum. https://code.google.com/archive/p/sireum/downloads
-
[9]
Boyapati, C., Khurshid, S., Marinov, D.: Korat: Automated Testing Based on Java Predicates. In: Frankl, P.G. (ed.) ISSTA 2002, pp. 123–133. ACM (2002). https://doi.org/10.1145/566172.566191
-
[10]
Braione, P., Denaro, G., Mattavelli, A., Pezz `e, M.: Combining Symbolic Execution and Search-based Testing for Programs with Complex Heap Inputs. In: Bultan, T., Sen, K. (eds.) ISSTA 2017, pp. 90–101. ACM (2017). https://doi.org/10.1145/3092703.3092715
-
[11]
In: Nitto, E.D., Harman, M., Heymans, P
Braione, P., Denaro, G., Pezz `e, M.: Symbolic Execution of Programs with Heap Inputs. In: Nitto, E.D., Harman, M., Heymans, P. (eds.) FSE 2015, pp. 602–613. ACM (2015). https://doi.org/10.1145/2786805.2786842
-
[12]
In: Zimmermann, T., Cleland-Huang, J., Su, Z
Braione, P., Denaro, G., Pezz `e, M.: JBSE: A Symbolic Executor for Java Programs with Complex Heap Inputs. In: Zimmermann, T., Cleland-Huang, J., Su, Z. (eds.) FSE 2016, pp. 1018–1022. ACM (2016). https://doi.org/10.1145/2950290.2983940
-
[13]
Calcagno, C., Distefano, D., O’Hearn, P.W., Yang, H.: Compositional Shape Analysis by Means of Bi-Abduction. JACM 58(6), 26:1–26:66 (2011). https://doi.org/10.1145/2049697.2049700
-
[14]
Chin, W.N., David, C., Nguyen, H.H., Qin, S.: Automated Verification of Shape, Size and Bag Properties via User-defined Predicates in Separation Logic. Sci. Comput. Program. 77(9), 1006–1036 (2012). https://doi.org/10.1016/j.scico.2010.07.004
-
[15]
Deng, X., Lee, J., Robby: Bogor/Kiasan: A K-bounded Symbolic Execution for Checking Strong Heap Properties of Open Systems. In: ASE 2006, pp. 157–166. IEEE Computer So- ciety (2006). https://doi.org/10.1109/ASE.2006.26
-
[16]
Deng, X., Robby, Hatcliff, J.: Towards A Case-Optimal Symbolic Execution Algorithm for Analyzing Strong Properties of Object-Oriented Programs. In: SEFM 2007. IEEE Computer Society (2007). https://doi.org/10.1109/SEFM.2007.43
-
[17]
Master’s thesis, Massachusetts Institute of Technology, USA (2003)
Dennis, G.D.: TSAFE : Building a Trusted Computing Base for Air Traffic Control Software. Master’s thesis, Massachusetts Institute of Technology, USA (2003)
work page 2003
-
[18]
Godefroid, P., Klarlund, N., Sen, K.: DART: Directed Automated Random Test- ing. In: Sarkar, V ., Hall, M.W. (eds.) PLDI 2005, pp. 213–223. ACM (2005). https://doi.org/10.1145/1065010.1065036
-
[19]
Queue 10(1), 20:20–20:27 (2012)
Godefroid, P., Levin, M.Y ., Molnar, D.: SAGE: Whitebox Fuzzing for Security Testing. Queue 10(1), 20:20–20:27 (2012). https://doi.org/10.1145/2090147.2094081
-
[20]
Heimdahl, M.P.E., Rayadurgam, S., Visser, W., Devaraj, G., Gao, J.: Auto-generating Test Sequences Using Model Checkers: A Case Study. In: Petrenko, A., Ulrich, A. (eds.) FATES 2003, pp. 42–59. Springer (2003). https://doi.org/10.1007/978-3-540-24617-6 4
-
[21]
In: Jobstmann, B., Leino, K.R.M
Hillery, B., Mercer, E., Rungta, N., Person, S.: Exact Heap Summaries for Symbolic Execu- tion. In: Jobstmann, B., Leino, K.R.M. (eds.) VMCAI 2016, pp. 206–225. Springer (2016). https://doi.org/10.1007/978-3-662-49122-5 10
-
[22]
Hong, H.S., Lee, I., Sokolsky, O., Ural, H.: A Temporal Logic Based Theory of Test Cover- age and Generation. In: Katoen, J.P., Stevens, P. (eds.) TACAS 2002, pp. 327–341. Springer (2002). https://doi.org/10.1007/3-540-46002-0 23
-
[23]
Ishtiaq, S.S., O’Hearn, P.W.: BI as an Assertion Language for Mutable Data Struc- tures. In: Hankin, C., Schmidt, D. (eds.) POPL 2001, pp. 14–26. ACM (2001). https://doi.org/10.1145/360204.375719
-
[24]
In: Denney, E., Giannakopoulou, D., Pasareanu, C.S
Jayaraman, K., Harvison, D., Ganesh, V ., Kiezun, A.: jFuzz: A Concolic Whitebox Fuzzer for Java. In: Denney, E., Giannakopoulou, D., Pasareanu, C.S. (eds.) NFM 2009, pp. 121–125 (2009)
work page 2009
-
[25]
Khurshid, S., P ˘as˘areanu, C.S., Visser, W.: Generalized Symbolic Execution for Model Checking and Testing. In: Garavel, H., Hatcliff, J. (eds.) TACAS 2003, pp. 553–568. Springer (2003). https://doi.org/10.1007/3-540-36577-X 40
-
[26]
King, J.C.: Symbolic Execution and Program Testing. Commun. ACM 19(7), 385–394 (1976). https://doi.org/10.1145/360248.360252
-
[27]
Le, Q.L., Gherghina, C., Qin, S., Chin, W.N.: Shape Analysis via Second-Order Bi- Abduction. In: Biere, A., Bloem, R. (eds.) CA V 2014, pp. 52–68. Springer (2014). https://doi.org/10.1007/978-3-319-08867-9 4
-
[29]
Le, Q.L., Sun, J., Qin, S.: Frame Inference for Inductive Entailment Proofs in Separa- tion Logic. In: Beyer, D., Huisman, M. (eds.) TACAS 2018, pp. 41–60. Springer (2018). https://doi.org/10.1007/978-3-319-89960-2 3
-
[30]
Le, Q.L., Tatsuta, M., Sun, J., Chin, W.: A Decidable Fragment in Separation Logic with Inductive Predicates and Arithmetic. In: Majumdar, R., Kuncak, V . (eds.) CA V 2017, pp. 495–517. Springer (2017). https://doi.org/10.1007/978-3-319-63390-9 26
-
[31]
Le, X.D., Le, Q.L., Lo, D., Le Goues, C.: Enhancing Automated Program Repair with Deductive Verification. In: ICSME 2016, pp. 428–432. IEEE Computer Society (2016). https://doi.org/10.1109/ICSME.2016.66
-
[32]
Luckow, K., Dimja ˇsevi´c, M., Giannakopoulou, D., Howar, F., Isberner, M., Kahsai, T., Raka- mari´c, Z., Raman, V .: JDart: A Dynamic Symbolic Analysis Framework. In: Chechik, M., Raskin, J.F. (eds.) TACAS 2016, pp. 442–459. Springer (2016). https://doi.org/10.1007/978- 3-662-49674-9 26
-
[33]
In: Glinz, M., Murphy, G.C., Pezz`e, M
Marinescu, P.D., Cadar, C.: Make Test-zesti: A Symbolic Execution Solution for Improving Regression Testing. In: Glinz, M., Murphy, G.C., Pezz`e, M. (eds.) ICSE 2012, pp. 716–726. IEEE Computer Society (2012). https://doi.org/10.1109/ICSE.2012.6227146
-
[34]
Marinov, D., Khurshid, S.: TestEra: A Novel Framework for Automated Test- ing of Java Programs. In: ASE 2001, pp. 22–. IEEE Computer Society (2001). https://doi.org/10.1109/ASE.2001.989787
-
[35]
Pham, L.H., Le, Q.L., Phan, Q.S., Sun, J., Qin, S.: Enhancing Symbolic Execution of Heap- based Programs with Separation Logic for Test Input Generation. In: ATV A 2019. To appear
work page 2019
-
[36]
In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M
Pham, L.H., Le, Q.L., Phan, Q.S., Sun, J., Qin, S.: Testing Heap-based Programs with Java StarFinder. In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M. (eds.) ICSE 2018, pp. 268–269. ACM (2018). https://doi.org/10.1145/3183440.3194964
-
[38]
Reynolds, J.: Separation Logic: A Logic for Shared Mutable Data Structures. In: LICS 2002, pp. 55–74. IEEE Computer Society (2002). https://doi.org/10.1109/LICS.2002.1029817
-
[39]
PACMPL 3(POPL), 66:1–66:31 (2019)
Santos, J.F., Maksimovi, P., Sampaio, G., Gardner, P.: JaVerT 2.0: Composi- tional Symbolic Execution for JavaScript. PACMPL 3(POPL), 66:1–66:31 (2019). https://doi.org/10.1145/3290379
-
[40]
In: Wermelinger, M., Gall, H.C
Sen, K., Marinov, D., Agha, G.: CUTE: A Concolic Unit Testing Engine for C. In: Wermelinger, M., Gall, H.C. (eds.) FSE 2005, pp. 263–272. ACM (2005). https://doi.org/10.1145/1081706.1081750
-
[41]
Stephens, N., Grosen, J., Salls, C., Dutcher, A., Wang, R., Corbetta, J., Shoshitaishvili, Y ., Kruegel, C., Vigna, G.: Driller: Augmenting Fuzzing Through Selective Symbolic Execu- tion. In: NDSS 2016. The Internet Society (2016)
work page 2016
-
[42]
In: Bertolino, A., Canfora, G., Elbaum, S.G
Tanno, H., Zhang, X., Hoshino, T., Sen, K.: TesMa and CATG: Automated Test Generation Tools for Models of Enterprise Applications. In: Bertolino, A., Canfora, G., Elbaum, S.G. (eds.) ICSE 2015, pp. 717–720. IEEE Computer Society (2015). https://doi.org/10.1109/ICSE.2015.231
-
[43]
Scalable Funding of Bitcoin Micropayment Channel Networks
Tillmann, N., De Halleux, J.: Pex-White Box Test Generation for .NET. In: Beckert, B., H¨ahnle, R. (eds.) TAP 2008, pp. 134–153. Springer (2008). https://doi.org/10.1007/978-3- 540-79124-9 10
-
[44]
In: Kowalewski, S., Philippou, A
Vanoverberghe, D., Tillmann, N., Piessens, F.: Test Input Generation for Programs with Pointers. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009, pp. 277–291. Springer (2009). https://doi.org/10.1007/978-3-642-00768-2 25
-
[45]
In: Avrunin, G.S., Rothermel, G
Visser, W., P ˇasˇareanu, C.S., Khurshid, S.: Test Input Generation with Java PathFinder. In: Avrunin, G.S., Rothermel, G. (eds.) ISSTA 2004, pp. 97–107. ACM (2004). https://doi.org/10.1145/1007512.1007526
-
[46]
In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M
Wang, X., Sun, J., Chen, Z., Zhang, P., Wang, J., Lin, Y .: Towards Optimal Concolic Testing. In: Chaudron, M., Crnkovic, I., Chechik, M., Harman, M. (eds.) ICSE 2018, pp. 291–302 (2018). https://doi.org/10.1145/3180155.3180177
-
[47]
Yun, I., Lee, S., Xu, M., Jang, Y ., Kim, T.: QSYM : A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing. In: Enck, W., Felt, A.P. (eds.) USENIX Security 2018, pp. 745–761. USENIX Association (2018)
work page 2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.