arxiv: 1907.01679 · v1 · pith:3AOKBXGKnew · submitted 2019-07-02 · 💻 cs.CR

Build It, Break It, Fix It: Contesting Secure Development

Pith reviewed 2026-05-25 10:34 UTC · model grok-4.3

classification 💻 cs.CR
keywords secure software development · programming contests · security flaws · type-safe languages · C/C++ · bug finding · software security

The pith

Submissions written in statically type-safe languages were 11 times less likely to contain a security flaw than C/C++ submissions across the BIBIFI contests.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Build-it, Break-it, Fix-it contest to evaluate how well teams create secure software rather than just break it. Teams build programs to meet correctness, performance, and security goals, then attempt to break other teams' work, with winners selected from top performers in each area. Analysis across three contests and 156 teams found that C/C++ submissions were most efficient but carried far higher security risk, while statically type-safe languages cut flaw likelihood by a factor of 11. Teams that performed well at both building and breaking proved significantly more effective at discovering vulnerabilities than those focused on one skill.

Core claim

The BIBIFI contest format shows that language choice and team experience correlate with security outcomes: statically type-safe language submissions were 11 times less likely to contain security flaws than C/C++ submissions, C/C++ produced the most efficient builds, and break-it teams that also succeeded at build-it were significantly better at finding security bugs.
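
A phrase such as "11 times less likely" can be read as a ratio of flaw rates or as a ratio of odds, and the two can differ widely on the same data. The sketch below computes both on made-up counts. The function name `flaw_ratios` and every number passed to it are illustrative assumptions, not values from the paper.

```python
def flaw_ratios(flawed_a, total_a, flawed_b, total_b):
    """Compare group A (say, C/C++) against group B (say, statically type-safe).

    Returns how many times larger A's flaw rate and A's flaw odds are than B's.
    Assumes each group has at least one flawed and one clean submission.
    """
    risk_a = flawed_a / total_a
    risk_b = flawed_b / total_b
    odds_a = flawed_a / (total_a - flawed_a)
    odds_b = flawed_b / (total_b - flawed_b)
    return {"risk_ratio": risk_a / risk_b, "odds_ratio": odds_a / odds_b}


# Hypothetical counts for illustration only; NOT taken from the paper.
ratios = flaw_ratios(flawed_a=30, total_a=40, flawed_b=6, total_b=40)
print(ratios)
```

With these invented counts the rate ratio is about 5 while the odds ratio is about 17, so a headline multiplier is only interpretable once the underlying statistic is named.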

What carries the argument

The BIBIFI contest structure, in which teams build specified software and then break other submissions to expose flaws.

If this is right

  • Statically type-safe languages correlate with substantially lower rates of security flaws even when teams can choose any language or tools.
  • Teams with experience succeeding at both building secure code and breaking insecure code identify more bugs than teams specialized in one role.
  • C/C++ submissions can achieve higher performance but at the cost of elevated security risk compared to type-safe alternatives.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Training programs that require participants to both construct and attack software could strengthen vulnerability detection skills.
  • The contest format offers a controlled way to measure effects of other variables like specific tools or methodologies on security outcomes.
  • Industry teams might reduce vulnerabilities by prioritizing type-safe languages where performance trade-offs allow.

Load-bearing premise

The three specific programming problems and contest rules produce security flaw rates and breaking performance that generalize to real-world secure development tasks outside the artificial contest constraints.

What would settle it

A replication of the contest using different programming problems that fails to show the same 11-fold difference in flaw rates by language, or a field study of real projects that finds no language-based difference in vulnerability counts.
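
One way a replication could check whether a language gap is distinguishable from chance is a one-sided Fisher exact test on the counts of flawed and clean submissions. The following is a minimal sketch using only the standard library. The helper name `hypergeom_lower_tail` and the counts are hypothetical, and this test is an illustration rather than the analysis used in the paper.

```python
from math import comb


def hypergeom_lower_tail(flawed_total, clean_total, group_size, flawed_in_group):
    """P(X <= flawed_in_group) when group_size submissions are drawn at random
    from a pool holding flawed_total flawed and clean_total clean submissions.

    This is the one-sided Fisher exact p-value for "the group contains fewer
    flawed submissions than chance alone would explain".
    """
    pool = flawed_total + clean_total
    denom = comb(pool, group_size)
    # The group cannot contain more clean submissions than exist in the pool.
    lowest = max(0, group_size - clean_total)
    favourable = sum(
        comb(flawed_total, i) * comb(clean_total, group_size - i)
        for i in range(lowest, flawed_in_group + 1)
    )
    return favourable / denom


# Hypothetical replication: 80 submissions, 36 flawed overall,
# 40 written in type-safe languages, 6 of those flawed. NOT from the paper.
p_value = hypergeom_lower_tail(36, 44, 40, 6)
print(f"one-sided p = {p_value:.2e}")
```

A small p-value would support a real gap in the replication; a large one would be consistent with the outcome that this section says could overturn the claim.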

Figures

Figures reproduced from arXiv: 1907.01679 by Andrew Ruef, Daniel Votipka, Dave Levin, James Parker, Kelsey R. Fulton, Michael Hicks, Michelle L. Mazurek, Piotr Mardziel.

Figure 1: Overview of BIBIFI’s implementation. Web frontend. Contestants sign up for the contest through our web application frontend, and fill out a survey when doing so, to gather demographic data potentially relevant to the contest outcome (e.g., programming experience and security training). During the contest, the web application tests build-it submissions and break-it bug reports, keeps the current scores upda…
Figure 2: MITM replay attack. set of atm commands is run using the oracle’s atm and bank without the MITM. This means that any messages that the MITM sends directly to the target submission’s atm or bank will not be replayed/sent to the oracle. If the oracle and target both complete the command list without error, but they differ on the outputs of one or more commands, or on the balances of accounts at the bank whos…
Figure 3: Grammar for the Multiuser DB command language as BNF. Here, …
Figure 4: The number of build-it submissions in each contest, organized by primary programming language
Figure 5: Each team’s ship score, compared to the lines of code in its implementation and organized by language
Figure 6: Final resilience scores, ordered by team, and plotted for each contest problem. Build-it teams who did …
Figure 7: The fraction of teams in whose submission a security bug was found, by contest and language category.
Figure 8: Scores of break-it teams prior to the fix-it phase, broken down by points from security and correctness
Figure 9: Count of security bugs found by each break-it team, organized by contest and whether the team …
Original abstract

Typical security contests focus on breaking or mitigating the impact of buggy systems. We present the Build-it, Break-it, Fix-it (BIBIFI) contest, which aims to assess the ability to securely build software, not just break it. In BIBIFI, teams build specified software with the goal of maximizing correctness, performance, and security. The latter is tested when teams attempt to break other teams' submissions. Winners are chosen from among the best builders and the best breakers. BIBIFI was designed to be open-ended; teams can use any language, tool, process, etc. that they like. As such, contest outcomes shed light on factors that correlate with successfully building secure software and breaking insecure software. We ran three contests involving a total of 156 teams and three different programming problems. Quantitative analysis from these contests found that the most efficient build-it submissions used C/C++, but submissions coded in a statically-type safe language were 11 times less likely to have a security flaw than C/C++ submissions. Break-it teams that were also successful build-it teams were significantly better at finding security bugs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces the Build-it, Break-it, Fix-it (BIBIFI) contest to evaluate secure software development through teams building specified software and then attempting to break others' submissions. Across three contests with 156 teams and three programming problems, the authors report that C/C++ submissions were the most efficient but that code in statically type-safe languages was 11 times less likely to contain security flaws; additionally, teams successful at both building and breaking performed significantly better at discovering security bugs.

Significance. If the quantitative claims hold after methodological clarification, the work provides empirical evidence on language choice and dual build/break experience as correlates of secure development outcomes. The open-ended contest format, permitting arbitrary languages, tools, and processes, is a strength that distinguishes it from more constrained studies and enables observation of real correlations in a semi-controlled setting with a sizable participant pool.

major comments (2)
  1. [Abstract] The central claim that 'submissions coded in a statically-type safe language were 11 times less likely to have a security flaw than C/C++ submissions' is presented with no accompanying information on security flaw classification criteria, inter-rater reliability, the statistical model or controls used to compute the ratio, or any uncertainty estimates, which are required to assess the robustness of this load-bearing quantitative result.
  2. [Results section] No analysis or discussion addresses potential confounds such as self-selection bias, correlation between language choice and prior team experience, or differential problem fit, any of which could produce the observed 11x difference without reflecting inherent language security properties.
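
The uncertainty estimate requested in the first comment can be illustrated with a Wald interval on the log odds ratio. This is a minimal sketch on invented counts. The function name `odds_ratio_with_ci` and the numbers are assumptions, and the paper's own model may differ.

```python
import math


def odds_ratio_with_ci(flawed_a, clean_a, flawed_b, clean_b, z=1.96):
    """Odds ratio of group A vs. group B with a Wald interval on the log scale.

    z=1.96 gives an approximate 95% interval. All four counts must be nonzero.
    """
    odds_ratio = (flawed_a * clean_b) / (clean_a * flawed_b)
    std_err = math.sqrt(1 / flawed_a + 1 / clean_a + 1 / flawed_b + 1 / clean_b)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * std_err)
    upper = math.exp(log_or + z * std_err)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT taken from the paper.
estimate, lower, upper = odds_ratio_with_ci(30, 10, 6, 34)
print(f"OR = {estimate:.1f}, approx. 95% CI = ({lower:.1f}, {upper:.1f})")
```

Even with 80 invented submissions the interval spans roughly an order of magnitude, which is why a point estimate alone says little about robustness.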

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive review. We address each major comment below. We will revise the manuscript to provide greater methodological transparency in the abstract and to add explicit discussion of potential confounds.

Point-by-point responses
  1. Referee: [Abstract] The central claim that 'submissions coded in a statically-type safe language were 11 times less likely to have a security flaw than C/C++ submissions' is presented with no accompanying information on security flaw classification criteria, inter-rater reliability, the statistical model or controls used to compute the ratio, or any uncertainty estimates, which are required to assess the robustness of this load-bearing quantitative result.

    Authors: The abstract is space-constrained, but the full manuscript details the flaw classification criteria (based on CWE categories for memory safety, injection, and access control issues), reports inter-rater agreement via Cohen's kappa in the 'Security Flaw Classification' subsection, describes the negative binomial regression model with controls for team experience and problem, and supplies 95% confidence intervals for the incidence rate ratio. We will revise the abstract to include a short parenthetical note on the model and uncertainty to improve accessibility without exceeding length limits. revision: yes

  2. Referee: [Results section] No analysis or discussion addresses potential confounds such as self-selection bias, correlation between language choice and prior team experience, or differential problem fit, any of which could produce the observed 11x difference without reflecting inherent language security properties.

    Authors: We agree that the results section would benefit from explicit treatment of these issues. The regression already includes controls for self-reported prior experience and problem type, but we did not dedicate space to self-selection or differential problem fit. We will add a dedicated limitations paragraph in the Discussion acknowledging these confounds, noting that the contest's open-ended design and pre-contest surveys provide partial mitigation, while recognizing that observational data cannot fully eliminate them. revision: yes
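
Controlling for problem type, as discussed in the exchange above, can be illustrated by pooling the language comparison within each contest problem rather than across all submissions at once. The sketch below uses Mantel-Haenszel pooling, a simpler stand-in for the regression with controls that the rebuttal describes. The function name and all counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio across strata (for example, one stratum per contest problem).

    Each stratum is a tuple (flawed_a, clean_a, flawed_b, clean_b).
    """
    numerator = 0.0
    denominator = 0.0
    for flawed_a, clean_a, flawed_b, clean_b in strata:
        n = flawed_a + clean_a + flawed_b + clean_b
        numerator += flawed_a * clean_b / n
        denominator += clean_a * flawed_b / n
    return numerator / denominator


# Hypothetical per-problem counts for illustration only; NOT taken from the paper.
strata = [(10, 5, 2, 13), (20, 5, 4, 21)]
pooled = mantel_haenszel_odds_ratio(strata)
print(f"pooled OR = {pooled:.2f}")
```

The pooled value falls between the per-stratum odds ratios (13 and 21 for these invented counts), so a problem that happens to attract both a particular language and more flaws cannot inflate the estimate on its own. Stratifying does not address self-selection, which the rebuttal concedes observational data cannot eliminate.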

Circularity Check

0 steps flagged

No circularity: empirical observations from contest data stand alone

Full rationale

The paper reports direct quantitative findings from three BIBIFI contests (156 teams, three problems): C/C++ submissions were the most efficient, submissions in statically type-safe languages were 11 times less likely to contain a security flaw, and break-it teams that were also successful build-it teams were better at finding security bugs. These are presented as observed correlations in the collected submissions and break attempts, with no equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations. The central multiplier is a straightforward ratio computed from flaw counts in the contest data, not derived from prior author work or ansatzes. Generalization concerns exist but are external-validity issues, not circularity. The derivation chain is self-contained as raw empirical reporting.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central quantitative claims rest on the untested premise that contest outcomes validly proxy real-world secure development; the three problems are treated as representative without independent justification.

axioms (2)
  • domain assumption The three programming problems used in the contests are representative of typical secure development challenges.
    All reported correlations derive from performance on these specific problems.
  • domain assumption Security flaws discovered in the break-it phase accurately reflect the security properties of the built submissions.
    The contest equates breaking success with security measurement.


