Towards Lakosian Multilingual Software Design Principles
Pith reviewed 2026-05-25 19:54 UTC · model grok-4.3
The pith
Lakosian physical design rules are extended to multilingual software using pybind11.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper proposes an extension to the Lakosian C++ design rules for multilingual software that uses pybind11, measures compliance with those rules across 50 repositories using the MLSA toolkit, and closes with a proposed generalization to any FFI.
What carries the argument
The proposed extensions to Lakosian physical design rules for the pybind11 foreign function interface, which address the opacity that blocks common analysis tools.
If this is right
- Physical design principles can apply to cross-language boundaries in FFI-based systems.
- Automated measurement of compliance becomes possible for multilingual code.
- General rules can be developed that apply beyond pybind11 to other FFIs.
- Debugging and security analysis tools can operate more effectively on multilingual software.
Where Pith is reading between the lines
- Similar extensions might improve design practices for other combinations of languages and interfaces.
- Adopting these rules could influence how large systems are structured to minimize cross-language dependencies.
- The measurement on GitHub repos provides a baseline for tracking improvements in multilingual design over time.
Load-bearing premise
That the Lakosian physical design methodology can be extended to pybind11 while preserving its benefits and that the sample of 50 repositories indicates typical practice.
What would settle it
Finding that the proposed rules do not reduce the opacity of pybind11 calls or that a different sample of repositories shows very different compliance rates.
Original abstract
Large software systems often comprise programs written in different programming languages. In the case when cross-language interoperability is accomplished with a Foreign Function Interface (FFI), for example pybind11, Boost.Python, Emscripten, PyV8, or JNI, among many others, common software engineering tools, such as call-graph analysis, are obstructed by the opacity of the FFI. This complicates debugging and fosters potential inefficiency and security problems. One contributing issue is that there is little rigorous software design advice for multilingual software. In this paper, we present our progress towards a more rigorous design approach to multilingual software. The approach is based on the existing approach to the design of large-scale C++ systems developed by Lakos. The Lakosian approach is one of the few design methodologies to address physical design rather than just logical design. Using the MLSA toolkit developed in prior work for analysis of multilingual software, we focus in on one FFI -- the pybind11 FFI. An extension to the Lakosian C++ design rules is proposed to address multilingual software that uses pybind11. Using a sample of 50 public GitHub repositories that use pybind11, we measure how many repositories would currently satisfy these rules. We conclude with a proposed generalization of the pybind11-based rules for any multilingual software using an FFI interface.
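The opacity the abstract describes can be illustrated from the Python side. The sketch below is not the MLSA toolkit and not the authors' method; it is a minimal single-language call-target scan using the standard `ast` module. The module name `fastmath` and the function names in `EXAMPLE` are hypothetical. The scan reports every call whose root name is imported rather than defined locally, which is exactly where a Python-only call graph ends when the imported module is a compiled pybind11 extension.

```python
import ast


def _dotted_name(expr):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(expr, ast.Attribute):
        parts.append(expr.attr)
        expr = expr.value
    if isinstance(expr, ast.Name):
        parts.append(expr.id)
        return ".".join(reversed(parts))
    return None


def unresolved_calls(source):
    """List call targets whose root name comes from an import statement."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)

    targets = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name and name.split(".")[0] in imported:
                targets.add(name)
    return sorted(targets)


# Hypothetical Python client of a hypothetical pybind11 extension module.
EXAMPLE = '''
import fastmath

def scale(x):
    return fastmath.multiply(x, 2)

def main():
    return scale(21)
'''

print(unresolved_calls(EXAMPLE))
```

For `EXAMPLE`, the only reported target is `fastmath.multiply`: the analyzer can see that the call leaves Python but has no way to follow it into the C++ definition without also reading the binding code.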
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an extension of Lakosian physical-design rules (component dependencies, acyclic dependencies) to multilingual C++/Python systems that use the pybind11 FFI, presents explicit new rules for the FFI boundary, and reports a compliance measurement performed with the MLSA toolkit on a sample of 50 public GitHub repositories that employ pybind11. It concludes by sketching a generalization to arbitrary FFIs.
Significance. If the proposed rules can be shown to retain the claimed Lakosian benefits (reduced cross-language opacity, lower debugging and security costs) while remaining practical, the work would supply one of the first rigorous physical-design guidelines for multilingual code. The use of an existing analysis toolkit and a concrete measurement step on real repositories are positive steps toward falsifiable claims.
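The acyclic-dependency rule named in the summary can be sketched as a levelization check over a component graph that includes the FFI boundary. This is an illustration under stated assumptions, not the paper's rule set, which is not reproduced in the text reviewed here. The component names are hypothetical, and the convention that a component with no local dependencies sits at level 1 follows the usual presentation of Lakos levelization.

```python
def levelize(deps):
    """Assign a level to every component, raising ValueError on a cycle.

    `deps` maps a component name to the set of components it depends on.
    A component with no dependencies gets level 1; otherwise its level is
    one more than the highest level among its dependencies.
    """
    levels = {}
    visiting = set()

    def visit(name):
        if name in levels:
            return levels[name]
        if name in visiting:
            raise ValueError(f"cyclic dependency through {name!r}")
        visiting.add(name)
        children = deps.get(name, ())
        level = 1 + max((visit(child) for child in children), default=0)
        visiting.discard(name)
        levels[name] = level
        return level

    for name in deps:
        visit(name)
    return levels


# Hypothetical layering: Python client -> pybind11 bindings -> C++ core.
ACYCLIC = {
    "app.py": {"bindings.cpp"},
    "bindings.cpp": {"core.cpp"},
    "core.cpp": set(),
}

# Same graph, but the C++ core now depends back on the Python client.
CYCLIC = dict(ACYCLIC, **{"core.cpp": {"app.py"}})

print(levelize(ACYCLIC))
```

The first graph levelizes with the core at level 1, the bindings at level 2, and the Python client at level 3. The second graph cannot be levelized because the dependency crosses the language boundary in both directions, which is the kind of structure a physical-design rule for FFIs would presumably flag.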
Major comments (3)
- [abstract / evaluation section] The evaluation only counts how many of the 50 repositories satisfy the new pybind11 rules; it supplies no measurement (e.g., call-graph opacity, cross-language defect density, or maintainability proxies) comparing compliant versus non-compliant repositories. This leaves the central claim—that the extension preserves Lakosian benefits—untested (see abstract and the measurement paragraph).
- [evaluation section] The sample of 50 repositories is described only as “public GitHub repositories that use pybind11”; no selection criteria, stratification by project size or domain, or justification of representativeness is given, so the compliance percentages cannot be interpreted as an indication of current practice.
- [rule-proposal and measurement paragraphs] The manuscript states that explicit rules are proposed and that MLSA was used to check them, yet neither the rule statements themselves nor any implementation or error-analysis details appear in the provided text; without these the measurement step cannot be reproduced or assessed for soundness.
Minor comments (1)
- [conclusion] The abstract claims a “proposed generalization” but the text does not indicate whether this generalization is stated formally or remains informal.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below, clarifying the paper's scope as an initial step toward Lakosian multilingual design and indicating revisions to improve clarity, reproducibility, and acknowledgment of limitations.
Point-by-point responses
- Referee: [abstract / evaluation section] The evaluation only counts how many of the 50 repositories satisfy the new pybind11 rules; it supplies no measurement (e.g., call-graph opacity, cross-language defect density, or maintainability proxies) comparing compliant versus non-compliant repositories. This leaves the central claim—that the extension preserves Lakosian benefits—untested (see abstract and the measurement paragraph).
  Authors: The paper is framed as exploratory ('Towards...') and measures baseline compliance to establish that the proposed rules are applicable to real code; it does not claim or attempt to empirically demonstrate preservation of benefits such as reduced opacity or lower defect rates. Those benefits are hypothesized from the original Lakosian C++ results. We agree the evaluation does not test the benefits and will revise the abstract, introduction, and conclusion to explicitly delimit the scope and flag benefit validation as future work.
  Revision: yes
- Referee: [evaluation section] The sample of 50 repositories is described only as “public GitHub repositories that use pybind11”; no selection criteria, stratification by project size or domain, or justification of representativeness is given, so the compliance percentages cannot be interpreted as an indication of current practice.
  Authors: Repositories were obtained via GitHub search for pybind11 usage followed by successful MLSA analysis; no stratification by size or domain was applied because the aim was an initial feasibility demonstration rather than a representative survey. We accept that this restricts interpretation and will expand the evaluation section with explicit selection criteria, summary statistics on repository sizes, and a limitations paragraph on generalizability.
  Revision: yes
- Referee: [rule-proposal and measurement paragraphs] The manuscript states that explicit rules are proposed and that MLSA was used to check them, yet neither the rule statements themselves nor any implementation or error-analysis details appear in the provided text; without these the measurement step cannot be reproduced or assessed for soundness.
  Authors: The rules appear in the 'Proposed pybind11 Rules' section and MLSA application is summarized in the evaluation, but we agree that fuller detail would aid reproducibility. We will move the complete rule statements into the main text, add a subsection describing the MLSA extensions and checks performed, and include a brief error-analysis or threats-to-validity discussion.
  Revision: yes
- Direct empirical comparison of Lakosian benefits (e.g., call-graph opacity or cross-language defect density) between compliant and non-compliant repositories, as no such metrics were collected in the study.
Circularity Check
Minor self-citation to analysis toolkit; proposed rules and compliance count remain independent of inputs
Full rationale
The paper proposes an extension to Lakosian physical-design rules for the pybind11 FFI and then applies the authors' prior MLSA toolkit to count compliance across an external sample of 50 GitHub repositories. No equations or fitted parameters are present; the rules are introduced as a new extension rather than derived from the sample, and the compliance count is a direct measurement rather than a prediction that reduces to the same data. The self-citation to MLSA is limited to the analysis tool and is not load-bearing for the central proposal or conclusions.
Axiom & Free-Parameter Ledger
Axioms (2)
- Domain assumption: Lakosian physical design rules remain beneficial when extended across language boundaries via an FFI.
- Domain assumption: The MLSA toolkit can accurately detect pybind11 usage and call-graph structure.
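The second assumption concerns detecting pybind11 usage, so a rough idea of what such detection involves may help. The sketch below is an assumption-laden illustration and does not describe how MLSA works. It scans C++ source text for the `PYBIND11_MODULE(name, variable)` macro and for `variable.def("name", ...)` registrations, both of which are part of pybind11's public interface. The C++ snippet, the module name `fastmath`, and the helper `exported_functions` are hypothetical.

```python
import re

# Matches PYBIND11_MODULE(module_name, module_variable)
MODULE_RE = re.compile(r"PYBIND11_MODULE\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)")

# Matches <module_variable>.def("python_name", ...)
DEF_RE_TEMPLATE = r'\b{var}\s*\.\s*def\s*\(\s*"([^"]+)"'


def exported_functions(cpp_source):
    """Map each pybind11 module name to the names registered via .def()."""
    exports = {}
    for match in MODULE_RE.finditer(cpp_source):
        module, var = match.group(1), match.group(2)
        def_re = re.compile(DEF_RE_TEMPLATE.format(var=re.escape(var)))
        # Simplification: searches the whole file, not just the module body.
        exports[module] = def_re.findall(cpp_source)
    return exports


# Hypothetical binding file for illustration only.
CPP = r'''
#include <pybind11/pybind11.h>

int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

PYBIND11_MODULE(fastmath, m) {
    m.def("add", &add, "Add two integers");
    m.def("sub", &sub);
}
'''

print(exported_functions(CPP))
```

A text scan like this misses class bindings, chained `.def` calls, registrations hidden behind helper functions, and anything inside comments or other macros. Those gaps are a reason to treat detection accuracy as an assumption rather than a given.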
Reference graph
Works this paper leans on
- [1] Abrahams, D., and R.W. Grosse-Kunstleve. 2003. Building Hybrid Systems with Boost.Python. 14 5. https://www.boost.org/doc/libs/1_69_0/libs/python/doc/html/article.html
- [2] Ali, K., and Ondrej Lhotak. 2012. “Application-only Call Graph Construction.” ECOOP'12 Proceedings of the 26th European Conf. on Object-Oriented Prog. Beijing
- [3] Bacon, D., and P. Sweeney. 1996. “Fast static analysis of C++ virtual function calls.” 11th ACM SIGPLAN Conf. on OO Prog. Sys., Lang & App
- [4] Barrett, D., A. Kaplan, and J. Wileden. 1996. “Automated support for seamless interoperability in polylingual software systems.” 4th ACM SIGSOFT symposium on Foundations of software engineering. New York
- [5] Bogar, A.M., D. Lyons, and D. Baird. 2018. “Lightweight Call-Graph Construction for Multilingual Software Analysis.” 13th Int. Conf. Soft. Tech. Porto, Portugal
- [6] Bravenboer, M., E. Dolstra, and E. Visser. 2010. “Preventing injection attacks with syntax embeddings.” Sci. Comput. Program. 75 (7): 473-495
- [7] Bronevetsky, G. 2009. “Communication-Sensitive Static Dataflow for Parallel Message Passing Applications.” International Symposium on Code Generation and Optimization. Seattle WA
- [8] Grechanik, M., D. Batory, and D. Perry. 2004. “Design of large-scale polylingual systems.” 26th Int. Conf. on Software Systems. Edinburgh UK
- [9] Grimmer, M., R. Schatz, C. Seaton, T. Wurthinger, and M. Lujan. 2018. “Cross-Language Interoperability in a Multi-Language Runtime.” ACM Trans. on Prog. Languages and Systems (ACM) 40 (2): 8:1-8:43
- [10] Hong, S., et al. 2015. “Mutation-Based Fault Localization for Real-World Multilingual Programs.” 30th IEEE/ACM Int. Conf. on Automated Software Eng
- [11] Lakos, John. 1996. Large-Scale C++ Software Design. Addison-Wesley
- [12] Lee, S., J. Doby, and S. Ryu. 2016. “HybriDroid: static analysis framework for Android hybrid applications.” 31st IEEE/ACM International Conference on Automated Software Engineering. Singapore
- [13] Lyons, D., A. Bogar, and D. Baird. 2017. “Lightweight Multilingual Software Analysis.” 12th Int. Conf. on Software Technologies (ICSoft). Madrid, Spain
- [14] Lyons, D., A.M. Bogar, and D. Baird. 2018. “Lightweight Multilingual Software Analysis.” In Chall. & Opp. in ICT Research Projects, by J. Filipe. SCITEPRESS
- [15] Mayer, P., M. Kirsch, and M-A. Le. 2017. “On multi-language software development, cross-language links and accompanying tools: a survey of professional software developers.” Journal of Software Engineering Research and Development 5 (1)
- [16] Mushtak, Z., and G. Rasool. 2015. “Multilingual source code analysis: State of the art and challenges.” Int. Conf. Open Source Sys. & Tech
- [17] Nielson, F., H.R. Nielson, and C. Hankin. 2005. Principles of Program Analysis. Springer. 2010. Python 2.7 doc. https://docs.python.org/2.7/
- [18] Smirnoff, I. 2017. “Seamless operability between C++11 and Python.” EuroPython Conference. Rimini, Italy
- [19] Strien, D., H. Kratz, and W. Lowe. 2006. “Cross-Language Program Analysis and Refactoring.” 6th Int. Workshop on Source Code Analysis and Manipulation
- [20] Zaigham, M., G. Rasool, and B. Shehzad. 2017. “Multilingual Source Code Analysis: A Systematic Literature Review.” IEEE Access PP (99)