pith. sign in

arxiv: 2606.26386 · v1 · pith:MYZOTUQJnew · submitted 2026-06-24 · 💻 cs.PL

A Typestate Approach to Purpose-aware Programming

Pith reviewed 2026-06-26 00:32 UTC · model grok-4.3

classification 💻 cs.PL
keywords typestatepurpose-aware programmingdata usage complianceobject-oriented languagestype systemssensitive dataPurPL
0
0 comments X

The pith

PurPL is an object-oriented language that uses typestate to represent and enforce purpose sets for sensitive data at compile time.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents PurPL, an object-oriented programming language whose typestate system treats the state of sensitive data as the set of purposes for which it may be used. These purpose sets can grow or shrink during execution as data is processed or repurposed. A formal type system ensures operations respect the current set of allowed purposes. The authors implemented a type checker and tested it on multiple scenarios involving sensitive data. The work addresses the gap in compile-time guarantees for purpose-aware data usage.

Core claim

PurPL features a typestate system in which the state of a sensitive data type is defined as the set of purposes for which the data can be used; this set can transition by adding or removing purposes at runtime, and the type system verifies that all uses of the data comply with its current purpose set.

What carries the argument

Typestate system whose states are purpose sets for sensitive data, with transitions that add or remove purposes during execution.

If this is right

  • Programs handling sensitive data can be statically verified to respect intended purposes.
  • Purpose sets for data can change dynamically through language-level operations.
  • A type checker implementation can validate a range of sensitive-data handling scenarios.
  • Data-usage compliance becomes a compile-time property rather than a runtime check.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same typestate approach could be applied to other static compliance properties such as retention limits or consent tracking.
  • Extending the system to larger codebases would require testing how purpose sets propagate across library boundaries and inheritance.
  • Combining the static checker with lightweight runtime assertions could handle cases where full static tracking is impractical.

Load-bearing premise

That purpose sets represented as typestates can track all relevant data flows and transitions in realistic object-oriented programs without unacceptable restrictions on expressiveness.

What would settle it

A realistic program that correctly restricts data to its intended purposes yet is rejected by the PurPL type checker, or a program that violates purpose restrictions yet type-checks successfully.

Figures

Figures reproduced from arXiv: 2606.26386 by Anitha Gollamudi (1), Joan Montas (1), Matteo Cimini (1) ((1) University of Massachusetts Lowell), Samuel Dodson (1).

Figure 1
Figure 1. Figure 1: Syntax of PurPL 3 Syntax and Semantics This section formalizes PurPL in its version that omits the ability to specify states for purposes. Section 4 adds that ability to the formalism of this section. 3.1 Syntax The syntax of PurPL is in [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Definition of CT(𝐶), CT(𝐶, 𝑓 ), and CT(𝐶,𝑚) actual arguments may change their types after method invocation and our type system needs to keep track of that, we impose that only variables can be passed to a method. This imposes that arguments are first assigned to a variable before being passed to a method, a requirement that can be easily enforced with a compiler pass. Statements include assignments, field… view at source ↗
Figure 3
Figure 3. Figure 3: Term Typing not change their ground types, they may change their set of purposes. The typing rules for integers, booleans, and variables in [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Statement Typing Statement Typing [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Syntax of PurPL with Purpose States being modified, and must type check the body without being modified. This type environment is the output type environment after type checking the whole while state￾ment. This is a conservative approach. For example, the evaluation of a condition may grant purposes 𝑝1, . . ., 𝑝10 to variable 𝑥. In turn, the body of the while-statement may revoke purposes and leave 𝑥 with … view at source ↗
Figure 6
Figure 6. Figure 6: Relevant Typing Rules for Purpose State Most typing rules do not use the purpose environment and simply thread it through their rule [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
read the original abstract

Real-world applications often require verification that sensitive data is being used for their intended purpose. However, existing literature offers limited results regarding compile-time guarantees in this domain. In this paper, we explore the use of typestate to reason about the purpose of data. In typestate, types have a state, which can transition to other states in the style of automata. In our approach, the state of the type of sensitive data is defined as the set of purposes for which the data can be used. This set can grow or shrink at runtime as purposes can be added or removed during execution. In this paper, we have developed PurPL, an object-oriented programming language that features a typestate system that is capable of reasoning about purposes and data usage compliance according to them. We give an overview of PurPL through examples and present a formal type system. We have also implemented PurPL's type checker, and we report on our experiments with type checking various programming scenarios that handle sensitive data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes PurPL, an object-oriented language whose typestate system represents the state of sensitive data as a dynamic set of allowed purposes. Purpose sets can grow or shrink at runtime via explicit operations, and the type system is intended to enforce that data is only used in accordance with its current purpose set. The manuscript presents motivating examples, a formal type system, an implementation of the type checker, and experiments exercising the checker on various sensitive-data scenarios.

Significance. A sound typestate encoding of purpose sets that correctly handles realistic OO aliasing would constitute a useful addition to the literature on compile-time privacy and compliance enforcement. The work already supplies a formal type system and a working type-checker implementation; these are concrete strengths that would allow others to build on or verify the approach if the aliasing and transition rules are made precise.

major comments (2)
  1. [Formal type system] Formal type system (the section presenting the typing rules): the central claim requires that purpose-set transitions remain sound under aliasing. The description states that sets “can grow or shrink at runtime” but supplies no indication of how the rules treat non-unique references (fields, parameters, returns). If the rules do not propagate purpose-set changes across all aliases of an object, the compliance guarantee is unsound; if they implicitly require unique ownership, that restriction must be stated and its impact on expressiveness evaluated.
  2. [Experiments] Experimental evaluation: the manuscript reports experiments with the type checker on “various programming scenarios that handle sensitive data,” yet provides no description of the test cases, whether they exercise aliasing, or any quantitative results. Without such detail it is impossible to assess whether the implementation actually validates the central claim under realistic OO conditions.
minor comments (1)
  1. [Abstract] The abstract and introduction could more explicitly locate the formal rules (e.g., “§3 presents the typing judgment …”) so readers can immediately connect claims to the technical development.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of the formal type system and experimental evaluation. We address each point below and commit to revisions that strengthen the manuscript without altering its core claims.

read point-by-point responses
  1. Referee: [Formal type system] Formal type system (the section presenting the typing rules): the central claim requires that purpose-set transitions remain sound under aliasing. The description states that sets “can grow or shrink at runtime” but supplies no indication of how the rules treat non-unique references (fields, parameters, returns). If the rules do not propagate purpose-set changes across all aliases of an object, the compliance guarantee is unsound; if they implicitly require unique ownership, that restriction must be stated and its impact on expressiveness evaluated.

    Authors: We agree that the formal type system section does not explicitly describe the treatment of aliasing or non-unique references. The manuscript presents a formal type system but leaves the aliasing rules implicit. We will revise this section to make the rules precise, either by extending them to propagate purpose-set changes across aliases or by explicitly stating any ownership restrictions (such as linearity) together with an assessment of their effect on expressiveness in an object-oriented setting. revision: yes

  2. Referee: [Experiments] Experimental evaluation: the manuscript reports experiments with the type checker on “various programming scenarios that handle sensitive data,” yet provides no description of the test cases, whether they exercise aliasing, or any quantitative results. Without such detail it is impossible to assess whether the implementation actually validates the central claim under realistic OO conditions.

    Authors: We concur that the experimental section is underspecified. The reported experiments consist of type-checking runs on hand-written scenarios involving sensitive data, but the manuscript provides neither the test-case descriptions nor quantitative outcomes. We will expand the section to list the scenarios, note which ones involve aliasing through fields or parameters, and report quantitative results such as the number of cases checked and the type-checker verdicts. revision: yes

Circularity Check

0 steps flagged

No circularity: language design with direct type-system definition

full rationale

The paper presents PurPL as a language design with an accompanying formal type system and implementation. No equations, fitted parameters, or predictions appear that reduce to their own inputs by construction. The typestate rules for purpose sets are introduced as the system's definition rather than derived from prior results or self-citations. The central claim is the existence and implementability of the system itself, which does not rely on any load-bearing self-referential step.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, parameters, or background assumptions to audit.

pith-pipeline@v0.9.1-grok · 5718 in / 883 out tokens · 9960 ms · 2026-06-26T00:32:11.205546+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

42 extracted references · 23 canonical work pages

  1. [1]

    The Privacy Act 1988.https://www.oaic.gov.au/privacy/ privacy-legislation/the-privacy-act

    1988. The Privacy Act 1988.https://www.oaic.gov.au/privacy/ privacy-legislation/the-privacy-act

  2. [2]

    The California Consumer Privacy Act of 2018

    2018. The California Consumer Privacy Act of 2018. https://leginfo.legislature.ca.gov/faces/codes_displayT ext. xhtml?division=3.&part=4.&lawCode=CIV&title=1.81.5

  3. [3]

    Tariq Alshugran, Julius Dichter, and Miad Faezipour. 2015. Formally expressing HIPAA privacy policies for web services. In2015 IEEE International Conference on Electro/Informa- tion Technology (EIT). IEEE, 295–299. doi:10.1109/EIT .2015. 7293356

  4. [4]

    Owen Arden, Peter Liu, and Andrew C. Myers. 2012. Flow- Limited Authorization. InIEEE Computer Security Foundations Symposium (CSF). 225–239. doi:10.1109/CSF .2015.42

  5. [5]

    Probabilistic relational reasoning for differential privacy

    Thomas H. Austin and Cormac Flanagan. 2012. Multiple facets for dynamic information flow. InProceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Program- ming Languages(Philadelphia, PA, USA)(POPL ’12). Associa- tion for Computing Machinery, New York, NY , USA, 165–178. doi:10.1145/2103656.2103677

  6. [6]

    BBC. 2019. Google hit with £44m GDPR fine over ads. https://www.bbc.com/news/technology-46944696. Accessed: 2026-04-30

  7. [7]

    BBC. 2021. WhatsApp issued second-largest GDPR fine of C225m. https://www.bbc.com/news/technology-58422465. Ac- cessed: 2026-04-30

  8. [8]

    Eleanor Birrell, Jay Rodolitz, Angel Ding, Jenna Lee, Emily McReynolds, Jevan Hutson, and Ada Lerner. 2024. SoK: Tech- nical Implementation and Human Impact of Internet Privacy Regulations. In2024 IEEE Symposium on Security and Privacy (SP). 673–696. doi:10.1109/SP54263.2024.00206

  9. [9]

    Dino Bollinger, Karel Kubicek, Carlos Cotrini, and David Basin. 2022. Automating Cookie Consent and GDPR Violation Detection. In31st USENIX Security Symposium (USENIX Security 22). USENIX Association, Boston, MA, 2893– 2910.https://www.usenix.org/conference/usenixsecurity22/ presentation/bollinger

  10. [10]

    Piero Bonatti, Sabrina Kirrane, Iliana Petrova, and Luigi Sauro

  11. [11]

    doi:10.1007/ s13218-020-00677-4

    Machine Understandable Policies and GDPR Compliance Checking.KI - Künstliche Intelligenz34 (07 2020). doi:10.1007/ s13218-020-00677-4

  12. [12]

    Omar Chowdhury, Limin Jia, Deepak Garg, and Anupam Datta

  13. [13]

    InComputer Aided Verification, Armin Biere and Roderick Bloem (Eds.)

    Temporal Mode-Checking for Runtime Monitoring of Privacy Policies. InComputer Aided Verification, Armin Biere and Roderick Bloem (Eds.). Springer International Publishing, Cham, 131–149. doi:10.1007/978-3-319-08867-9_9

  14. [14]

    Dorothy E. Denning. 1976. A Lattice Model of Secure Informa- tion Flow.Commun. ACM19, 5 (1976), 236–243. doi:10.1145/ 360051.360056

  15. [15]

    Henry DeYoung, Deepak Garg, Limin Jia, Dilsun Kaynar, and Anupam Datta. 2010. Experiences in the logical specification of the HIPAA and GLBA privacy laws. InProceedings of the 9th Annual ACM Workshop on Privacy in the Electronic Society (Chicago, Illinois, USA)(WPES ’10). Association for Computing Machinery, New York, NY , USA, 73–82. doi:10.1145/1866919. 1866930

  16. [16]

    Lavanya Elluri and Karuna Pande Joshi. 2018. A Knowledge Representation of Cloud Data controls for EU GDPR Compliance. In11th IEEE International Conference on Cloud Computing (CLOUD). doi:10.1109/SERVICES.2018.00036

  17. [17]

    European Parliament. 2016. General Data Protection Regulation. https://eur-lex.europa.eu/eli/reg/2016/679/oj

  18. [18]

    Pietro Ferrara and Fausto Spoto. 2018. Static Analysis for GDPR Compliance. InProceedings of the Second Italian Conference on Cyber Security, Milan, Italy, February 6th - to - 9th, 2018 (CEUR Workshop Proceedings, Vol. 2058), Elena Ferrari, Marco Baldi, and Roberto Baldoni (Eds.). CEUR-WS.org.https://ceur- ws.org/Vol-2058/paper-10.pdf

  19. [19]

    Mafalda Ferreira, Tiago Brito, José Fragoso Santos, and Nuno Santos. 2023. RuleKeeper: GDPR-Aware Personal Data Compli- ance for Web Frameworks. In2023 IEEE Symposium on Security and Privacy (SP). 2817–2834. doi:10.1109/SP46215.2023. 10179395

  20. [20]

    Ronald Garcia, Éric Tanter, Roger Wolff, and Jonathan Aldrich

  21. [21]

    doi:10.1145/2629609

    Foundations of Typestate-Oriented Programming.ACM Transactions on Programming Languages and Systems36, 4, Article 12 (oct 2014), 44 pages. doi:10.1145/2629609

  22. [22]

    François Hublet, David Basin, and Sr¯dan Krsti´c. 2024. Enforcing the GDPR. InComputer Security – ESORICS 2023, Gene Tsudik, Mauro Conti, Kaitai Liang, and Georgios Smaragdakis (Eds.). Springer Nature Switzerland, Cham, 400–422. doi:10.1007/978- 3-031-51476-0_20

  23. [23]

    François Hublet, David Basin, and Sr ¯dan Krsti ´c. 2024. User- Controlled Privacy: Taint, Track, and Control.Proceedings on Privacy Enhancing Technologies (PoPETs)2024, 1 (2024), 597–

  24. [24]

    doi:10.56553/popets-2024-0034

  25. [25]

    Zsolt István, Soujanya Ponnapalli, and Vijay Chidambaram. 2021. Software-defined data protection: low overhead policy compli- ance at the storage layer is within reach!Proc. VLDB Endow.14, 7 (mar 2021), 1167–1174. doi:10.14778/3450980.3450986

  26. [26]

    Farzane Karami, David Basin, and Einar Broch Johnsen. 2022. DPL: A Language for GDPR Enforcement. In2022 IEEE 35th Computer Security Foundations Symposium (CSF). 112–129. doi:10.1109/CSF54842.2022.9919687

  27. [27]

    Myers and Barbara Liskov

    Andrew C. Myers and Barbara Liskov. 1997. A Decentralized Model for Information Flow Control. InProceedings of the 16th ACM Symposium on Operating Systems Principles (SOSP). 129–

  28. [28]

    doi:10.1145/268998.266669

  29. [29]

    Luca Piras, Mohammed Ghazi Al-Obeidallah, Andrea Prai- tano, Aggeliki Tsohou, Haralambos Mouratidis, Beatriz Gallego- Nicasio Crespo, Jean Baptiste Bernard, Marco Fiorani, Em- manouil Magkos, Andrès Castillo Sanz, Michalis Pavlidis, Roberto D’Addario, and Giuseppe Giovanni Zorzino. 2019. DE- FeND Architecture: A Privacy by Design Platform for GDPR Complia...

  30. [30]

    Livio Robaldo, Cesare Bartolini, Monica Palmirani, Arianna Rossi, Michele Martoni, and Gabriele Lenzini. 2020. Formal- izing GDPR Provisions in Reified I/O Logic: The DAPRECO Knowledge Base.Journal of Logic, Language and Information 29, 4 (dec 2020), 401–449. doi:10.1007/s10849-019-09309-z

  31. [31]

    Andrei Sabelfeld and Andrew C. Myers. 2003. Language-Based Information-Flow Security.IEEE Journal on Selected Areas in Communications21, 1 (2003), 5–19. doi:10.1109/JSAC.2002. 806121

  32. [32]

    Mitchell, and David Mazières

    Deian Stefan, Alejandro Russo, John C. Mitchell, and David Mazières. 2011. Flexible Dynamic Information Flow Control in Haskell. InProceedings of the 4th ACM SIGPLAN Symposium on Haskell. 95–106. doi:10.1145/2034675.2034688

  33. [33]

    Robert E. Strom. 1983. Mechanisms for compile-time enforce- ment of security. InProceedings of the 10th ACM SIGACT- SIGPLAN Symposium on Principles of Programming Languages (Austin, Texas)(POPL ’83). Association for Computing Ma- chinery, New York, NY , USA, 276–284. doi:10.1145/567067. 567093 Joan Montas, Samuel Dodson, Anitha Gollamudi, and Matteo Cimini

  34. [34]

    Alain Tchana, Raphael Colin, Adrien Le Berre, Vincent Berger, Benoît Combemale, and Ludovic Pailler. 2023. rgpdOS: GDPR Enforcement By The Operating System. In2023 53rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S). 100–104. doi:10. 1109/DSN-S58398.2023.00032

  35. [35]

    Abhishek Tiwari, Fabian Bendun, and Christian Hammer. 2018. A Formal Logic Framework for the Automation of the Right to Be Forgotten. InSecurity and Privacy in Communication Networks, Raheem Beyah, Bing Chang, Yingjiu Li, and Sencun Zhu (Eds.). Springer International Publishing, Cham, 95–111. doi:10.1007/ 978-3-030-01701-9_6

  36. [36]

    United States Public Law. 1996. Health Insurance Portability and Accountability Act of 1996 (HIPAA) Pub.L. 104–191 and the HIPAA Privacy Rule 2003. 45 CFR Part 160 and Part 16 Subparts A and E

  37. [37]

    Denis V olpano, Geoffrey Smith, and Cynthia Irvine. 1996. A Sound Type System for Secure Flow Analysis.Journal of Com- puter Security4, 3 (1996), 167–187. doi:10.3233/JCS-1996- 42-304

  38. [38]

    Wikipedia. 2024. Facebook–Cambridge Analytica data scandal — Wikipedia, The Free Encyclopedia.http://en.wikipedia.org/ w/index.php?title=Facebook%E2%80%93Cambridge% 20Analytica%20data%20scandal&oldid=1241685028. [Online; accessed 24-September-2024]

  39. [39]

    Roger Wolff, Ronald Garcia, Éric Tanter, and Jonathan Aldrich

  40. [40]

    InECOOP 2011 – Object-Oriented Programming, Mira Mezini (Ed.)

    Gradual Typestate. InECOOP 2011 – Object-Oriented Programming, Mira Mezini (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 459–483. doi:10.1007/978-3-642-22655- 7_22

  41. [41]

    Austin, Armando Solar- Lezama, Cormac Flanagan, and Stephen Chong

    Jean Yang, Travis Hance, Thomas H. Austin, Armando Solar- Lezama, Cormac Flanagan, and Stephen Chong. 2016. Precise, dynamic information flow for database-backed applications. In Proceedings of the 37th ACM SIGPLAN Conference on Program- ming Language Design and Implementation(Santa Barbara, CA, USA)(PLDI ’16). Association for Computing Machinery, New Yor...

  42. [42]

    Jean Yang, Kuat Yessenov, and Armando Solar-Lezama. 2012. A language for automatically enforcing privacy policies. InProceed- ings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages(Philadelphia, PA, USA) (POPL ’12). Association for Computing Machinery, New York, NY , USA, 85–96. doi:10.1145/2103656.2103669