arxiv: 2604.16361 · v1 · pith:EMSZVE4Enew · submitted 2026-03-19 · 💻 cs.SE

Modelling GDPR-based Privacy Requirements with Software Engineering Diagrams: A Systematic Literature Review

Pith reviewed 2026-05-15 08:23 UTC · model grok-4.3

classification 💻 cs.SE
keywords GDPR · privacy requirements · software engineering diagrams · systematic literature review · requirements modeling · compliance checking · software design

The pith

A review of 18 studies shows current software engineering diagrams capture only partial GDPR privacy requirements and lack integration across the development lifecycle.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts a systematic literature review to examine how diagrams from software engineering are used to model and integrate GDPR-based privacy requirements into system designs. It selects and analyzes 18 primary studies published from 2017 to 2025, grouping them by the types of diagrams employed and the specific GDPR principles or rights they address. The central observation is that existing approaches handle isolated aspects of privacy but fall short in connecting diagrams to one another and in maintaining traceability from initial design through later stages. This matters because incomplete modeling can leave software systems vulnerable to privacy compliance failures. The review concludes by pointing to specific gaps that future work must close.

Core claim

The review establishes that software engineering diagrams have been applied to represent certain GDPR elements such as consent and data minimization, yet the 18 studies collectively demonstrate insufficient support for combining multiple diagram types, tracing requirements across the full software lifecycle, providing dedicated tool support, and enabling automated compliance verification.

What carries the argument

Categorization of diagram types used in the studies together with their mapping onto specific GDPR principles and rights.
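
A minimal sketch of how such a categorization could be cross-tabulated. The study IDs and diagram types below are hypothetical placeholders, not data from the paper; only consent and data minimization are GDPR elements named in the review. The point is to show how a diagram-type by GDPR-element matrix exposes uncovered combinations.

```python
from collections import defaultdict

# Hypothetical placeholder records: the IDs and diagram types are illustrative
# assumptions and are NOT taken from the 18 reviewed studies.
studies = [
    {"id": "S-A", "diagrams": ["data flow diagram"], "gdpr": ["consent"]},
    {"id": "S-B", "diagrams": ["UML class diagram", "data flow diagram"],
     "gdpr": ["data minimization"]},
    {"id": "S-C", "diagrams": ["UML class diagram"], "gdpr": ["data minimization"]},
]


def cross_tabulate(records):
    """Map each (diagram type, GDPR element) pair to the sorted study IDs covering it."""
    table = defaultdict(set)
    for rec in records:
        for diagram in rec["diagrams"]:
            for element in rec["gdpr"]:
                table[(diagram, element)].add(rec["id"])
    return {pair: sorted(ids) for pair, ids in table.items()}


def uncovered_pairs(records):
    """List (diagram type, GDPR element) pairs that no study in the sample addresses."""
    diagrams = {d for rec in records for d in rec["diagrams"]}
    elements = {g for rec in records for g in rec["gdpr"]}
    covered = cross_tabulate(records)
    return sorted((d, g) for d in diagrams for g in elements if (d, g) not in covered)


matrix = cross_tabulate(studies)
gaps = uncovered_pairs(studies)
```

With these placeholder records, `gaps` contains the single pair of UML class diagram and consent, which is the kind of empty cell a gap analysis would report.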

If this is right

  • Improved inter-diagram integration would allow privacy requirements to be consistently represented across different views of the same system.
  • Full lifecycle traceability mechanisms would make it possible to check whether GDPR requirements remain satisfied from design through deployment and maintenance.
  • Tool support would reduce the manual effort required to create and maintain privacy-aware diagrams.
  • Automated compliance checking would help detect violations earlier and more reliably than manual review alone.
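
One way to picture lifecycle traceability is as a check that every privacy requirement is linked to at least one artefact at each stage. The sketch below is illustrative only: the stage list, the `PrivacyRequirement` class, and all artefact names are assumptions made for this example, not a mechanism described in the paper.

```python
from dataclasses import dataclass, field

# Assumed lifecycle stages for illustration; the paper does not prescribe this list.
STAGES = ("design", "implementation", "deployment", "maintenance")


@dataclass
class PrivacyRequirement:
    req_id: str
    description: str
    # Maps a stage name to the artefacts (diagram elements, modules, jobs)
    # that realise the requirement at that stage.
    traces: dict = field(default_factory=dict)


def missing_stages(req, stages=STAGES):
    """Return the stages at which the requirement has no linked artefact."""
    return [stage for stage in stages if not req.traces.get(stage)]


def traceability_report(requirements):
    """Map each incompletely traced requirement ID to its missing stages."""
    report = {}
    for req in requirements:
        gaps = missing_stages(req)
        if gaps:
            report[req.req_id] = gaps
    return report


# Hypothetical requirements and artefact names.
requirements = [
    PrivacyRequirement("R1", "Record consent before processing", {
        "design": ["DFD:consent-store"],
        "implementation": ["consent_service"],
        "deployment": ["consent-db"],
        "maintenance": ["consent-audit-job"],
    }),
    PrivacyRequirement("R2", "Collect only necessary fields", {
        "design": ["class:UserProfile"],
    }),
]

report = traceability_report(requirements)
```

Here `R1` is traced at every stage and is omitted from the report, while `R2` is flagged as untraced after design. A real tool would derive the links from model repositories rather than hand-written dictionaries.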

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Without integration, designers may model privacy in one diagram while overlooking conflicting requirements shown in another.
  • The identified gaps suggest that many current software projects risk incomplete GDPR compliance despite using diagrams.
  • Extending the review with new studies after 2025 could test whether the reported needs have begun to be met.

Load-bearing premise

That the 18 primary studies identified through the search protocol give a complete and unbiased picture of all relevant research in this area.

What would settle it

Discovery of additional studies from 2017 to 2025 that were missed by the protocol and that already demonstrate working inter-diagram integration plus automated GDPR compliance checking.

Figures

Figures reproduced from arXiv: 2604.16361 by Anna Philippou, Evangelia Vanezi, Georgia M. Kapitsaki.

Figure 1: PRISMA-style flow diagram illustrating the study …
Figure 2: Temporal distribution of the selected studies by …

Original abstract

The application of the General Data Protection Regulation (GDPR) has significantly affected privacy requirements elicitation, modelling, and verification in Software Engineering (SE). One of the affected areas is requirements visualisation through modelling diagrams, which plays a crucial role in ensuring privacy compliance, as functional system requirements should be integrated with GDPR-based privacy requirements. We present a systematic literature review on how SE diagrams have been employed to capture and integrate GDPR-based privacy requirements into software system design. The study aims to identify the existing research landscape, existing gaps, and directions for future work. Following a rigorous search protocol and addressing two research questions, 18 primary studies published between 2017 and 2025 were selected, analysed, and categorised based on (i) the diagram types used, and (ii) the GDPR principles or rights addressed. The findings highlight the need for inter-diagram integration, full lifecycle traceability mechanisms, tool support, and automated compliance checking.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents a systematic literature review on the application of software engineering diagrams to capture and integrate GDPR-based privacy requirements into software system design. It follows a search protocol to select 18 primary studies (2017–2025), categorizes them by diagram types used and GDPR principles/rights addressed, and identifies gaps including insufficient inter-diagram integration, lack of full lifecycle traceability, limited tool support, and absence of automated compliance checking.

Significance. If the 18-study sample is representative, the review offers a useful synthesis of current practices in privacy-aware requirements modeling. It maps existing diagram usage against GDPR elements and points to concrete future directions (integration mechanisms, traceability, tooling, and automation) that could advance compliance support in software engineering.

major comments (1)
  1. [Abstract / Methods] Abstract and Methods: The manuscript asserts a 'rigorous search protocol' yet provides no explicit search strings, list of databases queried, inclusion/exclusion criteria, or quantitative assessment of coverage (e.g., recall estimate or sensitivity analysis). Because the central gap claims rest on the completeness of the 18-study sample, the absence of these details prevents verification that the reported absences of integration and traceability are genuine rather than artifacts of incomplete retrieval.
minor comments (1)
  1. [Abstract] Abstract: The date range '2017 and 2025' should state the precise search cutoff (e.g., papers published through December 2024 or early 2025) to allow readers to assess currency.
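
The recall estimate mentioned in the major comment can be illustrated with a short sketch. It assumes a quasi-gold standard, meaning a set of papers already known to be relevant (for example from manual venue screening), and measures the fraction of them that the search strings retrieve. All paper IDs are hypothetical; the manuscript reports no such figures.

```python
def search_recall(retrieved_ids, gold_standard_ids):
    """Fraction of known-relevant papers that the search actually retrieved."""
    gold = set(gold_standard_ids)
    if not gold:
        raise ValueError("gold standard must contain at least one paper")
    return len(gold & set(retrieved_ids)) / len(gold)


# Hypothetical IDs: papers returned by the search strings versus
# papers known in advance to be relevant.
retrieved = {"P1", "P2", "P3", "P5"}
gold_standard = {"P1", "P2", "P4", "P5"}

recall = search_recall(retrieved, gold_standard)  # 3 of 4 known papers found
```

A low value would suggest that reported gaps might partly reflect missed studies rather than missing research.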

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive review. We agree that greater methodological transparency is essential for a systematic literature review and will revise the manuscript to address this concern directly.

Point-by-point responses
  1. Referee: [Abstract / Methods] Abstract and Methods: The manuscript asserts a 'rigorous search protocol' yet provides no explicit search strings, list of databases queried, inclusion/exclusion criteria, or quantitative assessment of coverage (e.g., recall estimate or sensitivity analysis). Because the central gap claims rest on the completeness of the 18-study sample, the absence of these details prevents verification that the reported absences of integration and traceability are genuine rather than artifacts of incomplete retrieval.

    Authors: We accept this observation. The current manuscript describes the protocol at a high level but does not reproduce the concrete search strings, the full list of databases, the precise inclusion/exclusion criteria, or any coverage metrics. In the revised version we will add a dedicated subsection (or expanded appendix) that reports: (1) the exact Boolean search strings applied to each database, (2) the complete list of databases and digital libraries queried, (3) the full inclusion and exclusion criteria with justifications, and (4) a PRISMA-style flow diagram together with the number of papers retrieved at each stage. We will also include a brief discussion of search limitations and any sensitivity checks performed. These additions will allow readers to assess the completeness of the 18-study sample and the validity of the identified gaps. revision: yes

Circularity Check

0 steps flagged

No circularity: SLR synthesizes external studies without self-referential reduction

This is a systematic literature review that applies a search protocol to identify 18 primary studies (2017–2025), then categorizes them by diagram types and GDPR elements to identify gaps. It contains no equations, parameters, predictions, or derivations. Its claims rest on direct analysis of cited external literature. No self-citation is load-bearing for the central synthesis, and the protocol is described independently of its outputs. Because no result is shown to reduce to its own inputs by construction, no significant circularity is found (score 0). The representativeness of the sample is an external validity concern, not an internal circularity issue.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

This is a literature review paper that does not introduce free parameters, new axioms beyond standard review methodology, or invented entities. It relies on existing published studies and conventional SLR practices.

axioms (1)
  • domain assumption: A rigorous search protocol was followed to identify and select primary studies.
    Stated in the abstract; specific protocol details such as search strings and databases are not provided here.

pith-pipeline@v0.9.0 · 5464 in / 1242 out tokens · 66028 ms · 2026-05-15T08:23:26.068209+00:00 · methodology
