MuSimA: A Tool with Multi-modal Input for Generating Bespoke ABAC Datasets
Pith reviewed 2026-05-10 16:19 UTC · model grok-4.3
The pith
MuSimA generates synthetic ABAC datasets matching user-specified attribute value distributions from JSON files or hand-drawn sketches.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MuSimA is a web-based tool that generates bespoke ABAC datasets with user-specified probability distributions of attribute values. Specifications are accepted either as structured JSON or as minimal JSON combined with hand-drawn distribution sketches, from which a Large Language Model extracts the parameters. The resulting synthetic data can be generated at varying scales and downloaded for use in testing ABAC systems.
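To make the core claim concrete, here is a minimal sketch of what generating attribute values from a structured specification could look like. The JSON layout, the field names (`num_users`, `attributes`, `values`, `probs`) and the function `generate_users` are assumptions made for illustration; the source does not give MuSimA's actual schema or implementation.

```python
import json
import random

# Hypothetical specification format; MuSimA's real JSON schema is not given in the source.
SPEC_JSON = """
{
  "num_users": 1000,
  "attributes": {
    "department": {"values": ["eng", "sales", "hr"], "probs": [0.6, 0.3, 0.1]},
    "clearance": {"values": ["low", "high"], "probs": [0.8, 0.2]}
  }
}
"""

def generate_users(spec, seed=0):
    """Draw one value per attribute for each user, following the given probabilities."""
    rng = random.Random(seed)  # seeded so repeated runs give the same dataset
    users = []
    for _ in range(spec["num_users"]):
        user = {
            name: rng.choices(dist["values"], weights=dist["probs"], k=1)[0]
            for name, dist in spec["attributes"].items()
        }
        users.append(user)
    return users

spec = json.loads(SPEC_JSON)
users = generate_users(spec)
```

Changing `num_users` is the simplest way to vary scale in this sketch; a real tool would also need object attributes, environment attributes and policy rules, which are left out here.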
What carries the argument
Multi-modal input handler that accepts JSON specifications or hand-drawn sketches and uses an LLM to convert sketches into distribution parameters for ABAC dataset generation.
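Whatever model reads the sketch, its output has to be usable as a probability distribution before any data is generated. The following sketch shows one plausible sanity check, assuming the extracted parameters arrive as a mapping from attribute value to estimated probability. The function name, the input shape and the tolerance are hypothetical and not taken from the paper.

```python
import math

def normalize_extracted_probs(raw, tol=0.05):
    """Validate probabilities read off a sketch and rescale them to sum to 1.

    `raw` maps attribute values to estimated probabilities (assumed shape).
    Raises ValueError if any entry is negative or non-finite, or if the total
    is further than `tol` from 1, which suggests the sketch was misread.
    """
    if not raw:
        raise ValueError("no probabilities extracted")
    for value, p in raw.items():
        if not math.isfinite(p) or p < 0:
            raise ValueError(f"invalid probability for {value!r}: {p}")
    total = sum(raw.values())
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total:.3f}, expected about 1")
    return {value: p / total for value, p in raw.items()}

# Estimates that are slightly off, as values read from a drawing might be.
cleaned = normalize_extracted_probs({"eng": 0.58, "sales": 0.31, "hr": 0.12})
```

Rejecting totals far from 1 rather than silently rescaling them is a design choice: a large discrepancy more likely signals a misread sketch than rounding noise.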
If this is right
- ABAC researchers can produce datasets whose attribute statistics follow chosen probability distributions, up to sampling variation.
- Scalability experiments become feasible by generating data at multiple sizes and policy complexities.
- The tool supports both precise file-based input and intuitive visual specification of distributions.
- The generated data is intended to be downloaded and used directly for ABAC method evaluation.
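One way to check the first of these consequences is to compare the empirical frequencies in a generated column against the specified distribution. The sketch below uses total variation distance for a single categorical attribute; the target distribution and sample size are illustrative, and the paper does not say which fidelity measure, if any, MuSimA uses.

```python
import random
from collections import Counter

def total_variation_distance(samples, target):
    """Half the L1 distance between empirical frequencies and target probabilities."""
    counts = Counter(samples)
    n = len(samples)
    support = set(target) | set(counts)  # include values seen but not specified
    return 0.5 * sum(abs(counts.get(v, 0) / n - target.get(v, 0.0)) for v in support)

target = {"eng": 0.6, "sales": 0.3, "hr": 0.1}  # illustrative target distribution
rng = random.Random(42)
samples = rng.choices(list(target), weights=list(target.values()), k=10_000)
tvd = total_variation_distance(samples, target)
```

A distance of 0 means the empirical frequencies equal the target, and 1 means the two share no support. For sampled data the value is small but generally nonzero, and it shrinks as the dataset grows.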
Where Pith is reading between the lines
- Widespread use of such a generator could produce more consistent benchmark datasets across ABAC papers.
- The sketch-to-parameter step could be extended to support additional visual cues like histograms or curves.
- Direct linkage of the output data to policy simulators would allow immediate testing of enforcement correctness.
Load-bearing premise
The LLM reliably converts hand-drawn sketches into accurate distribution parameters and the resulting synthetic data behaves like real ABAC data for system testing.
What would settle it
A side-by-side run of the same ABAC enforcement algorithm on MuSimA-generated data and on a real collected ABAC dataset, where the two produce substantially different performance or policy outcomes.
Original abstract
Recent advances in research on Attribute-based Access Control (ABAC) have led to the development of several ingenious methods for representing and enforcing organizational security policies. However, so far little effort has been spent towards building a tool for generating large-scale synthetic datasets that can be used to test the developed ABAC systems. In this paper, we address this shortcoming by building MuSimA - a web-based tool for generating ABAC datasets with user-specified probability distributions of attribute values. It supports multi-modal input, i.e., users can provide specifications either as a structured JSON file or as a combination of a minimal JSON along with hand-drawn distribution sketches. In the latter case, a Large Language Model is used to automatically extract appropriate distribution parameters from the sketches. The generated synthetic ABAC data matching the input specifications can be downloaded by the user. For studying scalability of algorithms and methods related to ABAC, data can be generated for varying sizes and complexities. We make MuSimA freely available for use by the research community.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents MuSimA, a web-based tool for generating synthetic ABAC datasets with user-specified probability distributions over attribute values. It supports multi-modal input via either a full structured JSON specification or a minimal JSON paired with hand-drawn distribution sketches, where an LLM extracts the distribution parameters. The tool allows generation and download of datasets at varying sizes and complexities for evaluating ABAC systems and is made freely available.
Significance. If the LLM reliably interprets sketches and the output data proves realistic and useful, MuSimA could help address the scarcity of customizable ABAC datasets for research. However, the manuscript supplies no implementation details, accuracy metrics, error analysis, statistical fidelity checks, or user studies, so the practical significance cannot be assessed from the provided description alone.
major comments (2)
- [Abstract] The central claim that MuSimA 'addresses this shortcoming' by supporting multi-modal input with LLM-based sketch interpretation is load-bearing for the paper's contribution, yet the abstract (and manuscript) provides no accuracy metrics, example sketch-to-parameter mappings, error rates, or validation experiments for the LLM component.
- [Abstract] The statement that 'data can be generated for varying sizes and complexities' to study ABAC scalability is presented without any reported performance measurements, generation times, dataset examples, or comparisons to prior synthetic ABAC data generators.
minor comments (1)
- [Abstract] The phrase 'minimal JSON along with hand-drawn distribution sketches' would benefit from a brief clarification of the exact interface and how the two inputs are merged before LLM processing.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript describing MuSimA. We address each major comment point by point below, indicating planned revisions where we can strengthen the paper without altering its core contribution as a tool description.
- Referee: [Abstract] The central claim that MuSimA 'addresses this shortcoming' by supporting multi-modal input with LLM-based sketch interpretation is load-bearing for the paper's contribution, yet the abstract (and manuscript) provides no accuracy metrics, example sketch-to-parameter mappings, error rates, or validation experiments for the LLM component.
  Authors: We acknowledge that the LLM-based sketch interpretation is a highlighted feature and that supporting evidence would strengthen the claim. The manuscript is structured as a tool paper focused on design, architecture, and user workflow rather than an empirical evaluation of the LLM. No accuracy metrics or validation experiments were conducted. In revision we will add concrete examples of sketch-to-parameter mappings (with before/after illustrations) and a limitations paragraph discussing LLM reliability. This constitutes a partial revision; full error analysis and user studies remain outside the current scope but can be flagged for future work.
  revision: partial
- Referee: [Abstract] The statement that 'data can be generated for varying sizes and complexities' to study ABAC scalability is presented without any reported performance measurements, generation times, dataset examples, or comparisons to prior synthetic ABAC data generators.
  Authors: The statement reflects the tool's parametric design, which permits users to control dataset size and attribute complexity via the input specification. Specific benchmarks were not reported because the primary novelty lies in the multi-modal input mechanism. We will revise the manuscript to include sample generation times for datasets of different sizes, one or two downloadable example datasets, and a brief reference to prior ABAC generators for context. This addresses the request directly.
  revision: yes
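The generation-time measurements promised in this response could be gathered with a harness along the following lines. This is only a sketch: `generate_column` is a stand-in generator written for this example, not MuSimA's code, and the sizes are arbitrary.

```python
import random
import time

def generate_column(n, values, probs, seed=0):
    """Stand-in generator: draw n attribute values from a categorical distribution."""
    rng = random.Random(seed)
    return rng.choices(values, weights=probs, k=n)

def time_generation(sizes):
    """Return (size, seconds) pairs for generating one attribute column per size."""
    results = []
    for n in sizes:
        start = time.perf_counter()
        generate_column(n, ["eng", "sales", "hr"], [0.6, 0.3, 0.1])
        results.append((n, time.perf_counter() - start))
    return results

timings = time_generation([1_000, 10_000, 100_000])
```

Reported numbers would be more convincing if each size were run several times and the median taken, since single wall-clock measurements are noisy.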
Circularity Check
No circularity: descriptive tool paper with no derivations or fitted predictions
The manuscript presents MuSimA as a web tool for ABAC dataset generation from user-specified distributions (JSON or LLM-processed sketches). No equations, parameter fitting, predictions, or derivation chains appear in the abstract or described content. The contribution is a feature description and availability statement, not a claimed first-principles result or statistical model. Self-citations are absent from the provided text, and the LLM component is presented as an implementation detail without any reduction to fitted inputs or self-referential definitions. This is a standard non-circular engineering/tool paper.