ACAT: A Collaborative Platform for Efficient Aspect-Based Sentiment Dataset Annotation

Ana-Maria Luisa Mocanu; Ciprian-Octavian Truica; Elena-Simona Apostol

arxiv: 2606.04189 · v1 · pith:WSAK6JE3new · submitted 2026-06-02 · 💻 cs.CL

Pith reviewed 2026-06-28 09:55 UTC · model grok-4.3

classification 💻 cs.CL

keywords Aspect-Based Sentiment Analysis · Collaborative Annotation · Inter-Annotator Agreement · Dataset Annotation · ABSA Workflows · ETL Pipeline · Sentiment Analysis

The pith

ACAT automates alignment of multi-annotator ABSA data and computes IAA metrics at export to produce training-ready datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ACAT as a web-based platform that supports four specific ABSA annotation workflows and includes an automated ETL pipeline for handling collaborative input. Existing tools require researchers to manually consolidate flat-file outputs, reconstruct structures, and run custom scripts for agreement metrics. ACAT aims to replace those steps by aligning annotations and calculating IAA directly during export. A test on 1,002 restaurant reviews with two annotators yielded a median time of 31.58 seconds per item and IAA values between 0.78 and 0.86.

Core claim

ACAT natively supports four ABSA workflows: Aspect-Category Sentiment Analysis, Clause-Level Segmentation, Aspect-Term Sentiment Analysis with character-level position tracking, and Aspect Sentiment Triplet Extraction with dual span offset preservation. Its core contribution is an automated Extract, Transform, Load (ETL) pipeline that aligns collaborative annotations and computes Inter-Annotator Agreement (IAA) metrics directly at export, yielding training-ready datasets.

What carries the argument

The automated Extract, Transform, Load (ETL) pipeline that aligns annotations from multiple users and computes IAA metrics directly at export time.
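The paper does not describe how the alignment step works, so the following is only a minimal sketch of what "aligning annotations from multiple users" could mean for the category-level task. It assumes each annotator's output is a mapping from review ID to a set of (category, sentiment) pairs; the function name `align_annotations`, the record layout, and the sample data are all hypothetical and are not taken from ACAT.

```python
from typing import Dict, List, Set, Tuple

Label = Tuple[str, str]  # (aspect category, sentiment); hypothetical layout


def align_annotations(
    a: Dict[str, Set[Label]], b: Dict[str, Set[Label]]
) -> List[dict]:
    """Join two annotators' labels on review ID and split them into
    agreed labels and labels only one annotator assigned."""
    rows = []
    for review_id in sorted(set(a) | set(b)):
        labels_a = a.get(review_id, set())
        labels_b = b.get(review_id, set())
        rows.append(
            {
                "review_id": review_id,
                "agreed": sorted(labels_a & labels_b),
                "only_a": sorted(labels_a - labels_b),
                "only_b": sorted(labels_b - labels_a),
                # False when one annotator never saw or skipped the review
                "both_annotated": review_id in a and review_id in b,
            }
        )
    return rows


# Invented sample data, for illustration only.
annotator_a = {
    "r1": {("food", "positive"), ("service", "negative")},
    "r2": {("price", "neutral")},
}
annotator_b = {
    "r1": {("food", "positive"), ("service", "neutral")},
    "r3": {("ambience", "positive")},
}
aligned = align_annotations(annotator_a, annotator_b)
```

Even in this toy form the sketch surfaces decisions the paper leaves open, such as whether reviews seen by only one annotator count toward agreement.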

If this is right

Researchers receive training-ready datasets without writing custom consolidation scripts.
The platform supports four distinct ABSA workflows with preserved relational structures and position data.
Median annotation time reaches 31.58 seconds per review in the reported validation.
Raw IAA scores between 0.78 and 0.86 are obtained across tasks with annotators of differing expertise.
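The abstract reports "raw" IAA without giving a formula. Assuming that means simple percent agreement (an assumption, not something the paper confirms), the sketch below contrasts it with Cohen's kappa, which discounts agreement expected by chance. The function names and the toy labels are illustrative only.

```python
from collections import Counter
from typing import Sequence


def raw_agreement(x: Sequence[str], y: Sequence[str]) -> float:
    """Fraction of items on which both annotators chose the same label."""
    if len(x) != len(y) or not x:
        raise ValueError("label sequences must be non-empty and equal length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


def cohen_kappa(x: Sequence[str], y: Sequence[str]) -> float:
    """Cohen's kappa for two annotators: (p_o - p_e) / (1 - p_e)."""
    p_o = raw_agreement(x, y)
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    # Chance agreement from each annotator's marginal label frequencies.
    p_e = sum(count_x[l] * count_y[l] for l in set(count_x) | set(count_y)) / (n * n)
    if p_e == 1.0:
        return 1.0  # degenerate case: both used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Invented toy labels for ten items.
ann_1 = ["pos", "pos", "pos", "neg", "neg", "neu", "pos", "neg", "pos", "pos"]
ann_2 = ["pos", "pos", "neg", "neg", "neg", "neu", "pos", "pos", "pos", "pos"]
raw = raw_agreement(ann_1, ann_2)
kappa = cohen_kappa(ann_1, ann_2)
```

On this toy data raw agreement is 0.80 while kappa is about 0.63, which is why it matters which of the two a reported figure refers to.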

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The ETL approach could be adapted to reduce manual work in dataset creation for other NLP annotation tasks.
Built-in IAA reporting at export may increase consistency in how agreement is documented for published datasets.
Workflow-specific span tracking suggests the platform could support extensions to related extraction problems like opinion triplets.

Load-bearing premise

The automated ETL pipeline correctly and completely aligns multi-annotator data and computes accurate IAA values without manual verification or adjustments.

What would settle it

A test export of annotations from multiple users that produces misaligned data structures or incorrect IAA scores requiring manual fixes.
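One simple form such a test could take is a consistency check on exported spans: every stored surface string should equal the slice of the review text at its stored offsets. The sketch below assumes start-inclusive, end-exclusive character offsets and a hypothetical `Span` record; ACAT's real export schema may differ.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int  # inclusive character offset (assumed convention)
    end: int    # exclusive character offset (assumed convention)
    text: str   # surface string stored with the annotation


def find_misaligned(review: str, spans: List[Span]) -> List[Span]:
    """Return spans whose stored text does not match review[start:end]."""
    bad = []
    for s in spans:
        in_bounds = 0 <= s.start < s.end <= len(review)
        if not in_bounds or review[s.start:s.end] != s.text:
            bad.append(s)
    return bad


review = "The pasta was great but the service was slow."
exported = [
    Span(4, 9, "pasta"),
    Span(28, 35, "service"),
    Span(40, 45, "slow"),  # deliberately wrong: the slice includes the period
]
problems = find_misaligned(review, exported)
```

A non-empty `problems` list on a real export would be the kind of evidence that the pipeline needs manual fixes.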

Figures

Figures reproduced from arXiv: 2606.04189 by Ana-Maria Luisa Mocanu, Ciprian-Octavian Truica, Elena-Simona Apostol.

**Figure 1.** ACAT export architecture. Linguistic variables (a, s, t, o) are composed into four tasks of increasing granularity (top), with their CSV, JSON, and XML serialisations shown below, including character-level offsets and the implicit flag.
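The figure itself is not reproduced here, so as a rough illustration of what a triplet-level JSON record with dual span offsets and an implicit flag might look like, here is a hedged sketch. Every field name (`term_start`, `opinion_end`, `implicit`, and so on) is hypothetical, and the record layout is an assumption rather than ACAT's actual export format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Triplet:
    # Hypothetical fields; offsets are start-inclusive, end-exclusive.
    term: Optional[str]        # aspect term text, None when implicit
    term_start: Optional[int]
    term_end: Optional[int]
    opinion: str               # opinion expression text
    opinion_start: int
    opinion_end: int
    sentiment: str
    implicit: bool = False     # True when no aspect term appears in the text


def to_json(review_id: str, text: str, triplets: List[Triplet]) -> str:
    """Serialise one review and its triplets as a JSON string."""
    record = {
        "id": review_id,
        "text": text,
        "triplets": [asdict(t) for t in triplets],
    }
    return json.dumps(record, ensure_ascii=False, indent=2)


text = "Great pasta, but overpriced."
triplets = [
    Triplet("pasta", 6, 11, "Great", 0, 5, "positive"),
    # No explicit aspect term for "overpriced", so it is flagged implicit.
    Triplet(None, None, None, "overpriced", 17, 27, "negative", implicit=True),
]
payload = to_json("r1", text, triplets)
```

Keeping both the term span and the opinion span in one record is what allows the relation between them to survive export instead of being flattened.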

Original abstract

Aspect-Based Sentiment Analysis (ABSA) requires high-quality datasets to train reliable models. However, existing annotation tools treat output as flat files, leaving researchers to manually consolidate multi-annotator data, reconstruct relational structures, and compute reliability metrics through custom scripts. This paper introduces ACAT (Aspect-based sentiment analysis Collaborative Annotation Tool), a web-based platform natively supporting four ABSA workflows: (1) Aspect-Category Sentiment Analysis, (2) Clause-Level Segmentation, (3) Aspect-Term Sentiment Analysis with character-level position tracking, and (4) Aspect Sentiment Triplet Extraction with dual span offset preservation. Its core contribution is an automated Extract, Transform, Load (ETL) pipeline that aligns collaborative annotations and computes Inter-Annotator Agreement (IAA) metrics directly at export, yielding training-ready datasets. In a preliminary validation on 1,002 restaurant reviews with two annotators of differing expertise, ACAT achieves a median annotation time of 31.58 seconds and a raw IAA ranging from 0.78 to 0.86 across all tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ACAT introduces a web tool tailored to four ABSA workflows with an automated ETL claim, but supplies no details on alignment logic or IAA computation.

The paper's main offering is a web platform that supports aspect-category, clause-level, aspect-term with spans, and triplet extraction workflows, plus an ETL step that supposedly aligns annotations from multiple users and spits out IAA scores and ready datasets at export.

That combination of ABSA-specific features is new enough to be worth noting; most general annotation tools do not preserve dual offsets or handle the relational structure out of the box. The reported median time of 31.58 seconds per review and IAA in the 0.78-0.86 range on 1,002 restaurant reviews with two annotators are at least concrete data points.

The soft spot is exactly what the stress test flagged: the core claim is the automated pipeline, yet the text gives no matching rules, no handling of partial overlaps, no pseudocode, and no error analysis for span-based or triplet cases. Without that, the IAA numbers cannot be evaluated. The validation is also small and lacks any baseline comparison or statistical detail.

This is a practical tool paper aimed at groups that annotate ABSA data regularly. A reader building a dataset might find the interface useful, but the work does not supply enough technical substance for a methods paper. I would not send it to peer review in its current form; the central contribution needs the missing implementation description before it can be assessed.

Referee Report

1 major / 0 minor

Summary. The paper introduces ACAT, a web-based collaborative annotation platform for Aspect-Based Sentiment Analysis (ABSA) that natively supports four workflows: Aspect-Category Sentiment Analysis, Clause-Level Segmentation, Aspect-Term Sentiment Analysis with character-level position tracking, and Aspect Sentiment Triplet Extraction with dual span offset preservation. Its central claim is that an automated Extract, Transform, Load (ETL) pipeline aligns multi-annotator data (including spans and offsets) and computes Inter-Annotator Agreement (IAA) metrics directly at export to produce training-ready datasets. A preliminary validation on 1,002 restaurant reviews with two annotators reports a median annotation time of 31.58 seconds and raw IAA ranging from 0.78 to 0.86 across tasks.

Significance. If the ETL pipeline is correctly implemented for span-based and relational ABSA tasks, ACAT could reduce manual post-processing effort for multi-annotator datasets and provide reliable IAA metrics out of the box, addressing a practical gap in ABSA data preparation. The reported annotation times and IAA values suggest efficiency gains, but the absence of implementation details prevents determining whether these results are reproducible or generalizable.

major comments (1)

[ETL pipeline description (Methods/System section)] The manuscript's core contribution—the automated ETL pipeline that aligns collaborative annotations (including character-level spans and dual offsets) and computes IAA metrics—is stated without any algorithm, pseudocode, matching rules for partial overlaps, handling of the four workflows, or error analysis. This directly undermines evaluation of the reported IAA values (0.78–0.86) and the claim that the pipeline yields training-ready datasets without additional verification.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. The point regarding the ETL pipeline description is well-taken and we address it directly below.

Referee: [ETL pipeline description (Methods/System section)] The manuscript's core contribution—the automated ETL pipeline that aligns collaborative annotations (including character-level spans and dual offsets) and computes IAA metrics—is stated without any algorithm, pseudocode, matching rules for partial overlaps, handling of the four workflows, or error analysis. This directly undermines evaluation of the reported IAA values (0.78–0.86) and the claim that the pipeline yields training-ready datasets without additional verification.

Authors: We agree that the manuscript presents the ETL pipeline at a high level without algorithms, pseudocode, explicit matching rules, per-workflow handling details, or error analysis. This limits assessment of the IAA figures and the training-ready claim. In revision we will expand the Methods/System section with: pseudocode for the alignment, transformation, and export steps; rules for span matching (exact, partial overlap via character offset or IoU threshold); workflow-specific logic for the four ABSA tasks; and a short error analysis of the IAA computation. These additions will make the pipeline reproducible and allow readers to evaluate the reported results.

revision: yes
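For readers wondering what the promised span-matching rules might involve, here is a minimal sketch of IoU-based matching over character spans. It assumes end-exclusive offsets and a greedy one-to-one pairing; the function names, the default threshold of 0.5, and the greedy strategy are illustrative choices, not anything the paper or the simulated rebuttal specifies.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive (assumed)


def span_iou(a: Span, b: Span) -> float:
    """Intersection over union of two character spans."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def match_spans(
    spans_a: List[Span], spans_b: List[Span], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching: take the highest-IoU pairs first and
    stop once IoU drops below the threshold. threshold=1.0 means exact match."""
    candidates = sorted(
        (
            (span_iou(a, b), i, j)
            for i, a in enumerate(spans_a)
            for j, b in enumerate(spans_b)
        ),
        key=lambda c: -c[0],
    )
    used_a, used_b, matches = set(), set(), []
    for iou, i, j in candidates:
        if iou < threshold:
            break
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        matches.append((i, j))
    return matches


# Invented offsets: annotator B marked a wider first span and one extra span.
spans_a = [(4, 9), (28, 35)]
spans_b = [(0, 9), (28, 35), (40, 44)]
lenient = match_spans(spans_a, spans_b, threshold=0.5)
exact = match_spans(spans_a, spans_b, threshold=1.0)
```

The same two annotation sets yield two matched pairs under the lenient rule and one under exact matching, which illustrates how strongly the choice of rule can move a span-level agreement score.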

Circularity Check

0 steps flagged

No circularity: descriptive tool paper with no derivations or load-bearing self-citations

The manuscript presents a software platform and reports empirical annotation times and IAA values from a small validation set. No equations, parameter fits, uniqueness theorems, or derivation steps appear in the abstract or described content. The central claim (automated ETL alignment and IAA computation) is asserted as an engineering contribution without any mathematical reduction to prior inputs or self-citations that would create circularity. This is a standard non-circular descriptive account of a tool.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical model, free parameters, axioms, or invented scientific entities are involved; the work is a software tool description.

