pith. machine review for the scientific record.

arxiv: 2605.02505 · v1 · submitted 2026-05-04 · 💻 cs.CL

Recognition: unknown

Revisiting Semantic Role Labeling: Efficient Structured Inference with Dependency-Informed Analysis

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 16:01 UTC · model grok-4.3

classification 💻 cs.CL
keywords semantic role labeling · structured inference · dependency parsing · BERT · RoBERTa · predicate-argument structure · multilingual projection · efficient inference

The pith

A modern encoder-based framework for semantic role labeling preserves explicit predicate-argument structure while running inference ten times faster than prior systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an updated approach to semantic role labeling that integrates modern language model encoders with explicit modeling of who did what to whom in sentences. Using BERT-base, it reaches performance levels comparable to older frameworks, while RoBERTa and DeBERTa versions raise F1 scores within the same structure-aware setup. Analysis shows that signals from dependency parses mainly reduce inconsistencies in the predicted spans rather than lifting raw accuracy. The design also enables direct projection of role labels across languages as a practical extension.
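The explicit structure described here is concrete: per predicate, each token carries a BIO role label, and argument spans fall out of the label sequence (the format the paper's Figure 4 shows as a JSON object). A minimal sketch — the field names are illustrative, not the paper's actual schema:

```python
# Illustrative SRL instance in BIO-over-tokens form. Field names
# ("words", "pred_index", "labels") are assumptions, not the paper's schema.
instance = {
    "words": ["The", "market", "fell", "after", "the", "October", "1987", "crash"],
    "pred_index": 2,  # position of the predicate "fell"
    "labels": ["B-ARG1", "I-ARG1", "B-V", "B-ARGM-TMP",
               "I-ARGM-TMP", "I-ARGM-TMP", "I-ARGM-TMP", "I-ARGM-TMP"],
}

def spans(labels):
    """Recover (role, start, end) spans from a BIO label sequence."""
    out, cur = [], None
    for i, tag in enumerate(labels):
        if tag.startswith("B-"):
            if cur: out.append(cur)
            cur = [tag[2:], i, i]
        elif tag.startswith("I-") and cur and tag[2:] == cur[0]:
            cur[2] = i
        else:
            if cur: out.append(cur)
            cur = None
    if cur: out.append(cur)
    return [tuple(s) for s in out]

print(spans(instance["labels"]))
# → [('ARG1', 0, 1), ('V', 2, 2), ('ARGM-TMP', 3, 7)]
```

The "who did what to whom" reading is exactly this span list: ARG1 (the market) · V (fell) · ARGM-TMP (after the October 1987 crash).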

Core claim

A dependency-informed structured inference framework built on top of encoder models maintains explicit predicate-argument representations, delivers comparable predictive performance with BERT-base, improves F1 with RoBERTa and DeBERTa, and achieves tenfold faster inference than previous AllenNLP-style systems. Dependency cues are shown through diagnostic checks to increase structural stability at the span level, and the same explicit structure supports downstream multilingual SRL projection.

What carries the argument

The dependency-informed structured inference layer that injects dependency-parse cues to guide and stabilize span-level semantic role assignments within an encoder-based SRL model.
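The stabilizing effect can be illustrated on the paper's own October 1987 example (Figures 2–3), where "after" receives an I-ARGM-TMP tag with no opening B- tag, leaving a malformed span. A minimal repair pass sketching the idea — the paper's analyzer also consults dependency heads, which this toy version omits:

```python
def repair_bio(labels):
    """Promote an I-X tag that illegally opens a span to B-X, restoring
    well-formed BIO. Mirrors the correction in the paper's October 1987
    example ("after": I-ARGM-TMP → B-ARGM-TMP); the paper's actual
    dependency-aware analyzer does more than this sketch."""
    fixed, prev = [], "O"
    for tag in labels:
        if tag.startswith("I-") and prev[2:] != tag[2:]:
            tag = "B-" + tag[2:]   # illegal continuation: open the span here
        fixed.append(tag)
        prev = tag
    return fixed

print(repair_bio(["I-ARGM-TMP", "I-ARGM-TMP", "O"]))
# → ['B-ARGM-TMP', 'I-ARGM-TMP', 'O']
```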

If this is right

  • RoBERTa and DeBERTa encoders produce higher F1 scores than BERT-base inside the identical framework.
  • The preserved explicit predicate-argument structure directly enables multilingual SRL label projection.
  • Dependency signals contribute more to prediction consistency than to absolute accuracy gains.
  • The encoder-agnostic design remains compatible with newer language models beyond those tested.
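One plausible source of the tenfold speedup, consistent with the paper's Figure 1 ("sentence-level encoding reuse enables efficient predicate-conditioned inferences"), is paying the transformer forward pass once per sentence rather than once per predicate. A toy sketch with a stand-in encoder — the per-predicate scoring head is fabricated for illustration:

```python
ENCODER_CALLS = 0

def encode(sentence):
    """Stand-in for the expensive transformer forward pass."""
    global ENCODER_CALLS
    ENCODER_CALLS += 1
    return [hash((sentence, i)) % 100 for i in range(len(sentence.split()))]

def srl_all_predicates(sentence, predicate_indices):
    """Encode the sentence once, then run a cheap predicate-conditioned
    head per predicate. The head below is a placeholder, not the paper's
    scoring function; a naive pipeline would re-encode per predicate."""
    reps = encode(sentence)  # single encoder pass
    return {p: [(r + reps[p]) % 7 for r in reps] for p in predicate_indices}

out = srl_all_predicates("She gave him a book and left", [1, 6])
assert ENCODER_CALLS == 1  # one pass serves both predicates
```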

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same dependency-stabilized architecture could be applied to other structured prediction tasks such as coreference or event extraction.
  • Explicit role structures may offer a route to more interpretable outputs from otherwise opaque encoder models.
  • Projection-based transfer could be empirically validated on low-resource languages to measure cross-lingual gains.
  • Real-time applications like live question answering might now become feasible at scale because of the inference speedup.

Load-bearing premise

That dependency parses supply reliable structural signals that improve stability without adding errors that offset the reported speed and accuracy gains.

What would settle it

A controlled test that feeds noisy or inaccurate dependency parses to the model: the premise fails if F1 drops below the non-dependency baseline, or if inference no longer runs ten times faster than the prior system.
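Such a stress test is straightforward to script: corrupt a fraction of dependency heads, rerun inference, and score span F1 against the clean-parse and no-dependency conditions. A hedged skeleton — the model call is left out; only the perturbation and scoring logic is concrete:

```python
import random

def corrupt_heads(heads, p, rng):
    """Reassign each dependency head to a random position with probability p."""
    n = len(heads)
    return [rng.randrange(n) if rng.random() < p else h for h in heads]

def span_f1(pred, gold):
    """Exact-match span F1 over (role, start, end) tuples."""
    pred, gold = set(pred), set(gold)
    if not pred or not gold:
        return 0.0
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    prec, rec = tp / len(pred), tp / len(gold)
    return 2 * prec * rec / (prec + rec)

rng = random.Random(0)
heads = [2, 2, 0, 2, 5, 3]          # toy parse, 0 = root position
noisy = corrupt_heads(heads, p=0.5, rng=rng)

gold = [("ARG1", 0, 1), ("ARGM-TMP", 3, 7)]
pred = [("ARG1", 0, 1), ("ARGM-TMP", 3, 6)]  # one boundary error
print(round(span_f1(pred, gold), 2))
# → 0.5
```

Sweeping `p` from 0 to 1 and plotting F1 for the dependency-informed model against the baseline would directly test the load-bearing premise.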

Figures

Figures reproduced from arXiv: 2605.02505 by Bonnie J. Dorr, Leah Jones, Sangpil Youm.

Figure 1. Overview of the revisiting SRL framework. Sentence-level encoding reuse enables efficient predicate-conditioned inference across modern encoder architectures. Dependency-informed diagnosis characterizes span-level errors, and representation-level analysis investigates dependency effects. The framework further supports representation-consistent cross-lingual SRL transfer as a downstream application.

Figure 2. An alignment error case: when aligning "after the October 1987 crash" with its French counterpart "après l'écrasement (de) octobre 1987", "de" is omitted and "écrasement" is repeated. However, on the English side "after" is incorrectly tagged I-ARGM-TMP rather than B-ARGM-TMP, leaving the correction incomplete and propagating the error to the French side.

Figure 3. Dependency-aware correction applied to the October 1987 crash example. The revised SRL output correctly marks "after" with a B-ARGM-TMP tag, restoring the missing boundary and producing a complete span alignment across English and French.

Figure 4. Example SRL instance represented as a JSON object, showing tokenized words, the predicate index, and BIO-formatted semantic role labels.

Figure 5. Proportions of roles missing from the system's predictions.

Figure 6. Proportion of missing ARGMs in the system's predictions compared to OntoNotes 5.0 ground truth.

Figure 7. Proportion of missing roles or predicates in AllenNLP's predictions compared to OntoNotes 5.0 ground truth.

Figure 8. Proportion of missing ARGMs in AllenNLP's predictions compared to OntoNotes 5.0 ground truth.

Figure 9. Overview of the modernized Dependency-Aware Error Analyzer.
read the original abstract

Semantic Role Labeling (SRL) provides an explicit representation of predicate-argument structure, capturing linguistically grounded relations such as who did what to whom. While recent NLP progress has been dominated by large language models (LLMs), these systems often rely on implicit semantic representations, often lacking explicit structural constraints and systematic explanatory mechanisms. Traditionally, SRL systems have often relied on AllenNLP; however, the framework entered maintenance mode in December 2022, limiting compatibility with evolving encoder architectures and modern inference requirements. We revisit structured SRL modeling, introducing a modernized encoder-based framework that preserves explicit predicate-argument structure while enabling inference 10 times faster. Using BERT-base, the model attains comparable predictive performance, and RoBERTa and DeBERTa further improve F1 performance within the same framework. We adopt a dependency-informed diagnostic methodology to characterize span-level inconsistencies and conduct a representation-level analysis of LLM behavior under dependency-informed structural signals. Results indicate that dependency cues primarily improve structural stability. Finally, we illustrate how the framework's explicit predicate-argument structure can support multilingual SRL projection as a downstream application.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper revisits Semantic Role Labeling (SRL) by introducing a modernized encoder-based framework that incorporates dependency-informed structural signals. It claims to preserve explicit predicate-argument structure while achieving inference speeds 10 times faster than traditional systems like AllenNLP. Using BERT-base, it attains comparable F1 performance, with RoBERTa and DeBERTa yielding further improvements. A dependency-informed diagnostic analysis is used to characterize span-level inconsistencies and analyze LLM behavior, indicating that dependency cues mainly enhance structural stability. The framework is also illustrated for multilingual SRL projection as a downstream task.

Significance. If the performance, speed, and stability claims are substantiated, this manuscript would make a significant contribution to the field by bridging classical structured prediction in SRL with modern pre-trained language models. It addresses the maintenance issues of legacy frameworks like AllenNLP and provides efficiency gains alongside explicit structural representations, which are often absent in pure LLM approaches. The diagnostic analysis offers valuable insights into the role of dependency information in improving model consistency. The multilingual projection example demonstrates practical utility.

major comments (2)
  1. [Abstract] The 10x faster inference claim is load-bearing for the paper's efficiency contribution. However, the abstract does not detail the baseline implementation, whether the timing includes dependency parsing overhead, or the specific hardware and batch sizes used for measurement. This omission prevents independent verification of the speedup and assessment of its practical significance when parser time is factored in.
  2. [Dependency-informed diagnostic methodology] The conclusion that dependency cues 'primarily improve structural stability' relies on the assumption that predicted dependency parses do not introduce significant new errors. The manuscript lacks an ablation study contrasting results with gold-standard dependency parses against the predicted ones used in experiments. Additionally, parser accuracy on the SRL datasets is not reported. This is critical because any span-boundary errors from the parser could negate the claimed F1 comparability and stability benefits, directly challenging the central assertion that the framework preserves explicit structure without offsetting drawbacks.
minor comments (1)
  1. [Abstract] The reference to AllenNLP entering 'maintenance mode in December 2022' would benefit from a citation to the official announcement or repository status for completeness.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation of the paper's significance and for the detailed, constructive comments. We appreciate the recognition of the efficiency gains, structural preservation, and diagnostic insights. Below we respond point-by-point to the major comments and describe the revisions we will incorporate.

read point-by-point responses
  1. Referee: [Abstract] The 10x faster inference claim is load-bearing for the paper's efficiency contribution. However, the abstract does not detail the baseline implementation, whether the timing includes dependency parsing overhead, or the specific hardware and batch sizes used for measurement. This omission prevents independent verification of the speedup and assessment of its practical significance when parser time is factored in.

    Authors: We agree that the abstract requires additional detail to allow verification of the 10x speedup. In the revised manuscript we will expand the abstract to explicitly name the baseline (AllenNLP SRL system), state that the reported timing measures only the SRL inference step (with parser overhead reported separately in the experiments section), and specify the hardware (single NVIDIA A100 GPU) and batch size (32) used. We will also add a short paragraph in the experimental setup describing the timing protocol, including how wall-clock time was measured and averaged over multiple runs. revision: yes

  2. Referee: [Dependency-informed diagnostic methodology] The conclusion that dependency cues 'primarily improve structural stability' relies on the assumption that predicted dependency parses do not introduce significant new errors. The manuscript lacks an ablation study contrasting results with gold-standard dependency parses against the predicted ones used in experiments. Additionally, parser accuracy on the SRL datasets is not reported. This is critical because any span-boundary errors from the parser could negate the claimed F1 comparability and stability benefits, directly challenging the central assertion that the framework preserves explicit structure without offsetting drawbacks.

    Authors: We accept that reporting parser accuracy and providing an ablation with gold parses would strengthen the diagnostic claims. We will add the unlabeled and labeled attachment scores of the dependency parser on the CoNLL-2009/2012 SRL test sets in the revised version. For the ablation, we will include results using gold dependency parses on the English development set (where gold parses are available) and discuss the delta relative to predicted parses; a full test-set ablation will be noted as computationally intensive but feasible for a subset of languages. These additions will allow readers to assess whether parser-induced span errors offset the observed stability gains. revision: partial
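The timing protocol the rebuttal promises to document can be sketched generically — warmup runs discarded, then a median over repeated wall-clock measurements. The hardware and batch-size specifics (A100, batch size 32) are the rebuttal's, not reproduced here:

```python
import statistics
import time

def time_inference(fn, warmup=3, runs=10):
    """Wall-clock timing: discard warmup runs, report the median of
    repeated measurements. A generic sketch of the promised protocol,
    not the paper's actual benchmarking harness."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return statistics.median(times)

t = time_inference(lambda: sum(range(10_000)))
assert t >= 0.0
```

Reporting the baseline and the new system under the identical harness, with and without parser time, is what would make the 10x figure verifiable.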

Circularity Check

0 steps flagged

No significant circularity; empirical claims are independently grounded

full rationale

The paper presents no mathematical derivation chain or equations that reduce by construction to their own inputs. Central claims of comparable F1 with BERT-base, improved F1 with RoBERTa/DeBERTa, and 10x faster inference rest on standard fine-tuning experiments over train/test splits plus runtime measurements, not on fitted parameters renamed as predictions or self-definitional structures. The dependency-informed diagnostic is applied post-hoc to characterize inconsistencies and is not invoked to justify the framework's existence or performance. No load-bearing self-citations or uniqueness theorems from the authors' prior work are used to force the results; the analysis remains externally falsifiable via replication on the same splits.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work rests on standard assumptions from the SRL and dependency parsing literature rather than introducing new free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5488 in / 1115 out tokens · 95986 ms · 2026-05-09T16:01:12.122811+00:00 · methodology

