DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects

Adaku Uchendu; Ali Al-Lawati; Dongwon Lee; Jason Lucas; Matt Murtagh; Uchendu Uchendu

arXiv: 2604.05318 · v1 · submitted 2026-04-07 · cs.CL

Pith reviewed 2026-05-10 19:54 UTC · model grok-4.3

classification cs.CL

keywords disinformation detection · dialectal variation · English dialects · model robustness · harmful content · benchmark · multilingual models · content moderation

The pith

Disinformation detectors show reduced performance on non-Standard American English dialects.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a benchmark to check how disinformation detectors handle English written in 50 different dialects instead of only Standard American English. It converts existing test data into dialect versions using systematic language rules and runs the new samples through 16 different models. Human-written dialect versions cause measurable drops in detection quality while AI-written versions do not, and some models lose more than a third of their accuracy on mixed inputs. Multilingual models maintain higher scores across dialects than models trained only on English. If the pattern holds, the tools used to flag harmful content may work less reliably for speakers of many English varieties.

Core claim

Evaluations using the DIA-HARM benchmark and the D3 corpus of 195K dialectal samples show that human-written dialectal content degrades F1 scores by 1.4-3.6 percent across 16 models while AI-generated dialectal content stays stable, with some models exhibiting over 33 percent degradation on mixed content. Fine-tuned transformers reach best-case F1 of 96.6 percent versus 78.3 percent for zero-shot LLMs, and cross-dialect analysis of 2450 pairs finds that multilingual models such as mDeBERTa average 97.2 percent F1 while monolingual models like RoBERTa fail on dialectal inputs.
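
The degradation figures can be read either as absolute F1 points or as a fraction of the SAE score, and this page does not state which reading applies. The following is a minimal Python sketch of both, assuming binary confusion counts are available. The counts are hypothetical and not taken from the paper, and the helper names `f1_from_counts` and `f1_gap` are illustrative only.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_gap(f1_sae: float, f1_dialect: float) -> tuple[float, float]:
    """Return (absolute gap in F1, gap relative to the SAE score)."""
    absolute = f1_dialect - f1_sae
    relative = absolute / f1_sae if f1_sae else float("nan")
    return absolute, relative


# Hypothetical confusion counts for one model (NOT from the paper).
f1_sae = f1_from_counts(tp=90, fp=10, fn=10)      # 180 / 200
f1_dialect = f1_from_counts(tp=85, fp=15, fn=15)  # 170 / 200

abs_gap, rel_gap = f1_gap(f1_sae, f1_dialect)
print(f"absolute gap: {abs_gap:+.3f}  relative gap: {rel_gap:+.1%}")
```

Negative values mean the model does worse on the dialectal variant, matching the sign convention in the Figure 3 caption. The two readings diverge more as the SAE baseline falls, so which one a reported percentage uses matters when comparing a 96.6% fine-tuned model with a 78.3% zero-shot one.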

What carries the argument

The DIA-HARM benchmark applies linguistically grounded transformations to create 50 English dialect variants of disinformation samples for testing detection robustness.

If this is right

Fine-tuned transformers substantially outperform zero-shot LLMs on dialectal disinformation inputs.
Multilingual models generalize across dialects far better than monolingual models such as RoBERTa.
Human-written dialect content triggers larger performance losses than AI-generated dialect content.
Current detectors may produce unequal results for hundreds of millions of non-Standard American English speakers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Moderation pipelines may need explicit dialect coverage during training to reduce uneven error rates.
The observed stability on AI-generated text suggests detectors may rely on cues that differ between synthetic and natural language.
Similar robustness gaps could appear in related tasks such as hate-speech or toxicity detection.
Global deployment of these models would benefit from routine testing on authentic regional English data.

Load-bearing premise

The transformed dialect samples accurately match real-world usage and keep the original disinformation label without adding separate changes that alter model behavior.

What would settle it

Running the same 16 models on naturally collected disinformation examples written in the target dialects and finding no drop in detection scores relative to Standard American English.

Figures

Figures reproduced from arXiv: 2604.05318 by Adaku Uchendu, Ali Al-Lawati, Dongwon Lee, Jason Lucas, Matt Murtagh, Uchendu Uchendu.

**Figure 2.** The DIA-HARM evaluation framework. Starting from 9 SAE disinformation benchmarks, we apply Multi-VALUE rule-based dialect transformations to generate 50 English dialectal variants. D-PURIFY validates transformation quality using semantic, logical, and feature accuracy metrics. We then evaluate 16 detectors across multiple experimental settings (SQ1–SQ4), measuring classification robustness under unseen, se…

**Figure 3.** SQ1: Generalization gap (∆ F1) from SAE to dialectal variants by content type. Solid blue = human content; hatched green = AI content; dotted orange = mixed content. Negative values indicate degradation on dialects.

**Figure 4.** Asymmetric harm across evaluation regimes.

**Figure 5.** Per-model asymmetric harm across all evaluation regimes. Green dots show…

Original abstract

Harmful content detectors, particularly disinformation classifiers, are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to dialectal variation unexplored. We present DIA-HARM, the first benchmark for evaluating disinformation detection robustness across 50 English dialects spanning U.S., British, African, Caribbean, and Asia-Pacific varieties. Using Multi-VALUE's linguistically grounded transformations, we introduce D3 (Dialectal Disinformation Detection), a corpus of 195K samples derived from established disinformation benchmarks. Our evaluation of 16 detection models reveals systematic vulnerabilities: human-written dialectal content degrades detection by 1.4-3.6% F1, while AI-generated content remains stable. Fine-tuned transformers substantially outperform zero-shot LLMs (96.6% vs. 78.3% best-case F1), with some models exhibiting catastrophic failures exceeding 33% degradation on mixed content. Cross-dialectal transfer analysis across 2,450 dialect pairs shows that multilingual models (mDeBERTa: 97.2% average F1) generalize effectively, while monolingual models like RoBERTa and XLM-RoBERTa fail on dialectal inputs. These findings demonstrate that current disinformation detectors may systematically disadvantage hundreds of millions of non-SAE speakers worldwide. We release the DIA-HARM framework, D3 corpus, and evaluation tools: https://github.com/jsl5710/dia-harm
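
The figure of 2,450 dialect pairs equals 50 × 49, which is consistent with counting every ordered (source, target) pair of distinct dialects. The abstract does not spell out that construction, so the short Python sketch below is one plausible reading rather than the authors' procedure. The dialect identifiers are placeholders, not the Multi-VALUE names.

```python
from itertools import permutations

# Placeholder identifiers; the actual 50 dialect names are not listed on this page.
dialects = [f"dialect_{i:02d}" for i in range(50)]

# Every ordered (source, target) pair with source != target.
pairs = list(permutations(dialects, 2))

print(len(pairs))  # 50 * 49 = 2450
```

Under this reading, each pair would correspond to one transfer direction between two dialects, so transfer from A to B is counted separately from B to A.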

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

DIA-HARM offers a broad new benchmark on dialectal disinformation detection but leans too much on unvalidated synthetic transformations.

The main takeaway is that this paper creates the first benchmark for disinformation detection across 50 English dialects and reports performance gaps on transformed data, but those gaps may not reflect real dialectal issues. What the work does is introduce DIA-HARM and the D3 corpus of 195K samples derived via Multi-VALUE from prior benchmarks. It evaluates 16 models, finds bigger drops on human-written dialect content than AI-generated, and shows multilingual models transfer better across 2,450 dialect pairs.

Releasing the code and data is practical and lets the community check the details. The evaluations provide specific numbers on F1 degradation and model comparisons that were not available before for this scale of dialects. Covering varieties from multiple regions adds breadth.

The soft spot sits in the data generation. The transformations are rule- or model-based, yet the paper does not appear to include checks like human judgments on how natural the text sounds or whether the disinformation label holds after changes. If the output introduces unnatural patterns or shifts meaning, the observed drops of a few percent or more could stem from that rather than dialect robustness. This makes the broader claim about disadvantaging non-standard English speakers rest on an assumption that needs testing against actual dialect data.

This paper suits researchers focused on fairness in NLP applications like content moderation. Someone looking for a new testbed or ideas on cross-dialect evaluation will find the resources valuable. It deserves peer review because the benchmark and release are concrete steps forward, even if the interpretation of results requires more support. I would recommend sending it to referees.

Referee Report

2 major / 2 minor

Summary. The paper introduces DIA-HARM, the first benchmark for disinformation detection robustness across 50 English dialects (U.S., British, African, Caribbean, Asia-Pacific). It constructs the D3 corpus (195K samples) by applying Multi-VALUE linguistically grounded transformations to existing disinformation benchmarks. Evaluation of 16 models (fine-tuned transformers and zero-shot LLMs) reports F1 degradations of 1.4-3.6% on human-written dialectal content (with some mixed cases >33%), better performance from multilingual models (e.g., mDeBERTa at 97.2% average F1), and cross-dialectal transfer results over 2,450 pairs. The authors conclude that current detectors may systematically disadvantage non-SAE speakers and release the framework, D3 corpus, and tools.

Significance. If the D3 corpus validly represents real-world dialectal disinformation, the work identifies a practically important robustness gap in harmful-content detection systems that could affect hundreds of millions of speakers. The empirical scale (50 dialects, 16 models, large corpus), release of code/data, and cross-dialect transfer analysis are strengths that would support follow-on research in fairness and multilingual NLP.

major comments (2)

[D3 corpus construction] Corpus construction / D3 creation section: The central claim that detectors 'systematically disadvantage' non-SAE speakers rests on performance drops observed after Multi-VALUE transformations. No quantitative validation (human naturalness ratings, semantic equivalence checks against authentic dialect corpora, or label-preservation verification) is reported for the 50 dialects. If transformations introduce unnatural phrasing or alter surface cues that models rely on, the measured F1 gaps (1.4-3.6% and >33%) may reflect artifacts rather than dialectal robustness failure.
[Evaluation of 16 detection models] Evaluation and results section: The abstract and results distinguish human-written vs. AI-generated content and report specific degradation numbers, but provide no statistical tests (e.g., significance of F1 differences, confidence intervals, or controls for transformation-induced label drift) to support that the observed gaps are attributable to dialect rather than other factors. This weakens the load-bearing inference to real-world disadvantage.

minor comments (2)

[Abstract] Abstract: The phrasing 'first benchmark' should be qualified with citations to prior dialectal robustness studies in related tasks (e.g., sentiment, toxicity) to avoid overstatement.
[Results tables] Table/figure captions: Ensure all tables reporting F1 scores include the exact number of samples per dialect category and the baseline SAE performance for direct comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and outline revisions to strengthen the presentation of our results while preserving the core contributions of the DIA-HARM benchmark.

Referee: [D3 corpus construction] Corpus construction / D3 creation section: The central claim that detectors 'systematically disadvantage' non-SAE speakers rests on performance drops observed after Multi-VALUE transformations. No quantitative validation (human naturalness ratings, semantic equivalence checks against authentic dialect corpora, or label-preservation verification) is reported for the 50 dialects. If transformations introduce unnatural phrasing or alter surface cues that models rely on, the measured F1 gaps (1.4-3.6% and >33%) may reflect artifacts rather than dialectal robustness failure.

Authors: We appreciate the referee's emphasis on corpus validity. The D3 corpus relies on Multi-VALUE transformations, which were previously validated in the source work for linguistic fidelity, naturalness, and semantic preservation across English dialects through expert linguistic review and human judgments. We did not replicate new human evaluations here to focus on the downstream detection task, but we will add explicit citations to those prior validations, a dedicated paragraph discussing their scope, and a brief acknowledgment that our results inherit the strengths and limitations of the transformation framework. Label preservation follows from the design of the transformations (surface-form changes that retain propositional content), and we will note this explicitly. We agree that fresh verification would be ideal and will include it as a limitation if space allows. revision: partial
Referee: [Evaluation of 16 detection models] Evaluation and results section: The abstract and results distinguish human-written vs. AI-generated content and report specific degradation numbers, but provide no statistical tests (e.g., significance of F1 differences, confidence intervals, or controls for transformation-induced label drift) to support that the observed gaps are attributable to dialect rather than other factors. This weakens the load-bearing inference to real-world disadvantage.

Authors: We agree that statistical support would strengthen the claims. In the revised manuscript we will add (1) bootstrap-derived 95% confidence intervals for all reported F1 scores, (2) paired statistical tests (Wilcoxon signed-rank) comparing original vs. dialectal performance per model and dialect group, and (3) a small-scale manual audit of 200 transformed samples to quantify any label drift introduced by the transformations. These additions will be placed in the evaluation section and will directly address whether the observed gaps exceed what could be expected from sampling variation or transformation artifacts. revision: yes
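
To make the first promised addition concrete, here is a minimal sketch of a paired percentile bootstrap for the F1 gap between SAE and dialectal predictions on the same items. It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the function names are hypothetical, and the Wilcoxon signed-rank test and the manual label audit are not shown.

```python
import random


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1); 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def paired_bootstrap_ci(y_true, pred_sae, pred_dialect,
                        n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for F1(dialect) - F1(SAE), resampling items jointly."""
    rng = random.Random(seed)
    n = len(y_true)
    deltas = []
    for _ in range(n_boot):
        # Same resampled indices for both conditions keeps the pairing.
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        f1_d = f1_score(yt, [pred_dialect[i] for i in idx])
        f1_s = f1_score(yt, [pred_sae[i] for i in idx])
        deltas.append(f1_d - f1_s)
    deltas.sort()
    lo = deltas[int((alpha / 2) * n_boot)]
    hi = deltas[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Synthetic example (NOT paper data): a detector that errs more on dialect text.
rng = random.Random(1)
y_true = [rng.randint(0, 1) for _ in range(400)]
pred_sae = [y if rng.random() < 0.95 else 1 - y for y in y_true]
pred_dialect = [y if rng.random() < 0.85 else 1 - y for y in y_true]

lo, hi = paired_bootstrap_ci(y_true, pred_sae, pred_dialect)
print(f"95% CI for delta F1: [{lo:+.3f}, {hi:+.3f}]")
```

Resampling the same indices for both conditions exploits the fact that each dialectal sample is derived from a specific SAE sample, which typically gives a tighter interval than bootstrapping the two conditions independently. An interval that excludes zero would suggest the gap exceeds sampling variation, though it says nothing on its own about transformation artifacts.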

Circularity Check

0 steps flagged

No circularity: empirical benchmark evaluation with external dependencies

The paper constructs the D3 corpus by applying Multi-VALUE transformations (an external prior method) to established disinformation benchmarks and then measures performance of 16 detection models across dialects. No mathematical derivations, equations, or 'predictions' are present that reduce by construction to fitted parameters or self-referential definitions. Central claims rest on observed F1 scores and cross-dialect transfer metrics, which are directly falsifiable via independent replication on the released corpus rather than being forced by internal definitions or self-citation chains. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the validity of the Multi-VALUE dialect transformations preserving semantic content and labels, plus the assumption that the chosen 50 dialects and 16 models are representative for the stated conclusions.

axioms (1)

domain assumption Linguistically grounded transformations from Multi-VALUE produce valid dialectal variants that preserve the original disinformation label.
Invoked when creating the D3 corpus from established benchmarks.

pith-pipeline@v0.9.0 · 5576 in / 1203 out tokens · 55200 ms · 2026-05-10T19:54:06.000936+00:00 · methodology

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1]

arXiv preprint arXiv:2006.00885 (2020)

Generating natural language adversarial examples. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2890–2896, Brussels, Belgium. Association for Computational Linguistics. Nida Aslam, Irfan Ullah Khan, Farah Salem Alotaibi, Lama Abdulaziz Aldaej, and Asma Khaled Aldubaikil. 2021. Fake detect: A deep learn...

[2]

Nirosh Jayakody, Azeem Mohammad, and Malka N Halgamuge

Robust fake news detection over time and attack. ACM Transactions on Intelligent Systems and Technology, 11(1). Nirosh Jayakody, Azeem Mohammad, and Malka N Halgamuge. 2022. Fake news detection using a decentralized deep learning model and federated learning. In IECON 2022 – 48th Annual Conference of the IEEE Industrial Electronics Society, pages 1–6. I...

[3]

In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 7226–7245, Singapore

Quantifying the dialect gap and its correlates across languages. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 7226–7245, Singapore. Association for Computational Linguistics. Vijay Keswani and L Elisa Celis. 2021. Dialect diversity in text summarization on twitter. In Proceedings of the web conference 2021, pages 3802–38...

[4]

Hu Linmei, Tianchi Yang, Chuan Shi, Houye Ji, and Xiaoli Li

Mm-covid: A multilingual and multimodal data repository for combating covid-19 disinformation. Preprint, arXiv:2011.04088. Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, Barcelona, Spain. Association for Computational Linguistics. Hanmeng Liu, Ruoxi Ning, Zhiyang Teng, Jian ...

[5]

In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 14279–14305, Singapore

Fighting fire with fire: The dual role of LLMs in crafting and detecting elusive disinformation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 14279–14305, Singapore. Association for Computational Linguistics. Jason S. Lucas, Barani Maung Maung, Maryam Tabar, Keegan McBride, and Dongwon Lee. 2024. The l...

[6]

liar, liar pants on fire

Rejected dialects: Biases against african american language in reward models. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 7468–7487. Van-Hoang Nguyen, Kazunari Sugiyama, Preslav Nakov, and Min-Yen Kan. 2020. FANG: Leveraging social context for fake news detection using graph representation. In Proceedings of the 29th AC...

[7]

In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11328–11348, Toronto, Canada

AlignScore: Evaluating factual consistency with a unified alignment function. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11328–11348, Toronto, Canada. Association for Computational Linguistics. Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. 2020. B...

[8]

SAE-trained metrics encode SAE grammatical norms, treating deviations as errors rather than valid variation

[9]

SAE-adjacent dialects suffer disproportionately because their subtle features are interpreted as noise or mistakes

[10]

real” or “fake

Strict filtering would create SAE-biased benchmarks by systematically excluding varieties closest to SAE. Our lenient thresholds preserve the full spectrum of dialectal variation, enabling evaluation of detector robustness across both SAE-adjacent and SAE-distant varieties. This design choice is itself a methodological contribution: future dialect benc...

[11]

Over-flagging dominates. Dialect-induced false positives outnumber false negatives 6.5:1 (27,020 vs. 4,169). The primary harm is that authentic dialectal speech, including legitimate public health information and political discourse, is systematically flagged as disinformation

[12]

Morphological markers are the primary trigger. The most frequent false-positive mechanism is dialectal morphology: plural them-suffixing, a-prefixing, and non-standard determiner use create surface tokens that models have learned to associate with fabricated content, likely because such patterns are absent from SAE-dominated training data

[13]

In shorter texts, dialectal features constitute a larger fraction of total tokens, amplifying their influence on model representations

Short-form content is most vulnerable. Twitter posts account for 71.7% of false positives despite representing only 33% of the test data. In shorter texts, dialectal features constitute a larger fraction of total tokens, amplifying their influence on model representations

[14]

Morphosyntactically distant dialects are most affected. Fiji (Basilectal), Rural AAVE, and Bahamian English rank highest for over-flagging, while closer varieties (SE England, Chicano) show fewer errors, consistent with a linguistic-distance gradient

[15]

Errors are confidently wrong, not uncertain. 81.4% of false positives and 75.5% of false negatives are made with >0.95 confidence. RoBERTa achieves 99.5% mean confidence on false positives, ruling out calibration-based fixes and indicating that dialectal features are encoded as class-discriminative in the learned representations

[16]

real”, “fake

Disinformation evasion targets specific domains. False negatives concentrate in political fact-checking (LIAR, 32.0%) and COVID misinformation (MMCOVID, 58.5%), with TextCNN as the most vulnerable model (24.9% of all FNs). This suggests that dialectal transformation of topical disinformation can exploit domain-specific detection heuristics. P Prompt ...
