On the Geometric Limits of Transformer Defenses against Obfuscation Attacks: Latent Embedding Collapse & Performance Robustness Gap

Becky Mashaido; Tapadhir Das

arxiv: 2605.19159 · v1 · pith:JRQQOFBLnew · submitted 2026-05-18 · 💻 cs.CR


Pith reviewed 2026-05-20 08:44 UTC · model grok-4.3

classification 💻 cs.CR

keywords: prompt injection, obfuscation attacks, latent embedding collapse, transformer defenses, embedding robustness, performance-robustness gap, BERT encoders, geometric analysis


The pith

High detection accuracy in prompt-injection defenses masks near-overlap between obfuscated and clean embeddings in transformer models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that multi-operator obfuscated prompts, which combine homoglyphs, zero-width characters, and noise such as punctuation or emojis, can partially collapse onto the embedding manifold of clean prompts. This latent embedding collapse occurs even though detectors built on BERT-family encoders achieve near-perfect classification performance. The minimal distance between clean and obfuscated embeddings reaches only 1.02 while obfuscated points show markedly higher intra-class variance, exposing a performance-robustness gap that persists across models of different depths and capacities. A reader would care because current evaluation practices rely on classification scores that do not detect this geometric instability, leaving potential attack surfaces unaddressed.
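As a rough illustration of what a multi-operator obfuscation might look like, the sketch below chains the three operator families named above. The lookup tables, the `rate` parameter, and the function name `obfuscate` are hypothetical choices made for this example; the paper's actual substitution rates, insertion positions, and noise levels are not given in this summary.

```python
import random

# Hypothetical tables for illustration only (not taken from the paper).
HOMOGLYPHS = {  # Latin letter -> visually similar Cyrillic letter
    "a": "\u0430", "e": "\u0435", "o": "\u043e", "c": "\u0441", "p": "\u0440",
}
ZERO_WIDTH = "\u200b"  # ZERO WIDTH SPACE
NOISE = ["!", "...", "\U0001F600"]  # punctuation or emoji noise


def obfuscate(prompt, rate=0.3, seed=0):
    """Apply three obfuscation operators to a prompt (assumed composition)."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    out = []
    for ch in prompt:
        # Operator 1: homoglyph substitution with probability `rate`.
        if ch in HOMOGLYPHS and rng.random() < rate:
            ch = HOMOGLYPHS[ch]
        out.append(ch)
        # Operator 2: zero-width character insertion with probability `rate`.
        if rng.random() < rate:
            out.append(ZERO_WIDTH)
    # Operator 3: append one punctuation or emoji noise token.
    out.append(" " + rng.choice(NOISE))
    return "".join(out)


if __name__ == "__main__":
    print(repr(obfuscate("ignore previous instructions")))
```

The obfuscated string renders almost identically to the original on screen, while its code points differ. That is the kind of input the paper's embedding measurements are meant to probe.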

Core claim

Multi-operator obfuscated prompts partially collapse onto the embedding manifold of clean prompts, a phenomenon termed latent embedding collapse. Across multiple BERT-family encoders, detectors reach near-perfect classification, yet the minimal clean-obfuscated margin equals 1.02, indicating near-overlap, while obfuscated embeddings exhibit elevated intra-class variance of 3.33 ± 6.23. These results demonstrate a substantial performance-robustness gap, and increasing model capacity does not eliminate the collapse.
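The two geometric quantities in this claim can be sketched as follows. This summary does not reproduce the paper's formal definitions, so the ones used here are assumptions: the minimal margin is taken as the smallest Euclidean distance between any clean and any obfuscated embedding, and the intra-class variance as the mean squared distance of a class's points to its centroid. The toy vectors are made up for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def minimal_margin(clean, obfuscated):
    """Smallest Euclidean distance between the two classes (assumed definition)."""
    return min(math.dist(c, o) for c in clean for o in obfuscated)


def intra_class_variance(points):
    """Mean squared distance to the class centroid (assumed definition)."""
    mu = centroid(points)
    return sum(math.dist(p, mu) ** 2 for p in points) / len(points)


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real encoder outputs have hundreds of dimensions.
    clean = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    obfuscated = [[1.0, 1.0], [5.0, 5.0], [-3.0, 4.0]]
    print("minimal margin:", minimal_margin(clean, obfuscated))
    print("clean variance:", intra_class_variance(clean))
    print("obfuscated variance:", intra_class_variance(obfuscated))
```

In this toy setup one obfuscated point sits next to the clean cluster while the others are scattered, which gives a small margin together with a large obfuscated variance, the pattern the claim describes.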

What carries the argument

Latent embedding collapse: the partial overlap of obfuscated prompt embeddings with the manifold of clean prompt embeddings, which reveals geometric fragility despite strong classification boundaries.

If this is right

Classification accuracy alone cannot certify robustness against obfuscated prompt injections.
Embedding-space margins and variance must be measured to assess whether a defense has truly separated clean and attacked inputs.
Scaling model depth or capacity does not eliminate the observed collapse.
Geometry-aware training or evaluation is required as a complement to performance-based testing.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Attackers could exploit the small margin by generating obfuscations that remain close to clean embeddings yet still trigger the intended injection.
Similar embedding instability may appear in non-BERT transformer families or in multimodal models that process text alongside other modalities.
Defenses could be improved by explicitly optimizing for larger inter-class margins in the latent space rather than classification loss alone.
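One way to picture the last suggestion is a hinge-style penalty that is zero once every clean-obfuscated pair is at least a target distance apart. This is a generic contrastive-style sketch, not a method from the paper; the names `margin_penalty`, `combined_loss`, `target_margin`, and `weight` are hypothetical.

```python
import math


def margin_penalty(clean, obfuscated, target_margin):
    """Average hinge penalty over all clean-obfuscated pairs.

    A pair contributes max(0, target_margin - distance), so pairs already
    separated by at least `target_margin` contribute nothing.
    """
    total = 0.0
    count = 0
    for c in clean:
        for o in obfuscated:
            total += max(0.0, target_margin - math.dist(c, o))
            count += 1
    return total / count


def combined_loss(classification_loss, penalty, weight=0.1):
    """Classification loss plus a weighted geometric penalty (assumed form)."""
    return classification_loss + weight * penalty


if __name__ == "__main__":
    clean = [[0.0, 0.0]]
    obfuscated = [[3.0, 4.0]]  # Euclidean distance 5 from the clean point
    print(margin_penalty(clean, obfuscated, target_margin=2.0))
    print(margin_penalty(clean, obfuscated, target_margin=6.0))
```

In a real training loop the penalty would be computed on encoder outputs inside an automatic-differentiation framework; the pure-Python version only shows the arithmetic.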

Load-bearing premise

The specific multi-operator obfuscation combinations and BERT-family encoders tested represent the general behavior of transformer defenses against obfuscation attacks.

What would settle it

A defense architecture or training procedure that produces a clean-obfuscated embedding margin substantially larger than 1.02 while preserving near-perfect classification accuracy would show the collapse is not inherent.

Figures

Figures reproduced from arXiv: 2605.19159 by Becky Mashaido, Tapadhir Das.

**Figure 2.** Proposed methodology of understanding obfuscated

**Figure 3.** DistillBERT PCA Projections of Prompt Embeddings

**Figure 4.** BERTBase PCA Projections of Prompt Embeddings

**Figure 5.** BERTMedium PCA Projections of Prompt Embeddings

**Figure 8.** BERTMedium t-SNE Projections of Prompt Embed

Original abstract

Prompt injection attacks pose significant risks to language model safety, yet existing defenses are typically evaluated using classification performance. We show that high detection performance does not imply representational robustness. Specifically, multi-operator obfuscated prompts (combining homoglyphs, zero-width characters, and punctuation or emoji noise) can partially collapse onto the embedding manifold of clean prompts, a phenomenon we term latent embedding collapse. Results indicate that across multiple BERT family encoders with varying depth and capacity, detectors achieve near-perfect classification performance, yet the minimal clean-obfuscated margin delta = 1.02, indicating near-overlap of obfuscated and clean embeddings. Obfuscated embeddings further exhibit elevated intra-class variance (3.33 +/- 6.23), indicating severe latent-space instability despite high performance. These results reveal a substantial perf ormance-robustness gap, demonstrating that standard evaluation metrics fail to capture latent embedding collapse and underlying geometric fragility. Our findings show that increasing model capacity does not eliminate latent embedding collapse, motivating geometry-aware robustness analysis as a necessary complement to performance-based evaluation for prompt-injection defenses.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper flags that high detection accuracy can mask near-overlap in embeddings for multi-operator obfuscated prompts, but stops short of showing this actually weakens defenses in practice.


The main thing to know is that this work measures a small clean-obfuscated margin of 1.02 and elevated variance in BERT embeddings even when classification accuracy looks near-perfect. They frame the overlap as latent embedding collapse under combined homoglyph, zero-width, and noise attacks, and note the issue persists across model sizes. That observation is worth flagging for anyone who treats accuracy as a sufficient robustness signal.

Referee Report

3 major / 2 minor

Summary. The manuscript claims that high classification accuracy of BERT-family detectors on prompt-injection attacks does not imply representational robustness. Multi-operator obfuscated prompts (homoglyphs + zero-width characters + punctuation/emoji noise) partially collapse onto the clean-prompt embedding manifold, producing a minimal clean-obfuscated margin of delta = 1.02 and elevated intra-class variance of 3.33 +/- 6.23; this 'latent embedding collapse' is presented as evidence of a performance-robustness gap that standard metrics miss, and the gap persists across model capacities.

Significance. If the geometric measurements were shown to predict actual defense failures, the result would usefully motivate geometry-aware evaluation as a complement to accuracy-based testing for LLM safety mechanisms.

major comments (3)

[Abstract and Results] The central claim of a 'performance-robustness gap' is asserted from the reported delta = 1.02 and variance 3.33 +/- 6.23, yet the manuscript contains no experiments that link these embedding statistics to practical outcomes such as successful evasion, reduced detector accuracy under optimized multi-operator attacks, or changes in prompt-injection success rates. This missing causal link is load-bearing for the headline conclusion.
[Empirical Evaluation] The obfuscated-embedding variance is reported as 3.33 +/- 6.23. Because the standard deviation exceeds the mean, the statistic may reflect measurement noise or a small number of extreme outliers rather than consistent instability; no robustness checks, outlier analysis, or per-sample distribution plots are provided to support the interpretation of 'severe latent-space instability'.
[Methods] The paper does not report statistical significance tests for delta or the variance difference, does not compare against single-operator baselines, and gives insufficient detail on dataset size, train/test splits, or exact operator combinations. These omissions limit evaluation of whether the chosen obfuscations and BERT variants are representative enough to support the general claim of geometric fragility.
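The concern in the Empirical Evaluation comment can be made concrete with a small sketch. The values below are invented for illustration and are not the paper's data; they show how a single extreme entry can push the standard deviation above the mean, while the median and the median absolute deviation (MAD) stay close to the bulk of the values. The helper name `summarize` is hypothetical.

```python
import statistics


def summarize(values):
    """Return mean/std alongside outlier-robust median/MAD for a sample."""
    median = statistics.median(values)
    # MAD: median of absolute deviations from the median.
    mad = statistics.median([abs(v - median) for v in values])
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "median": median,
        "mad": mad,
    }


if __name__ == "__main__":
    # Hypothetical per-set variances: five small values and one outlier.
    per_set_variance = [0.5, 0.6, 0.7, 0.8, 0.9, 20.0]
    for name, value in summarize(per_set_variance).items():
        print(f"{name}: {value:.3f}")
```

If the reported 3.33 +/- 6.23 came from a distribution shaped like this, the mean would describe the outlier more than the typical prompt set, which is why the referee asks for per-sample plots.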

minor comments (2)

[Abstract] Abstract contains a typographical spacing error ('perf ormance').
[Notation] The margin delta and variance quantities should be accompanied by explicit definitions or equations in the main text rather than appearing only in the abstract.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment point by point below, providing clarifications on the manuscript's claims and indicating revisions where they strengthen the presentation without altering the core findings.


Referee: [Abstract and Results] The central claim of a 'performance-robustness gap' is asserted from the reported delta = 1.02 and variance 3.33 +/- 6.23, yet the manuscript contains no experiments that link these embedding statistics to practical outcomes such as successful evasion, reduced detector accuracy under optimized multi-operator attacks, or changes in prompt-injection success rates. This missing causal link is load-bearing for the headline conclusion.

Authors: The performance-robustness gap is defined as the discrepancy between near-perfect classification accuracy and the geometric properties of the latent space. The minimal margin of 1.02 shows that obfuscated embeddings lie within the clean-prompt manifold, so the detector's high accuracy is achieved despite near-overlap rather than true separation; the elevated variance further quantifies the instability of those representations. This geometric evidence directly supports the claim that accuracy-based metrics miss underlying fragility. We agree that explicit linkage to evasion rates would add value and will add a short discussion subsection relating the observed collapse to potential attack implications, including how the margin correlates with reduced representational separation.

Revision: yes

Referee: [Empirical Evaluation] The obfuscated-embedding variance is reported as 3.33 +/- 6.23. Because the standard deviation exceeds the mean, the statistic may reflect measurement noise or a small number of extreme outliers rather than consistent instability; no robustness checks, outlier analysis, or per-sample distribution plots are provided to support the interpretation of 'severe latent-space instability'.

Authors: The statistic is the mean intra-class variance computed across obfuscated prompt sets, with the +/- 6.23 giving the standard deviation of those per-set variances over the BERT variants and operator combinations. The large spread is interpreted as reflecting genuine variation in instability rather than noise. To substantiate this, the revised version will include per-sample variance histograms, outlier-robustness checks (e.g., median absolute deviation), and distribution plots that separate the contribution of extreme samples from the overall trend.

Revision: yes

Referee: [Methods] The paper does not report statistical significance tests for delta or the variance difference, does not compare against single-operator baselines, and gives insufficient detail on dataset size, train/test splits, or exact operator combinations. These omissions limit evaluation of whether the chosen obfuscations and BERT variants are representative enough to support the general claim of geometric fragility.

Authors: We will add paired t-tests (or Wilcoxon tests where normality is violated) for the reported delta and variance differences. Single-operator baselines will be included to show that multi-operator obfuscation produces more pronounced collapse than any individual operator. The methods section will be expanded with precise numbers for clean and obfuscated prompt counts, the 70/30 train/test split, and the exact operator combinations (homoglyph substitution rates, zero-width insertion positions, and punctuation/emoji noise levels). These additions will allow readers to assess representativeness directly.

Revision: yes
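For readers unfamiliar with the paired test mentioned in this response, the sketch below computes the paired t-statistic by hand. The input numbers are invented and do not come from the paper, and the function name `paired_t_statistic` is hypothetical. Turning the statistic into a p-value requires the t distribution, for which a library routine such as `scipy.stats.ttest_rel` (or `scipy.stats.wilcoxon` for the non-parametric alternative) would normally be used.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Paired t-statistic for two equal-length samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d holds the pairwise differences.
    Returns (t, degrees_of_freedom).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.fmean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


if __name__ == "__main__":
    # Hypothetical per-model variances for obfuscated vs. clean prompt sets.
    obfuscated_var = [2.0, 4.0, 6.0, 8.0]
    clean_var = [1.0, 2.0, 4.0, 5.0]
    t, dof = paired_t_statistic(obfuscated_var, clean_var)
    print(f"t = {t:.3f} with {dof} degrees of freedom")
```

Pairing is appropriate here only if each clean and obfuscated measurement comes from the same encoder or prompt set, which is an assumption about how the revised experiments would be organized.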

Circularity Check

0 steps flagged

No circularity: claims rest on direct empirical embedding measurements


The paper's central results consist of measured quantities (clean-obfuscated margin delta = 1.02 and obfuscated intra-class variance 3.33 +/- 6.23) obtained from BERT-family encoders on multi-operator obfuscated prompts. These are presented as direct observations of latent embedding collapse rather than outputs of any fitted model, self-referential definition, or prior self-citation chain. The performance-robustness gap is an interpretive inference from these independent measurements; the measurements themselves do not reduce by construction to the paper's own inputs or equations. No self-definitional steps, fitted-input predictions, or ansatz smuggling via citation appear in the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that embedding-space margin and intra-class variance are valid proxies for representational robustness, and that the tested BERT variants and chosen obfuscation operators generalize to transformer defenses broadly.

axioms (2)

Domain assumption: BERT-family encoders produce embeddings whose geometry is meaningful for assessing prompt-injection detector robustness. The collapse and margin measurements are interpreted as evidence of fragility only if these embeddings faithfully reflect input distinctions.
Domain assumption: The multi-operator obfuscations (homoglyphs + zero-width + punctuation/emoji) constitute representative real-world attacks. The observed collapse is tied to these specific combinations; different attack compositions might not produce the same geometry.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear


multi-operator obfuscated prompts ... can partially collapse onto the embedding manifold of clean prompts, a phenomenon we term latent embedding collapse ... minimal clean-obfuscated margin δ=1.02 ... obfuscated intra-class variance (3.33±6.23)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

