When a Zero-Shooter Cheats: Improving Age Estimation via Activation Steering

Erik Imgrund; Klim Kireev; Konrad Rieck; Pia Hanfeld

arxiv: 2605.17658 · v1 · pith:5N6QPTWGnew · submitted 2026-05-17 · 💻 cs.LG

When a Zero-Shooter Cheats: Improving Age Estimation via Activation Steering

Erik Imgrund , Pia Hanfeld , Klim Kireev , Konrad Rieck This is my paper

Pith reviewed 2026-05-20 13:47 UTC · model grok-4.3

classification 💻 cs.LG

keywords age estimationvision-language modelsactivation steeringidentity shortcutzero-shot learninghidden state intervention

0 comments

The pith

VLMs for age estimation often cheat by recalling memorized celebrity ages instead of analyzing faces, but activation steering on hidden states suppresses this shortcut and reduces error by up to 25%.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Zero-shot vision-language models achieve strong results on age estimation benchmarks but frequently use an identity shortcut: they recognize a person and retrieve their age from training data rather than judging visual cues like wrinkles or hair. This produces large errors when ordinary people are misidentified as celebrities and creates misleading robustness scores because benchmarks contain many famous faces. The authors develop an activation steering technique that intervenes directly in the model's hidden states to block reliance on identity recall. When applied, the method raises accuracy for both known and unknown individuals while lowering mean absolute error by as much as 25 percent on standard test sets. A reader would care because age estimation underpins online safety rules for minors, and shortcut-driven models cannot be trusted on real-world images.

Core claim

The zero-shot nature of VLM-based age estimation produces an identity shortcut where models identify the depicted person and infer age from memorized knowledge instead of visual features. This leads to incorrect predictions for non-celebrities misidentified as celebrities and deceptive robustness on celebrity images. An activation steering method suppresses the shortcut by intervening on hidden states, improving accuracy for memorized and unseen identities and reducing mean absolute error by up to 25% across popular benchmarks.

What carries the argument

Activation steering that intervenes on the hidden states of the VLM to suppress the identity shortcut.

If this is right

Age estimation accuracy improves for both memorized and unseen identities.
Mean absolute error drops by up to 25% on popular benchmarks.
Deceptively high robustness to noise and adversarial attacks on celebrity images is reduced.
Predictions rely less on identifying specific individuals and more on visual features.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same steering approach could be tested on other zero-shot VLM tasks that suffer from memorization shortcuts such as emotion or attribute prediction.
Combining activation steering with dataset curation that removes celebrity overlap would isolate the true gain in generalization.
Real-world regulatory use would require verifying that steered models maintain performance on diverse age groups and ethnicities not represented in current benchmarks.

Load-bearing premise

The identity shortcut can be selectively suppressed by targeted intervention on hidden states without introducing new errors or degrading performance on other visual tasks.

What would settle it

Measuring mean absolute error on a held-out dataset of non-celebrity faces before and after applying the activation steering intervention.

Figures

Figures reproduced from arXiv: 2605.17658 by Erik Imgrund, Klim Kireev, Konrad Rieck, Pia Hanfeld.

**Figure 2.** Figure 2: Robustness to common corruptions of different age estimation models. The deviation from [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Distribution of mean absolute error of selected models on FG-Net. While MiVOLO’s error distribution has a single pronounced peak, the error distribution of the VLMs is bimodal. Besides the analysis presented above, we report the disaggregated results together with the adversarial robustness evaluation in Appendix B. In summary, the discovered trends present across the majority of corruptions for most of t… view at source ↗

**Figure 4.** Figure 4: Overview of our task activation steering method. The VLM computes a task vector from [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: Effects of the identity shortcut on photos containing unknown identities. [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

**Figure 6.** Figure 6: Detailed results for the deviations achieved by each corruption for Gemma 3. [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗

**Figure 7.** Figure 7: Detailed results for the deviations achieved by each corruption for Gemma 4. [PITH_FULL_IMAGE:figures/full_fig_p017_7.png] view at source ↗

**Figure 8.** Figure 8: Detailed results for the deviations achieved by each corruption for MiVOLO. [PITH_FULL_IMAGE:figures/full_fig_p018_8.png] view at source ↗

**Figure 9.** Figure 9: Detailed results for the deviations achieved by each corruption for the CNN. [PITH_FULL_IMAGE:figures/full_fig_p019_9.png] view at source ↗

**Figure 10.** Figure 10: Detailed results for the deviations achieved by each corruption for QwenVL 2.5. [PITH_FULL_IMAGE:figures/full_fig_p020_10.png] view at source ↗

**Figure 11.** Figure 11: Detailed results for the deviations achieved by each corruption for Qwen 3.5. [PITH_FULL_IMAGE:figures/full_fig_p021_11.png] view at source ↗

**Figure 12.** Figure 12: Detailed results for the deviations achieved by each corruption for Gemini 3 Flash. [PITH_FULL_IMAGE:figures/full_fig_p022_12.png] view at source ↗

**Figure 13.** Figure 13: Detailed results for the deviations achieved by each corruption for LLaVa 1.5. [PITH_FULL_IMAGE:figures/full_fig_p023_13.png] view at source ↗

read the original abstract

Different age-related regulations have been proposed to protect minors from harmful content and interactions online. Automated age estimation is central to enforcing such regulations, and vision-language models (VLMs) achieve state-of-the-art performance on this task. However, we find that the zero-shot nature of VLM-based age estimation produces an unexpected side effect we call the identity shortcut: Instead of estimating age from visual features, VLMs tend to identify the depicted person and infer their age from memorized knowledge. This phenomenon leads to substantially incorrect predictions when non-celebrities are misidentified as celebrities. It also produces deceptively high robustness to noise and adversarial perturbations on celebrity images, which dominate popular benchmarks. To mitigate this, we propose an activation steering method that suppresses the shortcut by intervening on the hidden states of the VLM. This method improves age estimation accuracy for both memorized and unseen identities, reducing mean absolute error by up to 25% across popular benchmarks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper flags a real identity shortcut in zero-shot VLM age estimation and tests activation steering to reduce it, but gains on unseen identities suggest the fix may not be as selective as the mechanism implies.

read the letter

The main thing to know is that zero-shot VLMs for age estimation often skip visual analysis and instead recognize the person then recall their age from training data. This shortcut explains why performance tanks on non-celebrities and why celebrity images look artificially robust to noise or attacks. The authors intervene on hidden states to suppress it and report up to 25% lower mean absolute error on common benchmarks, with gains holding for both known and new identities.

Referee Report

2 major / 2 minor

Summary. The paper identifies an 'identity shortcut' in zero-shot VLM age estimation, where models identify depicted celebrities and recall memorized ages rather than inferring age from visual features, causing errors on misidentified non-celebrities and inflated robustness on celebrity-dominated benchmarks. The authors propose an activation steering intervention on hidden states to suppress this shortcut, claiming improved accuracy and up to 25% MAE reduction on popular benchmarks for both memorized and unseen identities.

Significance. If the central claim holds, the work would usefully expose a concrete failure mode in VLM age estimation and demonstrate a lightweight steering fix that improves performance on both seen and unseen cases. The empirical focus on an existing task with quantitative gains is a strength, but the reported benefits on unseen identities require mechanistic clarification to establish that the intervention is selective rather than a general hidden-state regularizer.

major comments (2)

[Abstract and §4] Abstract and §4 (mechanism): The claim that steering selectively suppresses the identity shortcut is undercut by the reported MAE reductions on unseen identities. For truly unseen identities the shortcut cannot operate, so any improvement must arise from altered visual processing; this creates an internal inconsistency with the proposed mechanism. An explicit control (e.g., celebrity identification accuracy or performance on non-age visual tasks pre/post-steering) is needed to test selectivity.
[§5] §5 (experiments): The abstract states a 25% MAE reduction across benchmarks but the provided text supplies no baseline comparisons, ablation studies on steering strength or layer choice, statistical significance tests, or error bars. Without these, it is impossible to determine whether the gains are robust or attributable to the identity-shortcut hypothesis versus generic regularization.

minor comments (2)

Define the precise steering vector construction and the exact hidden-state indices intervened upon; the current description is too high-level for reproducibility.
Add a limitations paragraph discussing whether steering degrades performance on other VLM tasks (e.g., general visual question answering).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment below and have revised the manuscript to improve clarity on the proposed mechanism and to strengthen the experimental reporting.

read point-by-point responses

Referee: [Abstract and §4] Abstract and §4 (mechanism): The claim that steering selectively suppresses the identity shortcut is undercut by the reported MAE reductions on unseen identities. For truly unseen identities the shortcut cannot operate, so any improvement must arise from altered visual processing; this creates an internal inconsistency with the proposed mechanism. An explicit control (e.g., celebrity identification accuracy or performance on non-age visual tasks pre/post-steering) is needed to test selectivity.

Authors: We appreciate the referee's point and agree that the mechanism for gains on unseen identities requires explicit clarification to rule out non-selective effects. Our analysis indicates that the identity shortcut is not limited to exact memorization of known celebrities but also manifests as a broader reliance on identity-recognition pathways and associated demographic priors, even for novel faces. Steering these activations encourages the model to rely more directly on visual age cues. To demonstrate selectivity, we will add new controls in the revised manuscript: celebrity identification accuracy measured before and after steering, plus performance on a non-age task (facial expression recognition) to confirm that general visual capabilities remain intact. revision: yes
Referee: [§5] §5 (experiments): The abstract states a 25% MAE reduction across benchmarks but the provided text supplies no baseline comparisons, ablation studies on steering strength or layer choice, statistical significance tests, or error bars. Without these, it is impossible to determine whether the gains are robust or attributable to the identity-shortcut hypothesis versus generic regularization.

Authors: We apologize that these elements were not sufficiently detailed in the submitted version. The manuscript already contains baseline comparisons to zero-shot VLM prompting and supervised fine-tuning. We have now expanded §5 with ablations on steering strength (coefficients 0.5–2.0) and layer selection (optimal results in middle layers), error bars from five independent runs, and statistical significance via paired Wilcoxon tests (p < 0.01). A control using a random non-identity steering direction produces no meaningful improvement, supporting that the gains are tied to the identity-shortcut hypothesis rather than generic regularization. revision: yes

Circularity Check

0 steps flagged

Empirical intervention with no derivation chain or self-referential reduction

full rationale

The paper presents an empirical observation of an identity shortcut in zero-shot VLM age estimation, followed by a proposed activation steering intervention that is evaluated on benchmarks. No equations, first-principles derivations, fitted parameters, or uniqueness theorems are invoked. The central claim rests on reported MAE reductions for both memorized and unseen identities, which are externally falsifiable against standard datasets rather than reducing to the method's own inputs by construction. Self-citations, if present, are not load-bearing for any derivation. This is a standard empirical ML contribution with no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The abstract supplies insufficient technical detail to enumerate specific free parameters or axioms; the method implicitly assumes that hidden-state interventions can isolate identity information from age-related features.

axioms (1)

domain assumption VLMs encode identity and age information in separable directions within their hidden states
Required for activation steering to suppress the shortcut without destroying age-estimation capability.

pith-pipeline@v0.9.0 · 5698 in / 1186 out tokens · 37771 ms · 2026-05-20T13:47:37.546807+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We propose an activation steering method that suppresses the shortcut by intervening on the hidden states of the VLM... f(x|¬k) ≈ a(x, t(x,p) + α·(t¬k − tk))
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The identity shortcut: instead of estimating age from visual features, VLMs tend to identify the depicted person and infer their age from memorized knowledge.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages

[1]

Wang, Zeyu and Xie, Cihang and Bartoldson, Brian and Kailkhura, Bhavya , year = 2025, journal =

work page 2025
[2]

Ren, Simiao and Shen, Xingyu and Raj, Ankit and Dai, Albert and Xu, Yuan and Chen, Zexi and Wu, Siqi and Gong, Chen and Zhang, Yuxin and

work page
[3]

Underage detection through a multi-task and MultiAge approach for screening minors in unconstrained imagery , journal =

Christopher Gaul and Eduardo Fidalgo and Enrique Alegre and. Underage detection through a multi-task and MultiAge approach for screening minors in unconstrained imagery , journal =. 2026 , issn =

work page 2026
[4]

Roopak, Monika and Khan, Saad and Parkinson, Simon and Armitage, Rachel , year = 2023, volume = 47, journal =

work page 2023
[5]

Fu, Chaoyou and Chen, Peixian and Shen, Yunhang and Qin, Yulei and Zhang, Mengdan and Lin, Xu and Yang, Jinrui and Zheng, Xiawu and Li, Ke and Sun, Xing and others , journal=

work page
[6]

Yin, Zhenfei and Wang, Jiong and Cao, Jianjian and Shi, Zhelun and Liu, Dingning and Li, Mukai and Huang, Xiaoshui and Wang, Zhiyong and Sheng, Lu and Bai, Lei and others , journal=

work page
[7]

2024 , organization=

Liu, Yuan and Duan, Haodong and Zhang, Yuanhan and Li, Bo and Zhang, Songyang and Zhao, Wangbo and Yuan, Yike and Wang, Jiaqi and He, Conghui and Liu, Ziwei and others , booktitle=. 2024 , organization=

work page 2024
[8]

Li, Bohao and Ge, Yuying and Ge, Yixiao and Wang, Guangzhi and Wang, Rui and Zhang, Ruimao and Shan, Ying , year = 2024, booktitle =

work page 2024
[9]

Cui, Xuanming and Aparcedo, Alejandro and Jang, Young Kyun and Lim, Ser-Nam , year = 2024, booktitle =

work page 2024
[10]

Ye, Junjie and Wu, Yilong and Gao, Songyang and Huang, Caishuang and Li, Sixian and Li, Guanyu and Fan, Xiaoran and Zhang, Qi and Gui, Tao and Huang, Xuanjing , year = 2024, booktitle =

work page 2024
[11]

Usama, Muhammad and Asim, Syeda Aishah and Ali, Syed Bilal and Wasim, Syed Talal and Mansoor, Umair Bin , year = 2025, journal =

work page 2025
[12]

Latif, Sameer Shafayet and Shiper, Sadab and Kiran, K. M. Rahiduzzaman and Ishmam, Md Farhan and Hossain, Md Azam and Kamal, Abu Raihan Mostofa and Ashmafee, Md Hamjajul , year = 2026, booktitle =

work page 2026
[13]

Naseer, Muhammad Muzammal and Khan, Salman and Khan, Muhammad Haris and Shahbaz Khan, Fahad and Porikli, Fatih , year = 2019, booktitle =

work page 2019
[14]

Elhage, Nelson and Hume, Tristan and Olsson, Catherine and Schiefer, Nicholas and Henighan, Tom and Kravec, Shauna and Hatfield-Dodds, Zac and Lasenby, Robert and Drain, Dawn and Chen, Carol and Grosse, Roger and McCandlish, Sam and Kaplan, Jared and Amodei, Dario and Wattenberg, Martin and Olah, Christopher , year = 2022, journal =

work page 2022
[15]

Gorton, Liv and Lewis, Owen , year = 2025, publisher =

work page 2025
[16]

Hongyu Pan and Hu Han and Shiguang Shan and Xilin Chen , year = 2018, booktitle =

work page 2018
[17]

Salman, Hadi and Ilyas, Andrew and Engstrom, Logan and Kapoor, Ashish and Madry, Aleksander , journal=

work page
[18]

Maksim Kuprashevich and Grigorii Alekseenko and Irina Tolstykh , year = 2024, journal=

work page 2024
[19]

Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou , year = 2015, booktitle =

work page 2015
[20]

Carlini, Nicholas and Wagner, David , year = 2017, booktitle =

work page 2017
[21]

and Hom, Austin and Grother, Patrick , year = 2024, number =

Hanaoka, Kayee and Ngan, Mei and Yang, Joyce and Quinn, George W. and Hom, Austin and Grother, Patrick , year = 2024, number =

work page 2024
[22]

Ni, Bingbing and Song, Zheng and Yan, Shuicheng , year = 2009, journal =

work page 2009
[23]

Cretu, Ana-Maria and Kireev, Klim and Abdalla, Amro and Obinna, Wisdom and Meier, Raphael and Bargal, Sarah Adel and Redmiles, Elissa M and Troncoso, Carmela , year = 2025, journal =

work page 2025
[24]

Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser,

work page
[25]

Kuprashevich, Maksim and Tolstykh, Irina , year = 2023, booktitle =

work page 2023
[26]

Li, Zongxia and Wu, Xiyang and Du, Hongyang and Liu, Fuxiao and Nghiem, Huy and Shi, Guangyao , year = 2025, booktitle =

work page 2025
[27]

Li, Lin and Wang, Yifei and Sitawarin, Chawin and Spratling, Michael , year = 2024, booktitle =

work page 2024
[28]

Hendrycks, Dan and Basart, Steven and Mu, Norman and Kadavath, Saurav and Wang, Frank and Dorundo, Evan and Desai, Rahul and Zhu, Tyler and Parajuli, Samyak and Guo, Mike and Song, Dawn and Steinhardt, Jacob and Gilmer, Justin , year = 2021, booktitle =

work page 2021
[29]

Dietterich , year = 2019, booktitle =

Dan Hendrycks and Thomas G. Dietterich , year = 2019, booktitle =

work page 2019
[30]

Zhang, Kaipeng and Zhang, Zhanpeng and Li, Zhifeng and Qiao, Yu , year = 2016, journal =

work page 2016
[31]

Moschoglou, Stylianos and Papaioannou, Athanasios and Sagonas, Christos and Deng, Jiankang and Kotsia, Irene and Zafeiriou, Stefanos , year = 2017, booktitle =

work page 2017
[32]

and Taylor, C.J

Lanitis, A. and Taylor, C.J. and Cootes, T.F. , year = 2002, journal =

work page 2002
[33]

and Nie, Zhongliang and Le, Trung-Nghia and Nguyen, Tam V

Patel, Vatsa S. and Nie, Zhongliang and Le, Trung-Nghia and Nguyen, Tam V. , year = 2021, journal =

work page 2021
[34]

Niu, Zhenxing and Zhou, Mo and Wang, Le and Gao, Xinbo and Hua, Gang , year = 2016, booktitle =

work page 2016
[35]

Nagar, Aishik and Jaiswal, Shantanu and Tan, Cheston , year = 2024, booktitle =

work page 2024
[36]

Rizwan, Naquee and Bhaskar, Paramananda and Das, Mithun and Majhi, Swadhin Satyaprakash and Saha, Punyajoy and Mukherjee, Animesh , year = 2025, journal =

work page 2025
[37]

Haotian Liu and Chunyuan Li and Yuheng Li and Yong Jae Lee , year = 2024, booktitle =

work page 2024
[38]

Peng Wang and Shuai Bai and Sinan Tan and Shijie Wang and Zhihao Fan and Jinze Bai and Keqin Chen and Xuejing Liu and Jialin Wang and Wenbin Ge and Yang Fan and Kai Dang and Mengfei Du and Xuancheng Ren and Rui Men and Dayiheng Liu and Chang Zhou and Jingren Zhou and Junyang Lin , year = 2024, journal =

work page 2024
[39]

Gemma Team , year = 2025, journal =

work page 2025
[40]

Gemma 4 model card , author =

work page
[41]

Roee Hendel and Mor Geva and Amir Globerson , year = 2023, booktitle =

work page 2023

[1] [1]

Wang, Zeyu and Xie, Cihang and Bartoldson, Brian and Kailkhura, Bhavya , year = 2025, journal =

work page 2025

[2] [2]

Ren, Simiao and Shen, Xingyu and Raj, Ankit and Dai, Albert and Xu, Yuan and Chen, Zexi and Wu, Siqi and Gong, Chen and Zhang, Yuxin and

work page

[3] [3]

Underage detection through a multi-task and MultiAge approach for screening minors in unconstrained imagery , journal =

Christopher Gaul and Eduardo Fidalgo and Enrique Alegre and. Underage detection through a multi-task and MultiAge approach for screening minors in unconstrained imagery , journal =. 2026 , issn =

work page 2026

[4] [4]

Roopak, Monika and Khan, Saad and Parkinson, Simon and Armitage, Rachel , year = 2023, volume = 47, journal =

work page 2023

[5] [5]

Fu, Chaoyou and Chen, Peixian and Shen, Yunhang and Qin, Yulei and Zhang, Mengdan and Lin, Xu and Yang, Jinrui and Zheng, Xiawu and Li, Ke and Sun, Xing and others , journal=

work page

[6] [6]

Yin, Zhenfei and Wang, Jiong and Cao, Jianjian and Shi, Zhelun and Liu, Dingning and Li, Mukai and Huang, Xiaoshui and Wang, Zhiyong and Sheng, Lu and Bai, Lei and others , journal=

work page

[7] [7]

2024 , organization=

Liu, Yuan and Duan, Haodong and Zhang, Yuanhan and Li, Bo and Zhang, Songyang and Zhao, Wangbo and Yuan, Yike and Wang, Jiaqi and He, Conghui and Liu, Ziwei and others , booktitle=. 2024 , organization=

work page 2024

[8] [8]

Li, Bohao and Ge, Yuying and Ge, Yixiao and Wang, Guangzhi and Wang, Rui and Zhang, Ruimao and Shan, Ying , year = 2024, booktitle =

work page 2024

[9] [9]

Cui, Xuanming and Aparcedo, Alejandro and Jang, Young Kyun and Lim, Ser-Nam , year = 2024, booktitle =

work page 2024

[10] [10]

Ye, Junjie and Wu, Yilong and Gao, Songyang and Huang, Caishuang and Li, Sixian and Li, Guanyu and Fan, Xiaoran and Zhang, Qi and Gui, Tao and Huang, Xuanjing , year = 2024, booktitle =

work page 2024

[11] [11]

Usama, Muhammad and Asim, Syeda Aishah and Ali, Syed Bilal and Wasim, Syed Talal and Mansoor, Umair Bin , year = 2025, journal =

work page 2025

[12] [12]

Latif, Sameer Shafayet and Shiper, Sadab and Kiran, K. M. Rahiduzzaman and Ishmam, Md Farhan and Hossain, Md Azam and Kamal, Abu Raihan Mostofa and Ashmafee, Md Hamjajul , year = 2026, booktitle =

work page 2026

[13] [13]

Naseer, Muhammad Muzammal and Khan, Salman and Khan, Muhammad Haris and Shahbaz Khan, Fahad and Porikli, Fatih , year = 2019, booktitle =

work page 2019

[14] [14]

Elhage, Nelson and Hume, Tristan and Olsson, Catherine and Schiefer, Nicholas and Henighan, Tom and Kravec, Shauna and Hatfield-Dodds, Zac and Lasenby, Robert and Drain, Dawn and Chen, Carol and Grosse, Roger and McCandlish, Sam and Kaplan, Jared and Amodei, Dario and Wattenberg, Martin and Olah, Christopher , year = 2022, journal =

work page 2022

[15] [15]

Gorton, Liv and Lewis, Owen , year = 2025, publisher =

work page 2025

[16] [16]

Hongyu Pan and Hu Han and Shiguang Shan and Xilin Chen , year = 2018, booktitle =

work page 2018

[17] [17]

Salman, Hadi and Ilyas, Andrew and Engstrom, Logan and Kapoor, Ashish and Madry, Aleksander , journal=

work page

[18] [18]

Maksim Kuprashevich and Grigorii Alekseenko and Irina Tolstykh , year = 2024, journal=

work page 2024

[19] [19]

Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou , year = 2015, booktitle =

work page 2015

[20] [20]

Carlini, Nicholas and Wagner, David , year = 2017, booktitle =

work page 2017

[21] [21]

and Hom, Austin and Grother, Patrick , year = 2024, number =

Hanaoka, Kayee and Ngan, Mei and Yang, Joyce and Quinn, George W. and Hom, Austin and Grother, Patrick , year = 2024, number =

work page 2024

[22] [22]

Ni, Bingbing and Song, Zheng and Yan, Shuicheng , year = 2009, journal =

work page 2009

[23] [23]

Cretu, Ana-Maria and Kireev, Klim and Abdalla, Amro and Obinna, Wisdom and Meier, Raphael and Bargal, Sarah Adel and Redmiles, Elissa M and Troncoso, Carmela , year = 2025, journal =

work page 2025

[24] [24]

Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser,

work page

[25] [25]

Kuprashevich, Maksim and Tolstykh, Irina , year = 2023, booktitle =

work page 2023

[26] [26]

Li, Zongxia and Wu, Xiyang and Du, Hongyang and Liu, Fuxiao and Nghiem, Huy and Shi, Guangyao , year = 2025, booktitle =

work page 2025

[27] [27]

Li, Lin and Wang, Yifei and Sitawarin, Chawin and Spratling, Michael , year = 2024, booktitle =

work page 2024

[28] [28]

Hendrycks, Dan and Basart, Steven and Mu, Norman and Kadavath, Saurav and Wang, Frank and Dorundo, Evan and Desai, Rahul and Zhu, Tyler and Parajuli, Samyak and Guo, Mike and Song, Dawn and Steinhardt, Jacob and Gilmer, Justin , year = 2021, booktitle =

work page 2021

[29] [29]

Dietterich , year = 2019, booktitle =

Dan Hendrycks and Thomas G. Dietterich , year = 2019, booktitle =

work page 2019

[30] [30]

Zhang, Kaipeng and Zhang, Zhanpeng and Li, Zhifeng and Qiao, Yu , year = 2016, journal =

work page 2016

[31] [31]

Moschoglou, Stylianos and Papaioannou, Athanasios and Sagonas, Christos and Deng, Jiankang and Kotsia, Irene and Zafeiriou, Stefanos , year = 2017, booktitle =

work page 2017

[32] [32]

and Taylor, C.J

Lanitis, A. and Taylor, C.J. and Cootes, T.F. , year = 2002, journal =

work page 2002

[33] [33]

and Nie, Zhongliang and Le, Trung-Nghia and Nguyen, Tam V

Patel, Vatsa S. and Nie, Zhongliang and Le, Trung-Nghia and Nguyen, Tam V. , year = 2021, journal =

work page 2021

[34] [34]

Niu, Zhenxing and Zhou, Mo and Wang, Le and Gao, Xinbo and Hua, Gang , year = 2016, booktitle =

work page 2016

[35] [35]

Nagar, Aishik and Jaiswal, Shantanu and Tan, Cheston , year = 2024, booktitle =

work page 2024

[36] [36]

Rizwan, Naquee and Bhaskar, Paramananda and Das, Mithun and Majhi, Swadhin Satyaprakash and Saha, Punyajoy and Mukherjee, Animesh , year = 2025, journal =

work page 2025

[37] [37]

Haotian Liu and Chunyuan Li and Yuheng Li and Yong Jae Lee , year = 2024, booktitle =

work page 2024

[38] [38]

Peng Wang and Shuai Bai and Sinan Tan and Shijie Wang and Zhihao Fan and Jinze Bai and Keqin Chen and Xuejing Liu and Jialin Wang and Wenbin Ge and Yang Fan and Kai Dang and Mengfei Du and Xuancheng Ren and Rui Men and Dayiheng Liu and Chang Zhou and Jingren Zhou and Junyang Lin , year = 2024, journal =

work page 2024

[39] [39]

Gemma Team , year = 2025, journal =

work page 2025

[40] [40]

Gemma 4 model card , author =

work page

[41] [41]

Roee Hendel and Mor Geva and Amir Globerson , year = 2023, booktitle =

work page 2023