BackTranslation2.0 -- A Linguistically Motivated Metric to Assess Sign Language Production

Anton Pelykh; Edward Fish; JianHe Low; Karahan Sahin; Maksym Ivashechkin; Oline Ranum; Oliver Cory; Ozge Mercanoglu Sincan; Richard Bowden

arxiv: 2606.28673 · v1 · pith:MLF5DWVUnew · submitted 2026-06-27 · 💻 cs.CV

Oliver Cory, Maksym Ivashechkin, Karahan Sahin, Oline Ranum, Jianhe Low, Edward Fish, Anton Pelykh, Ozge Mercanoglu Sincan, Richard Bowden

Pith reviewed 2026-06-30 10:14 UTC · model grok-4.3

classification 💻 cs.CV

keywords sign language evaluation · BackTranslation2.0 · linguistic metric · British Sign Language · human correlation · grammatical correctness · phonological accuracy · motion fluency


The pith

BackTranslation2.0 scores sign language output on four linguistic dimensions using deterministic tool pipelines and LLM cross-checks, and reports that the resulting scores correlate with human ratings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents BackTranslation2.0 as a new evaluation method for sign language production that replaces simple backtranslation with a structured pipeline. It breaks assessment into grammatical correctness, phonological accuracy, motion fluency, and generation fidelity. Deterministic tools generate initial scores while LLM modules cross-reference outputs for consistency against linguistic rules. The approach is validated on a British Sign Language dataset with human ratings and outperforms six baseline metrics in correlation strength. This matters because current metrics for generated sign language are too crude to guide system improvement.

Core claim

BackTranslation2.0 adopts an agentic framework in which a deterministic pipeline orchestrates specialised tools to score four dimensions aligned with human rater assessments; LLM-based cross-referential modules then evaluate consistency across tools and against linguistic expectations before final scores are computed through deterministic weighted formulas over validated outputs.
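The final step of this claim, a deterministic weighted formula over validated tool outputs, can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the `ToolOutput` type, the `validated` flag, the weights, the grouping of these tools under phonology, and the choice to renormalise weights over the tools that passed validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolOutput:
    """One tool's score in [0, 1] plus a validation verdict (hypothetical type)."""
    name: str
    score: float
    validated: bool  # assumed to be set by a cross-referential module


def dimension_score(outputs, weights):
    """Weighted mean over validated tool outputs only.

    Weights are renormalised over the tools that passed validation.
    This is an assumption; the paper's treatment of rejected outputs
    is not described in the abstract.
    """
    kept = [o for o in outputs if o.validated and o.name in weights]
    total = sum(weights[o.name] for o in kept)
    if total == 0:
        raise ValueError("no validated tool outputs with non-zero weight")
    return sum(weights[o.name] * o.score for o in kept) / total


# Hypothetical tools and weights, for illustration only.
phonology = [
    ToolOutput("handshape", 0.8, True),
    ToolOutput("contact_location", 0.6, True),
    ToolOutput("non_manuals", 0.1, False),  # rejected, so it is ignored
]
weights = {"handshape": 0.5, "contact_location": 0.3, "non_manuals": 0.2}
score = dimension_score(phonology, weights)
```

Because the formula is a pure function of the validated outputs, the same inputs always give the same score; any run-to-run variation would have to come from the validation verdicts themselves.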

What carries the argument

The agentic framework that combines deterministic tool pipelines for four scoring dimensions with LLM-based cross-referential comparison modules to validate consistency and linguistic alignment.
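A rough sketch of how such a framework might be wired follows. The Figure 1 caption describes Phase 1 base tools writing evidence to a shared memory trace and Phase 2 comparison tools cross-referencing it; the function name `run_pipeline`, the stub tools, and the sample layout below are hypothetical, and the comparison step is a plain Python check standing in for an LLM module.

```python
from typing import Any, Callable, Dict

Sample = Dict[str, Any]
Trace = Dict[str, Any]


def run_pipeline(
    sample: Sample,
    base_tools: Dict[str, Callable[[Sample], Any]],
    comparison_tools: Dict[str, Callable[[Trace], bool]],
):
    """Run base tools in a fixed order, then cross-reference their evidence."""
    trace: Trace = {}
    # Phase 1: each base tool writes its evidence into the shared trace.
    for name in sorted(base_tools):
        trace[name] = base_tools[name](sample)
    # Phase 2: comparison tools read the trace, not the raw sample.
    verdicts = {
        name: comparison_tools[name](trace) for name in sorted(comparison_tools)
    }
    return trace, verdicts


# Stub tools for illustration; a real comparison module would call an LLM.
base = {
    "lexical": lambda s: s["glosses"],
    "motion": lambda s: len(s["segments"]),
}
compare = {
    # Hypothetical consistency check: one motion segment per recognised gloss.
    "lexical_vs_motion": lambda t: len(t["lexical"]) == t["motion"],
}
sample = {
    "glosses": ["HELLO", "NAME", "ME"],
    "segments": [(0, 12), (13, 30), (31, 44)],
}
trace, verdicts = run_pipeline(sample, base, compare)
```

Iterating tools in sorted order keeps the orchestration deterministic, which is the property the paper attributes to the pipeline as distinct from the LLM modules it calls.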

If this is right

Produces separate scores for grammar, phonology, fluency, and fidelity rather than a single overall number.
Incorporates cross-tool validation to reduce reliance on any single automated measure.
Demonstrates higher correlation with human judgments than existing metrics on the tested BSL data.
Supplies interpretable dimension-level feedback for sign language generation systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could support iterative training of sign language models by supplying detailed per-dimension error signals.
Adaptation to other sign languages would require equivalent linguistic tools and new human-rated validation sets.
Reliance on LLM modules for validation invites tests of score stability when different language models are substituted.
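One way the substitution test in the last point could be set up is sketched below: score the same videos with several LLM backends and summarise how much each video's score moves. The backend names and scores are invented, and population standard deviation is only one possible spread measure.

```python
from statistics import mean, pstdev


def backend_spread(scores_by_backend):
    """Return (mean, max) of the per-video standard deviation across backends.

    `scores_by_backend` maps a backend name to one score per video,
    with every backend scoring the same videos in the same order.
    """
    lengths = {len(v) for v in scores_by_backend.values()}
    if len(lengths) != 1 or lengths == {0}:
        raise ValueError("every backend needs the same non-zero number of scores")
    per_video = [pstdev(column) for column in zip(*scores_by_backend.values())]
    return mean(per_video), max(per_video)


# Invented scores for two hypothetical backends over two videos.
mean_sd, max_sd = backend_spread({"model_a": [0.0, 1.0], "model_b": [1.0, 1.0]})
```

A small mean with a large maximum would suggest that most videos are scored stably but a few depend heavily on which model performs the validation.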

Load-bearing premise

The assumption that deterministic tool outputs plus LLM cross-referential modules will yield scores that genuinely reflect linguistic quality without systematic bias from tool limits or the language models.

What would settle it

A new human-rated dataset in British Sign Language or another sign language where BackTranslation2.0 dimension scores show no stronger correlation with raters than the six baseline metrics.
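The comparison described here reduces to computing one correlation per metric against the same human ratings. The sketch below uses Pearson correlation as an assumption, since the abstract does not name the coefficient; for ordinal rating scales a rank correlation may be the better choice. All numbers are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def beats_all_baselines(human, candidate, baselines):
    """True if the candidate correlates more strongly than every baseline."""
    r_candidate = pearson(human, candidate)
    return all(r_candidate > pearson(human, b) for b in baselines.values())


# Invented ratings: five videos, one candidate metric, one baseline.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate = [0.1, 0.3, 0.5, 0.7, 0.9]
baselines = {"baseline_a": [0.5, 0.1, 0.9, 0.3, 0.7]}
r_candidate = pearson(human, candidate)
r_baseline = pearson(human, baselines["baseline_a"])
stronger = beats_all_baselines(human, candidate, baselines)
```

A real test would also need a significance check on the difference between correlations, because a higher coefficient on a small sample does not by itself show that one metric is better.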

Figures

Figures reproduced from arXiv: 2606.28673 by Anton Pelykh, Edward Fish, JianHe Low, Karahan Sahin, Maksym Ivashechkin, Oline Ranum, Oliver Cory, Ozge Mercanoglu Sincan, Richard Bowden.

**Figure 1.** Overview of the BackTranslation2.0 (BT2) pipeline. Given a source sentence s and generated sign output ŷ, multi-modal extraction produces a structured sample for segment-level and sequence-level analysis. Phase 1 base tools extract lexical, spatial, phonological, motion, and visual evidence, which is stored in a shared memory trace. Phase 2 comparison tools cross-reference this evidence against linguistic…

**Figure 2.** Anomaly-alignment summary for the primary comparison set. Panel A: higher values indicate stronger penalisation of videos with more flagged anomalies. Panel B: BT2 final score against total anomaly rate across the 14 active questions (right)

**Figure 3.** Qualitative evaluation of BT2 visual-base tools; overlays show predicted class and score, with the top row rewarding good production and the bottom row penalising poor production. Columns (left to right): Handshape, Non-manuals, Hand fidelity, Contact location. Handshape and Non-manuals compare best production against matched corruptions; Hand fidelity contrasts a reference clip with poor-handed synthesis…

Original abstract

Sign Languages (SLs) are the primary means of communication for millions of deaf individuals, yet existing evaluation metrics for generated SL remain simplistic and poorly aligned with human judgements. We introduce BackTranslation2.0, a linguistically grounded evaluation metric for text-to-sign translation that moves beyond na\"ive backtranslation. Our approach adopts an agentic framework in which a deterministic pipeline orchestrates a suite of specialised tools to assess four scoring dimensions - grammatical correctness, phonological accuracy, motion fluency, and generation fidelity - aligned with human rater assessments. Tool outputs are not treated independently: a set of large language model (LLM)-based cross-referential comparison modules evaluates consistency across tools and checks outputs against linguistic expectations, enabling structured reasoning over grammatical, phonological, and motion-level evidence. Final dimension scores are computed through deterministic weighted formulas over validated tool outputs. To validate BackTranslation2.0, we introduce and evaluate on a British Sign Language (BSL) dataset rated in a human rater study across the same quality dimensions, following a protocol developed in collaboration between linguists and deaf experts, benchmarking against six baseline metrics. Our method demonstrates strong correlation with human judgements across all dimensions, providing a more comprehensive, interpretable, and linguistically principled evaluation framework for sign language production systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

BackTranslation2.0 describes an agentic pipeline with tools and LLM modules for four SL dimensions but the abstract gives no numbers, no LLM validation, and no error analysis, leaving the correlation claim impossible to check.


The paper presents BackTranslation2.0 as a linguistically motivated metric that runs deterministic tools on grammatical correctness, phonological accuracy, motion fluency, and generation fidelity, then uses LLM cross-referential modules to reconcile outputs before applying weighted formulas. It reports testing on a new human-rated BSL dataset against six baselines and claims strong alignment with raters. From the abstract alone, that claim cannot be evaluated because no coefficients, dataset sizes, or statistical details appear.

The concrete combination of the four-dimension breakdown with an agentic tool pipeline plus LLM consistency checks is presented as an advance over plain backtranslation. That framing is new enough in the SL evaluation space to note.

The setup shows some care in aligning dimensions with a rater protocol developed alongside linguists and deaf experts. Benchmarking against multiple baselines on an independent dataset is also a positive step for making the metric more interpretable.

The soft spots are real and central. The abstract asserts strong human correlation without any quantitative evidence or error analysis, which matches the reader's low soundness score. The stress-test concern lands: the LLM modules are load-bearing for filtering and checking tool outputs against linguistic expectations, yet the text supplies no prompt examples, no agreement figures between LLM and expert linguists, and no ablation that removes the LLM stage. Without those, any observed correlations could trace to LLM priors rather than the underlying tools. The full paper may contain the missing numbers and checks, but nothing in the provided text resolves this.

This work targets researchers building sign language generation systems for accessibility. Readers working on multimodal evaluation or SL production could extract the dimension list and pipeline idea even if the validation remains thin.

I would send it to peer review. The topic has practical weight and the multi-tool framing is worth referee scrutiny, but any acceptance would require the quantitative results and LLM validation details to be added and stress-tested.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces BackTranslation2.0, a linguistically motivated metric for evaluating sign language production. It employs an agentic framework with a deterministic pipeline of specialized tools and LLM-based cross-referential modules to score four dimensions (grammatical correctness, phonological accuracy, motion fluency, generation fidelity). The metric is validated on a new BSL dataset with human ratings, claiming strong correlations with human judgements across dimensions and superiority over six baselines.

Significance. If the correlations are substantiated with quantitative evidence and the LLM modules are validated to ensure they do not introduce systematic bias, the work could provide a more comprehensive and interpretable evaluation framework for sign language production systems, filling a gap in aligning automatic metrics with human linguistic assessments.

major comments (2)

[Abstract] The abstract asserts 'strong correlation with human judgements across all dimensions' but supplies no quantitative results, error analysis, or dataset statistics. This makes it impossible to verify the central claim that the metric aligns with human ratings after proper controls.
[Description of LLM-based cross-referential modules] The paper states that LLM-based modules 'check outputs against linguistic expectations' but provides no prompt text, no inter-annotator agreement between LLM and expert linguists, no ablation removing the LLM stage, and no held-out validation set. This is load-bearing because the final dimension scores depend on these modules, and without validation the observed correlations could reflect LLM priors rather than linguistic quality.
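The agreement figure requested in the second comment could be reported as a chance-corrected statistic such as Cohen's weighted kappa. The sketch below is a minimal pure-Python version with linear disagreement weights, assuming ratings are integer levels from 0 to `num_levels - 1`; the example labels are invented and do not come from the paper.

```python
def linear_weighted_kappa(rater_a, rater_b, num_levels):
    """Cohen's weighted kappa with linear disagreement weights |i - j|."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and equally long")
    if any(not 0 <= r < num_levels for r in list(rater_a) + list(rater_b)):
        raise ValueError("ratings must lie in 0..num_levels-1")
    n = len(rater_a)
    levels = range(num_levels)
    # Contingency table: observed[i][j] counts items rated i by A and j by B.
    observed = [[0] * num_levels for _ in levels]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(observed[i]) for i in levels]
    col_totals = [sum(observed[i][j] for i in levels) for j in levels]
    # Weighted disagreement actually observed versus expected by chance.
    disagree_obs = sum(abs(i - j) * observed[i][j] for i in levels for j in levels)
    disagree_exp = sum(
        abs(i - j) * row_totals[i] * col_totals[j] / n
        for i in levels
        for j in levels
    )
    if disagree_exp == 0:
        raise ValueError("kappa is undefined when both raters use a single level")
    return 1.0 - disagree_obs / disagree_exp


# Invented labels on a 3-level scale: expert linguist versus LLM module.
expert = [0, 1, 2, 2, 1, 0]
llm = [0, 1, 2, 1, 1, 0]
kappa = linear_weighted_kappa(expert, llm, num_levels=3)
```

A value near 1 indicates agreement well above chance, a value near 0 indicates agreement no better than chance, and linear weights penalise a two-level disagreement twice as much as a one-level disagreement.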

minor comments (1)

[Abstract] The string 'na\"ive' is an unrendered LaTeX escape and should be rendered as 'naïve'.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for these focused comments on the abstract and the LLM modules. Both points identify areas where additional detail will improve verifiability; we address each below and will incorporate the requested information in revision.


Referee: [Abstract] The abstract asserts 'strong correlation with human judgements across all dimensions' but supplies no quantitative results, error analysis, or dataset statistics. This makes it impossible to verify the central claim that the metric aligns with human ratings after proper controls.

Authors: We agree the abstract should contain the quantitative evidence that supports its claims. In the revised version we will insert the Pearson correlation coefficients and associated p-values for each of the four dimensions, the number of BSL videos and raters in the human study, and a concise statement of the rating protocol developed with linguists and deaf experts. These additions will allow readers to assess the strength of the reported alignment immediately.

revision: yes

Referee: [Description of LLM-based cross-referential modules] The paper states that LLM-based modules 'check outputs against linguistic expectations' but provides no prompt text, no inter-annotator agreement between LLM and expert linguists, no ablation removing the LLM stage, and no held-out validation set. This is load-bearing because the final dimension scores depend on these modules, and without validation the observed correlations could reflect LLM priors rather than linguistic quality.

Authors: We accept that the current description of the LLM cross-referential modules is insufficient to demonstrate they are not introducing systematic bias. We will add the exact prompt templates, report inter-annotator agreement between LLM outputs and expert linguists on a sampled subset of outputs, include an ablation that removes the LLM stage and recomputes dimension scores, and specify the held-out validation set used to tune the modules. These elements will be placed in the methods section so that readers can evaluate whether the final scores reflect linguistic quality.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; metric defined independently and validated on external human data

full rationale

The paper defines BackTranslation2.0 via a deterministic pipeline of tools plus LLM cross-referential modules whose outputs are combined by fixed weighted formulas. It then introduces a separate human-rated BSL dataset (developed with linguists and deaf experts) solely for validation and reports correlations against six baselines. No equation, definition, or self-citation reduces the metric itself to a fit or renaming of the human judgments; the correlation is an external benchmark rather than a definitional identity. This matches the default expectation of a self-contained derivation against independent data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review supplies insufficient detail to enumerate free parameters or invented entities; the central claim rests on the unelaborated premise that the described tools and LLM modules align with human linguistic judgment.

axioms (1)

Domain assumption: the four scoring dimensions (grammatical correctness, phonological accuracy, motion fluency, generation fidelity) align with human rater assessments.
Stated directly in the abstract as the basis for the metric design and human study protocol.



2023
[68]

arXiv preprint arXiv:2601.19577 (01 2026)

Zuo, R., Potamias, R., Sun, Q., Ververas, E., Deng, J., Zafeiriou, S.: MaDiS: Taming Masked Diffusion Language Models for Sign Language Generation. arXiv preprint arXiv:2601.19577 (01 2026)

work page arXiv 2026
[69]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)

Zuo, R., Potamias, R.A., Ververas, E., Deng, J., Zafeiriou, S.: Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 23806– 23816 (2025)

2025
[70]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Zuo, R., Wei, F., Mak, B.: Natural Language-Assisted Sign Language Recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 14890–14900 (2023)

2023

[1] Adamo-Villani, N., Wilbur, R.B.: ASL-Pro: American sign language animation with prosodic elements. In: Universal Access in Human-Computer Interaction. Access to Interaction. pp. 307–318. Springer International Publishing (2015)

[2] Al-khazraji, S., Dingman, B., Lee, S., Huenerfauth, M.: At a different pace: Evaluating whether users prefer timing parameters in American sign language animations to differ from human signers’ timing. Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility (2021)

[3] Baltatzis, V., Potamias, R.A., Ververas, E., Sun, G., Deng, J., Zafeiriou, S.: Neural Sign Actors: A Diffusion Model for 3D Sign Language Production from Text. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1985–1995 (2024)

[4] Bowden, R., Saunders, B., Wheatley, M., Crowley, C., Hirshman, M., Birtles, D.: Taxonomy and Definitions for Terms Related to Automatic Translation of Spoken Language into Sign Language - Version 1.2 (2025)

[5] Brentari, D.: A prosodic model of sign language phonology. MIT Press (1998)

[6] Camgöz, N.C., Hadfield, S., Koller, O., Ney, H., Bowden, R.: Neural Sign Language Translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 7784–7793 (2018)

[7] Camgöz, N.C., Koller, O., Hadfield, S., Bowden, R.: Sign Language Transformers: Joint End-to-End Sign Language Recognition and Translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 10023–10033 (2020)

[8] Cohen, J.: Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. Psychological Bulletin 70(4), 213 (1968)

[9] Cory, O., Sincan, O.M., Bowden, R.: SignAgent: Agentic LLMs for linguistically-grounded sign language annotation and dataset curation. arXiv preprint arXiv:2603.19059 (2026)

[10] Cory, O., Sincan, O.M., Vowels, M., Battisti, A., Holzknecht, F., Tissi, K., Sidler-Miserez, S., Haug, T., Ebling, S., Bowden, R.: Modelling the Distribution of Human Motion for Sign Language Assessment. In: Proceedings of the European Conference on Computer Vision Workshops (ECCVW). pp. 1–19 (2025)

[11] Crasborn, O., van der Kooij, E., Waters, D., Woll, B., Mesch, J., Bergman, B.: Frequency Distribution and Spreading Behavior of Different Types of Mouth Actions in Three Sign Languages. Sign Language & Linguistics 11(1), 45–67 (2008)

[12] Gong, J., Foo, L.G., He, Y., Rahmani, H., Liu, J.: LLMs are Good Sign Language Translators. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 18362–18372 (2024)

[13] Gwet, K.L.: Computing inter-rater reliability and its variance in the presence of high agreement. British Journal of Mathematical and Statistical Psychology 61(1), 29–48 (2008)

[14] He, L.J., Walsh, H., Sincan, O.M., Bowden, R.: Hands-on: Segmenting individual signs from continuous sequences. In: 2025 IEEE 19th International Conference on Automatic Face and Gesture Recognition (FG). pp. 1–5 (2025)

[15] Huenerfauth, M., Marcus, M., Palmer, M.: Generating American Sign Language classifier predicates for English-to-ASL machine translation. Ph.D. thesis, University of Pennsylvania (2006)

[16] Ivashechkin, M., Mendez, O., Bowden, R.: SignSplat: Rendering Sign Language via Gaussian Splatting. arXiv preprint arXiv:2505.02108 (2025)

[17] Jang, Y., Raajesh, H., Momeni, L., Varol, G., Zisserman, A.: Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 8742–8752 (2025)

[18] Jiang, S., Sun, B., Wang, L., Bai, Y., Li, K., Fu, Y.: Skeleton aware multi-modal sign language recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. pp. 3413–3423 (2021)

[19] Jiang, T., Lu, P., Zhang, L., Ma, N., Han, R., Lyu, C., Li, Y., Chen, K.: RTMPose: Real-time multi-person pose estimation based on MMPose. arXiv preprint arXiv:2303.07399 (2023)

[20] Jiang, Z., Leong, C., Moryossef, A., Cory, O., Ivashechkin, M., Tarigopula, N., Zhang, B., Göhring, A., Rios, A., Sennrich, R., Ebling, S.: Meaningful Pose-Based Sign Language Evaluation. In: Proceedings of the Tenth Conference on Machine Translation. pp. 64–80. Proceedings of the Association for Computational Linguistics (ACL) (Nov 2025)

[21] Jiang, Z., Sant, G., Moryossef, A., Müller, M., Sennrich, R., Ebling, S.: SignCLIP: Connecting Text and Sign Language by Contrastive Learning. In: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. pp. 9171–9193 (Nov 2024)

[22] Kerbl, B., Kopanas, G., Leimkühler, T., Drettakis, G.: 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Transactions on Graphics 42(4) (July 2023)

[23] Kipp, M., Nguyen, Q., Heloir, A., Matthes, S.: Assessing the deaf user perspective on sign language avatars. In: The proceedings of the 13th international ACM SIGACCESS conference on Computers and accessibility. pp. 107–114 (2011)

[24] Krippendorff, K.: Computing Krippendorff’s alpha-reliability. Communication Methods and Measures 5(1), 1–12 (2011)

[25] Kusters, A., De Meulder, M., O’Brien, D.: Innovations in deaf studies: Critically mapping the field. In: Innovations in deaf studies: The role of deaf scholars, vol. 12, pp. 1–53. Oxford University Press, Oxford (2017)

[26] Li, H., Dong, Q., Chen, J., Su, H., Zhou, Y., Ai, Q., Ye, Z., Liu, Y.: LLMs-as-Judges: A comprehensive survey on LLM-based evaluation methods. arXiv preprint arXiv:2412.05579 (2024)

[27] Li, T., Bolkart, T., Black, M.J., Li, H., Romero, J.: Learning a model of facial shape and expression from 4D scans. ACM TOG 36(6), 194:1–194:17 (2017)

[28] Li, Z., Zhou, W., Zhao, W., Wu, K., Hu, H., Li, H.: Uni-Sign: Toward Unified Sign Language Understanding at Scale. In: Proceedings of the International Conference on Learning Representations (ICLR) (2025)

[29] Lin, C.Y.: ROUGE: A package for automatic evaluation of summaries. In: Text Summarization Branches Out. pp. 74–81. Association for Computational Linguistics (ACL) (Jul 2004)

[30] Liu, Y., Zhu, L., Lin, L., Zhu, Y., Zhang, A., Li, Y.: Teaser: Token enhanced spatial modeling for expressions reconstruction. In: ICLR (2025)

[31] Low, J., Symeonidis-Herzig, A., Ivashechkin, M., Sincan, O.M., Bowden, R.: SignSparK: Efficient multilingual sign language production via sparse keyframe learning. arXiv preprint arXiv:2603.10446 (2026)

[32] Napier, J., Leeson, L.: Sign language in action. In: Sign Language in Action, pp. 50–84. Palgrave Macmillan UK, London (2016)

[33] Ong, S.C., Ranganath, S.: Automatic Sign Language Analysis: A Survey and the Future Beyond Lexical Meaning. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(6), 873–891 (2005)

[34] Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a Method for Automatic Evaluation of Machine Translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL). pp. 311–318 (Jul 2002)

[35] Pavlakos, G., Choutas, V., Ghorbani, N., Bolkart, T., Osman, A.A., Tzionas, D., Black, M.J.: Expressive body capture: 3D hands, face, and body from a single image. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 10975–10985 (2019)

[36] Pfau, R., Quer, J.: Nonmanuals: Their grammatical and prosodic roles, pp. 381–. Cambridge University Press (2010)

[38] Popović, M.: chrF: character n-gram F-score for automatic MT evaluation. In: Proceedings of the Tenth Workshop on Statistical Machine Translation. pp. 392–395. Association for Computational Linguistics (Sep 2015)

[39] Potamias, R.A., Zhang, J., Deng, J., Zafeiriou, S.: WiLoR: End-to-end 3D hand localization and reconstruction in-the-wild. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 12242–12254 (2025)

[40] Qi, F., Duan, Y., Zhang, H., Xu, C.: SignGen: End-to-End Sign Language Video Generation with Latent Diffusion. In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 252–270 (2024)

[41] Ranum, O., Hadfield, S., Bowden, R.: What’s the point? Spatial grammar & index resolution for sign language recognition. arXiv preprint arXiv:2606.08056 (2026)

[42] Romero, J., Tzionas, D., Black, M.J.: Embodied hands: Modeling and capturing hands and bodies together. ACM TOG 36(6), 245:1–245:17 (2017)

[43] Sandler, W., Lillo-Martin, D.C.: Sign Language and Linguistic Universals. Cambridge University Press (2006)

[44] Sárándi, I., Pons-Moll, G.: Neural localizer fields for continuous 3D human pose and shape estimation. Advances in Neural Information Processing Systems 37, 140032–140065 (2024)

[45] Saunders, B., Camgöz, N.C., Bowden, R.: Adversarial Training for Multi-Channel Sign Language Production. In: Proceedings of the British Machine Vision Conference (BMVC) (2020)

[46] Saunders, B., Camgöz, N.C., Bowden, R.: Progressive Transformers for End-to-End Sign Language Production. In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 687–705 (2020)

[47] Saunders, B., Camgöz, N.C., Bowden, R.: Signing at Scale: Learning to Co-Articulate Signs for Large-Scale Photo-Realistic Sign Language Production. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 5141–5151 (2022)

[48] Schembri, A., Fenlon, J., Rentelis, R., Reynolds, S., Cormier, K.: Building the British Sign Language corpus. Language Documentation and Conservation 7, 136–154 (2013)

[49] Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Hambro, E., Zettlemoyer, L., Cancedda, N., Scialom, T.: Toolformer: Language Models Can Teach Themselves to Use Tools. In: NeurIPS. pp. 68539–68551 (2023)

[50] Sellam, T., Das, D., Parikh, A.P.: BLEURT: Learning Robust Metrics for Text Generation. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). pp. 7881–7892 (2020)

[51] Shen, X., Wang, X., Shen, L., Zhang, K., Yu, X.: Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2025)

[52] Shrout, P.E., Fleiss, J.L.: Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin 86(2), 420–428 (1979)

[53] Signapse: SignStream API: Real-time British Sign Language translation. https://www.signapse.ai/signstream-api (2026), accessed: 2026-03-05

[54] Starner, T., Weaver, J., Pentland, A.: Real-Time American Sign Language Recognition Using Desk and Wearable Computer Based Video. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 20(12), 1371–1375 (1998)

[55] Stoll, S., Camgöz, N.C., Hadfield, S., Bowden, R.: Text2Sign: Towards Sign Language Production Using Neural Machine Translation and Generative Adversarial Networks. International Journal of Computer Vision 128, 891–908 (2020)

[56] Tang, S., He, J., Guo, D., Wei, Y., Li, F., Hong, R.: Sign-IDD: Iconicity disentangled diffusion for sign language production. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 39, pp. 7266–7274 (2025)

[57] Vogler, C., Metaxas, D.: Parallel Hidden Markov Models for American Sign Language Recognition. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 116–122 (1999)

[58] Walsh, H., Saunders, B., Bowden, R.: Sign stitching: A novel approach to sign language production. arXiv preprint arXiv:2405.07663 (2024)

[59] Wan, T., Wang, A., Ai, B., Wen, B., Mao, C., Xie, C.W., Chen, D., Yu, F., Zhao, H., Yang, J., et al.: Wan: Open and Advanced Large-Scale Video Generative Models. arXiv preprint arXiv:2503.20314 (2025)

[60] Wong, R., Camgoz, N.C., Bowden, R.: SignRep: Enhancing self-supervised sign representations. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 22804–22814 (2025)

[61] Yang, L., Kang, B., Huang, Z., Zhao, Z., Xu, X., Feng, J., Zhao, H.: Depth Anything V2. In: Proceedings of the Advances in Neural Information Processing Systems (NeurIPS) (2024)

[62] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., Cao, Y.: ReAct: Synergizing Reasoning and Acting in Language Models. In: Proceedings of the International Conference on Learning Representations (ICLR) (2023)

[63] Yin, A., Li, H., Shen, K., Tang, S., Zhuang, Y.: T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL). pp. 3345–3356 (2024)

[64] Yin, A., Zhong, T., Tang, L., Jin, W., Jin, T., Zhao, Z.: Gloss Attention for Gloss-Free Sign Language Translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2551–2562 (2023)

[65] Zhang, D., Liu, Y., Lin, L., Zhu, Y., Li, Y., Qin, M., Li, Y., Wang, H.: GUAVA: Generalizable Upper Body 3D Gaussian Avatar. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 14205–14217 (October 2025)

[66] Zhang, H., Shalev-Arkushin, R., Baltatzis, V., Gillis, C., Laput, G., Kushalnagar, R., Quandt, L.C., Findlater, L., Bedri, A., Lea, C.: Towards AI-driven Sign Language Generation with Non-manual Markers. In: Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery (2025)

[67] Zheng, L., Chiang, W.L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E., et al.: Judging LLM-as-a-judge with MT-Bench and Chatbot Arena. Advances in Neural Information Processing Systems 36, 46595–46623 (2023)

[68] Zuo, R., Potamias, R., Sun, Q., Ververas, E., Deng, J., Zafeiriou, S.: MaDiS: Taming Masked Diffusion Language Models for Sign Language Generation. arXiv preprint arXiv:2601.19577 (01 2026)

[69] Zuo, R., Potamias, R.A., Ververas, E., Deng, J., Zafeiriou, S.: Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 23806–23816 (2025)

[70] Zuo, R., Wei, F., Mak, B.: Natural Language-Assisted Sign Language Recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 14890–14900 (2023)