pith. machine review for the scientific record.

arxiv: 2605.05616 · v1 · submitted 2026-05-07 · 💻 cs.CV · cs.LG

Recognition: unknown

RAM-H1200: A Unified Evaluation and Dataset on Hand Radiographs for Rheumatoid Arthritis

Pith reviewed 2026-05-08 14:55 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords rheumatoid arthritis · hand radiographs · bone erosion · instance segmentation · SvdH scoring · medical imaging dataset · benchmark · RA assessment

The pith

RAM-H1200 is presented as the first public dataset to combine, on 1,200 hand radiographs, annotations for bone structure, bone erosion masks, and clinical SvdH scoring.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces RAM-H1200 to fill gaps in public resources that previously lacked full-hand coverage, fine-grained pathology labels, and integration with clinical scoring for rheumatoid arthritis assessment. It supplies 1,200 radiographs from six centers together with whole-hand bone instance segmentation, pixel-level bone erosion masks, defined joint regions, and joint-level scores for both bone erosion and joint space narrowing. This design lets models address anatomical structure, localized erosive changes, and standardized severity metrics in one setting. A sympathetic reader would care because it opens the door to quantitative lesion analysis that goes beyond coarse grading and supports more complete evaluation of RA from everyday hand X-rays.

Core claim

RAM-H1200 is the first public large-scale benchmark that jointly supports whole-hand bone structure instance segmentation, pixel-level BE delineation, and clinically grounded joint-level SvdH scoring for both BE and JSN. The proposed BE masks enable quantitative analysis of lesion extent and morphology with explicit spatial supervision. Benchmark results across the supported tasks show that anatomical modeling achieves strong performance while quantitative bone erosion segmentation remains a major open challenge.

What carries the argument

The RAM-H1200 dataset and its four coordinated annotation layers: whole-hand bone instance segmentation, pixel-level bone erosion masks, SvdH-defined joint regions of interest, and joint-level SvdH scores for bone erosion and joint space narrowing.

If this is right

  • Models can now be trained and tested on the combined tasks of anatomical structure capture, localized lesion mapping, and clinical severity scoring within one dataset.
  • Quantitative measurement of bone erosion extent and shape becomes possible through explicit pixel-level supervision rather than categorical grades alone.
  • A single benchmark framework now exists for comprehensive RA analysis that links structure modeling directly to standardized clinical metrics.
  • Progress on mature bone segmentation tasks can be tracked separately from the still-difficult quantitative erosion delineation task.
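
As a rough illustration of what pixel-level supervision makes measurable, the sketch below derives three simple quantities from a binary erosion mask: area, the fraction of a bone covered by erosion, and a crude boundary length. The function names, the `pixel_spacing_mm` parameter, and the mask layout are hypothetical; the paper's actual mask format and any lesion metrics it reports are not described in the text available here.

```python
import numpy as np


def lesion_area_mm2(be_mask: np.ndarray, pixel_spacing_mm: float) -> float:
    """Area covered by the erosion mask, in square millimetres.

    Assumes square pixels with a known physical spacing (an assumption;
    the dataset's metadata may express this differently).
    """
    return float(be_mask.astype(bool).sum()) * pixel_spacing_mm ** 2


def eroded_fraction(be_mask: np.ndarray, bone_mask: np.ndarray) -> float:
    """Share of one bone's pixels that are also labelled as erosion."""
    bone = bone_mask.astype(bool)
    if not bone.any():
        return 0.0
    overlap = be_mask.astype(bool) & bone
    return float(overlap.sum()) / float(bone.sum())


def boundary_length_px(be_mask: np.ndarray) -> int:
    """Count exposed pixel edges (4-connectivity) as a crude perimeter."""
    # Pad with background so pixels on the image border count as exposed.
    m = np.pad(be_mask.astype(bool), 1, constant_values=False)
    edges = 0
    for axis in (0, 1):
        for shift in (1, -1):
            neighbour = np.roll(m, shift, axis=axis)
            # A foreground pixel whose neighbour is background exposes one edge.
            edges += int((m & ~neighbour).sum())
    return edges
```

Quantities like these vary continuously with lesion size and shape, which is the kind of signal a categorical grade alone cannot carry.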

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If erosion segmentation models advance on this data, clinicians could obtain finer-grained tracking of disease progression between follow-up visits.
  • The dataset structure could be replicated for other joints or imaging modalities to build similar unified benchmarks for RA or related conditions.
  • The observed performance gap between anatomical and erosion tasks points to a concrete priority for algorithm development focused on fine-grained lesion morphology.
  • Wider adoption might encourage development of automated pipelines that output both visual segmentations and numeric SvdH scores in routine practice.

Load-bearing premise

The multi-level annotations collected across six centers are accurate, consistent, and representative enough to serve as reliable ground truth for training and evaluating models on the unified tasks.

What would settle it

A direct measurement showing substantial disagreement among the six-center annotations or poor generalization of models trained on the dataset to new clinical sites would indicate the ground truth is not reliable.
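
One way such a generalization check could be set up is a leave-one-center-out split, sketched below. The record layout (image id paired with a center label) and the function name are assumptions for illustration; the paper's actual data splits are not stated in the text available here.

```python
from collections import defaultdict
from typing import Iterable, Iterator, List, Tuple


def leave_one_center_out(
    records: Iterable[Tuple[str, str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_center, train_ids, test_ids), one fold per center.

    `records` is an iterable of (image_id, center) pairs. Both field names
    are hypothetical placeholders for whatever the dataset release uses.
    """
    by_center = defaultdict(list)
    for image_id, center in records:
        by_center[center].append(image_id)

    centers = sorted(by_center)
    for held_out in centers:
        test_ids = list(by_center[held_out])
        # Train on every image from the remaining centers.
        train_ids = [i for c in centers if c != held_out for i in by_center[c]]
        yield held_out, train_ids, test_ids
```

With six centers this yields six folds; a large drop in test performance on the held-out center, relative to a pooled random split, would point toward the site-dependence the premise rules out.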

Figures

Figures reproduced from arXiv: 2605.05616 by Haolin Wang, Hongruixuan Chen, Jian Song, Junmu Peng, Lin Fan, Masatoshi Okutomi, Masayuki Ikebe, Shinya Takamaeda-Yamazaki, Songxiao Yang, Tamotsu Kamishima, Yafei Ou, Yao Fu.

Figure 1: RAM-H1200 contains 1,200 high-resolution hand radiographs collected from six medical ...
Figure 2: Distribution and Statistics for the RAM-H1200 dataset. (A) Data distribution by center.
Figure 3: Hand bone structure segmentation visualization results.
Figure 4: Hand BE segmentation visualization results.
Figure 5: Confusion matrices for SvdH BE scoring across models.
Figure 6: Confusion matrices for SvdH JSN scoring across models.
Figure 7: Overview of the data collection and processing pipeline for RAM-H1200. A total of 1376 ...
Figure 8: Overview of the tasks supported in RAM-H1200. (A) Original hand radiograph (CR).
Figure 9: Hand bone structure segmentation task on radiographs. (A) Input hand CR image. (B) ...
Figure 10: Illustration of the hand BE segmentation task on hand radiographs. (A) Input hand CR ...
Figure 11: SvdH-based BE scoring task on hand radiographs. (A) Input CR image. (B) Predicted ...
Figure 12: Illustration of the SvdH JSN scoring task. (A) Input hand radiograph. (B) Predicted JSN ...
Figure 13: Hand bone structure segmentation results (A).
Figure 14: Hand bone structure segmentation results (B).
Figure 15: SvdH-BE-90 segmentation results (A).
Figure 16: SvdH-BE-90 segmentation results (B).
  [Source text captured with this figure, from F.1.3 Correlation Analysis Between Bone Segmentation-Derived Overlap Size and Ground-Truth Total SvdH JSN Score: The clinical relevance of overlap size was evaluated using Spearman’s rank correlation analysis with the total JSN score. As summarized in ...]
Figure 17: Multi-class BE segmentation results (A).
Figure 18: Multi-class BE segmentation results (B).
Figure 19: Joint-wise confusion matrices of SvdH BE scoring (A)
Figure 20: Joint-wise confusion matrices of SvdH BE scoring (B)
Figure 21: Joint-wise confusion matrices of SvdH BE scoring (C)
Figure 22: Joint-wise confusion matrices of SvdH JSN scoring (A).
Figure 23: Joint-wise confusion matrices of SvdH JSN scoring (B).
Figure 24: Joint-wise confusion matrices of SvdH JSN scoring (C).
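
The source text captured alongside Figure 16 mentions a Spearman rank correlation between segmentation-derived overlap size and the total JSN score. The sketch below shows how that statistic is commonly computed: rank both variables (ties share the average rank) and take the Pearson correlation of the ranks. The function names are hypothetical, and nothing here reproduces the paper's numbers.

```python
import numpy as np


def average_ranks(values) -> np.ndarray:
    """Return 1-based ranks, giving tied values their average rank."""
    v = np.asarray(values, dtype=float)
    order = np.argsort(v, kind="mergesort")
    sorted_v = v[order]
    ranks = np.empty(len(v), dtype=float)
    i = 0
    while i < len(v):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(v) and sorted_v[j + 1] == sorted_v[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_rho(x, y) -> float:
    """Spearman's rho as the Pearson correlation of the two rank vectors.

    Returns nan if either input is constant, because the rank variance
    is then zero.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    return float(np.corrcoef(rx, ry)[0, 1])
```

Because the statistic depends only on ranks, it suits ordinal targets such as summed SvdH grades, where equal score differences need not reflect equal differences in damage.
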
Original abstract

Rheumatoid arthritis (RA) assessment from hand radiographs requires multi-level analysis and modeling of anatomical structures and fine-grained local pathological changes. However, existing public resources do not support such unified multi-level analysis, often lacking full-hand coverage, fine-grained annotations, and consistent integration with clinical scoring systems. In particular, annotations that enable quantitative analysis of bone erosion (BE) remain scarce. RAM-H1200 contains 1,200 hand radiographs collected from six medical centers, with multi-level annotations including (i) whole-hand bone structure instance segmentation, (ii) pixel-level BE masks, (iii) SvdH-defined joint regions of interest, and (iv) joint-level SvdH scores for both BE and joint space narrowing (JSN). It is designed to evaluate whether models can jointly capture anatomical structure, localized erosive pathology, and clinically standardized RA severity from hand radiographs. The proposed BE masks enable, for the first time, quantitative BE analysis beyond coarse categorical grading by providing explicit spatial supervision for lesion extent and morphology. To our knowledge, RAM-H1200 is the first public large-scale benchmark that jointly supports whole-hand bone structure instance segmentation, pixel-level BE delineation, and clinically grounded joint-level SvdH scoring for both BE and JSN. Results across benchmark tasks show that anatomical modeling is substantially more mature than quantitative BE analysis: whole-hand bone segmentation achieves strong performance, whereas BE segmentation remains a major open challenge. By unifying anatomical structure modeling, quantitative lesion analysis, and clinically grounded SvdH scoring, RAM-H1200 provides a single benchmark for comprehensive RA analysis on hand radiographs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces RAM-H1200, a dataset of 1,200 hand radiographs collected from six medical centers. It supplies multi-level annotations consisting of whole-hand bone structure instance segmentation, pixel-level bone erosion (BE) masks, SvdH-defined joint regions of interest, and joint-level SvdH scores for both BE and joint space narrowing (JSN). The work positions the resource as the first public large-scale benchmark jointly supporting anatomical instance segmentation, quantitative BE delineation, and clinically standardized RA severity scoring, and reports benchmark observations that anatomical modeling is substantially more mature than quantitative BE analysis.

Significance. If the annotations are shown to be reliable, the dataset would provide a meaningful advance by unifying anatomical structure modeling, pixel-level lesion analysis, and standardized clinical scoring within a single multi-center resource. This addresses documented gaps in prior public hand radiograph collections that lack full-hand coverage or fine-grained BE supervision. The explicit release of pixel-level BE masks enables quantitative rather than purely categorical erosion assessment, which is a concrete strength for future model development.

major comments (2)
  1. [Abstract and dataset construction description] The central claim that RAM-H1200 constitutes a reliable unified benchmark rests on the multi-level annotations serving as accurate, consistent ground truth. The abstract and dataset description supply no information on annotation protocols, number of annotators per image, inter-rater agreement (e.g., Dice coefficients for BE masks or weighted kappa for SvdH scores), or adjudication procedures. Without these metrics, the reported performance gap between whole-hand bone segmentation and BE segmentation cannot be confidently attributed to task difficulty rather than label noise.
  2. [Benchmark results summary] The abstract asserts that 'whole-hand bone segmentation achieves strong performance' while 'BE segmentation remains a major open challenge,' yet the provided text contains no quantitative metrics, data splits, model specifications, or table/figure references supporting these observations. This omission is load-bearing for the claim that anatomical modeling is substantially more mature than quantitative BE analysis.
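
For reference, the two agreement statistics named in major comment 1 can be computed as in the following sketch. It is a minimal NumPy version under stated assumptions: masks are binary arrays of equal shape, scores are integer grades starting at 0, and the function names and the `n_classes` argument are hypothetical rather than taken from the paper.

```python
import numpy as np


def dice(mask_a, mask_b) -> float:
    """Dice overlap between two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = int(a.sum()) + int(b.sum())
    if denom == 0:
        # Both annotators marked nothing; treated here as full agreement
        # (a convention, not something the paper specifies).
        return 1.0
    return 2.0 * float((a & b).sum()) / float(denom)


def quadratic_weighted_kappa(rater_1, rater_2, n_classes: int) -> float:
    """Cohen's kappa with quadratic weights for ordinal grades 0..n_classes-1."""
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    r1 = np.asarray(rater_1, dtype=int)
    r2 = np.asarray(rater_2, dtype=int)

    # Observed joint distribution of the two raters' grades.
    observed = np.zeros((n_classes, n_classes), dtype=float)
    for i, j in zip(r1, r2):
        observed[i, j] += 1.0

    # Disagreement cost grows with the squared distance between grades.
    idx = np.arange(n_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / float((n_classes - 1) ** 2)

    # Counts expected if the raters were independent with the same marginals.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

    denom = float((weights * expected).sum())
    if denom == 0.0:
        return 1.0  # every score falls in one grade, so no disagreement is possible
    return 1.0 - float((weights * observed).sum()) / denom
```

Reporting Dice per erosion mask pair and weighted kappa per joint type would let readers compare model error against the annotator-to-annotator spread.
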
minor comments (2)
  1. [Abstract] The abstract would be strengthened by a brief reference to the specific tables or figures that report the benchmark numbers, allowing readers to locate the supporting evidence immediately.
  2. [Conclusion or data availability] Ensure the data availability statement explicitly confirms public release of the full annotations and images, including any licensing terms.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for greater transparency on annotation reliability and benchmark evidence. We address each major point below and will revise the manuscript to incorporate the requested details.

Point-by-point responses
  1. Referee: [Abstract and dataset construction description] The central claim that RAM-H1200 constitutes a reliable unified benchmark rests on the multi-level annotations serving as accurate, consistent ground truth. The abstract and dataset description supply no information on annotation protocols, number of annotators per image, inter-rater agreement (e.g., Dice coefficients for BE masks or weighted kappa for SvdH scores), or adjudication procedures. Without these metrics, the reported performance gap between whole-hand bone segmentation and BE segmentation cannot be confidently attributed to task difficulty rather than label noise.

    Authors: We agree that explicit details on annotation quality are necessary to support claims of ground-truth reliability. In the revised manuscript we will expand the Dataset Construction section with a new subsection that fully describes the annotation protocols, the number of annotators per image, the inter-rater agreement metrics (Dice coefficients for BE masks and weighted kappa for SvdH scores), and the adjudication procedures employed. These additions will allow readers to assess the contribution of label noise to the observed performance differences. revision: yes

  2. Referee: [Benchmark results summary] The abstract asserts that 'whole-hand bone segmentation achieves strong performance' while 'BE segmentation remains a major open challenge,' yet the provided text contains no quantitative metrics, data splits, model specifications, or table/figure references supporting these observations. This omission is load-bearing for the claim that anatomical modeling is substantially more mature than quantitative BE analysis.

    Authors: The abstract is a concise summary; the full manuscript body contains the quantitative metrics, data splits, model specifications, and explicit references to tables and figures in the Benchmark Experiments section. We will revise the abstract to include brief quantitative highlights of the key results and strengthen cross-references to the relevant tables, figures, and sections so that the supporting evidence is immediately visible. revision: yes

Circularity Check

0 steps flagged

No circularity: dataset release with no derivations or self-referential modeling

full rationale

The paper introduces RAM-H1200 as a new multi-center dataset and benchmark for hand radiograph analysis in RA. It contains no equations, fitted parameters, predictions, or derivation chains. Claims rest solely on data collection, multi-level annotations, and task definitions (instance segmentation, BE masks, SvdH scoring). No self-citations are load-bearing for any result, and no ansatz, uniqueness theorem, or renaming of prior results occurs. The central assertion that it is the 'first public large-scale benchmark' is a factual claim about resource availability, not a modeled output that reduces to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the curation of a new multi-center dataset and the validity of standard clinical SvdH scoring; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: The SvdH scoring system is a valid and consistent clinical standard for assessing bone erosion and joint space narrowing in RA.
    Invoked to define joint ROIs and provide joint-level scores.

pith-pipeline@v0.9.0 · 5647 in / 1319 out tokens · 87487 ms · 2026-05-08T14:55:39.223344+00:00 · methodology

Reference graph

Works this paper leans on

79 extracted references · 8 canonical work pages · 1 internal anchor

  1. [1]

    Diagnosis and management of rheumatoid arthritis: a review.Jama, 320(13):1360–1372, 2018

    Daniel Aletaha and Josef S Smolen. Diagnosis and management of rheumatoid arthritis: a review.Jama, 320(13):1360–1372, 2018

  2. [2]

    Ai automated radiographic scoring in rheumatoid arthritis: Shedding light on barriers to implementation through comprehensive evaluation

    Alix Bird, Lauren Oakden-Rayner, Katrina Chakradeo, Ranjeny Thomas, Drishti Gupta, Suyash Jain, Rohan Jacob, Shonket Ray, Mihir D Wechalekar, Susanna Proudman, et al. Ai automated radiographic scoring in rheumatoid arthritis: Shedding light on barriers to implementation through comprehensive evaluation. InSeminars in Arthritis and Rheumatism, volume 74, p...

  3. [3]

    Deep learning models to automate the scoring of hand radiographs for rheumatoid arthritis

    Zhiyan Bo, Laura C Coates, and Bartłomiej W Papie˙z. Deep learning models to automate the scoring of hand radiographs for rheumatoid arthritis. InAnnual Conference on Medical Image Understanding and Analysis, pages 398–413. Springer, 2024

  4. [4]

    Radiographic scoring methods as outcome measures in rheumatoid arthritis: properties and advantages.Annals of the rheumatic diseases, 60(9):817–827, 2001

    S Boini and F Guillemin. Radiographic scoring methods as outcome measures in rheumatoid arthritis: properties and advantages.Annals of the rheumatic diseases, 60(9):817–827, 2001

  5. [5]

    Emerging mri methods in rheumatoid arthritis.Nature Reviews Rheumatology, 7(2):85–95, 2011

    Camilo G Borrero, James M Mountz, and John D Mountz. Emerging mri methods in rheumatoid arthritis.Nature Reviews Rheumatology, 7(2):85–95, 2011

  6. [6]

    Karin Bruynesteyn, Désirée van der Heijde, Maarten Boers, Ariane Saudan, Paul Peloso, Harold Paulus, Harry Houben, Bridget Griffiths, John Edmonds, Barry Bresnihan, et al. Determination of the minimal clinically important difference in rheumatoid arthritis joint damage of the sharp/van der heijde and larsen/scott scoring methods by clinical experts and co...

  7. [7]

    TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

    Jieneng Chen, Yongyi Lu, Qihang Yu, Xiangde Luo, Ehsan Adeli, Yan Wang, Le Lu, Alan L Yuille, and Yuyin Zhou. Transunet: Transformers make strong encoders for medical image segmentation.arXiv preprint arXiv:2102.04306, 2021

  8. [8]

    Detection of bone erosions in rheumatoid arthritis wrist joints with magnetic resonance imaging, computed tomography and radiography.Arthritis research & therapy, 10(1):R25, 2008

    Uffe Møller Døhn, Bo J Ejbjerg, Maria Hasselquist, Eva Narvestad, Jakob Møller, Henrik S Thomsen, and Mikkel Østergaard. Detection of bone erosions in rheumatoid arthritis wrist joints with magnetic resonance imaging, computed tomography and radiography.Arthritis research & therapy, 10(1):R25, 2008

  9. [9]

    Hand bone extraction and segmentation based on a convolutional neural network.Biomedical Signal Processing and Control, 89:105788, 2024

    Hongbo Du, Hai Wang, Chunlai Yang, Luyando Kabalata, Henian Li, and Changfu Qiang. Hand bone extraction and segmentation based on a convolutional neural network.Biomedical Signal Processing and Control, 89:105788, 2024

  10. [10]

    Bo Ejbjerg, Eva Narvestad, Egill Rostrup, Marcin Szkudlarek, Søren Jacobsen, Henrik S Thomsen, and Mikkel Østergaard. Magnetic resonance imaging of wrist and finger joints in healthy subjects occasionally shows changes resembling erosions and synovitis as seen in rheumatoid arthritis.Arthritis & Rheumatism: Official Journal of the American College of Rheu...

  11. [11]

    Progress in imaging in rheumatology

    Emilio Filippucci, Luca Di Geso, and Walter Grassi. Progress in imaging in rheumatology. Nature Reviews Rheumatology, 10(10):628–634, 2014

  12. [12]

    Motoshi Fujimori, Tamotsu Kamishima, Masaru Kato, Yumika Seno, Kenneth Sutherland, Hiroyuki Sugimori, Mutsumi Nishida, and Tatsuya Atsumi. Composite assessment of power doppler ultrasonography and mri in rheumatoid arthritis: a pilot study of predictive value in radiographic progression after one year.The British Journal of Radiology, 91(1086):20170748, 2018

  13. [13]

    Bone age assessment of children using a digital hand atlas.Computerized medical imaging and graphics, 31(4-5):322–331, 2007

    Arkadiusz Gertych, Aifeng Zhang, James Sayre, Sylwia Pospiech-Kurkowska, and HK Huang. Bone age assessment of children using a digital hand atlas.Computerized medical imaging and graphics, 31(4-5):322–331, 2007. 10

  14. [14]

    Levit: a vision transformer in convnet’s clothing for faster inference

    Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, and Matthijs Douze. Levit: a vision transformer in convnet’s clothing for faster inference. InProceedings of the IEEE/CVF international conference on computer vision, pages 12259– 12269, 2021

  15. [15]

    Bridging the gap: combining treat-to- target and difficult-to-treat strategies in the management of rheumatoid arthritis.Nature Reviews Rheumatology, pages 1–9, 2026

    Lilla Gunkl-Tóth, Iain B McInnes, and György Nagy. Bridging the gap: combining treat-to- target and difficult-to-treat strategies in the management of rheumatoid arthritis.Nature Reviews Rheumatology, pages 1–9, 2026

  16. [16]

    The rsna pediatric bone age machine learning challenge.Radiology, 290(2):498– 503, 2019

    Safwan S Halabi, Luciano M Prevedello, Jayashree Kalpathy-Cramer, Artem B Mamonov, Alexander Bilbily, Mark Cicero, Ian Pan, Lucas Araújo Pereira, Rafael Teixeira Sousa, Nitamar Abdala, et al. The rsna pediatric bone age machine learning challenge.Radiology, 290(2):498– 503, 2019

  17. [17]

    Mambavision: A hybrid mamba-transformer vision backbone

    Ali Hatamizadeh and Jan Kautz. Mambavision: A hybrid mamba-transformer vision backbone. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 25261– 25270, 2025

  18. [18]

    Swin unetr: Swin transformers for semantic segmentation of brain tumors in mri images

    Ali Hatamizadeh, Vishwesh Nath, Yucheng Tang, Dong Yang, Holger R Roth, and Daguang Xu. Swin unetr: Swin transformers for semantic segmentation of brain tumors in mri images. In International MICCAI brainlesion workshop, pages 272–284. Springer, 2021

  19. [19]

    Deep residual learning for image recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016

  20. [20]

    Evaluation method of rheumatoid arthritis by the x-ray photograph using deep learning

    Yuri Hioki, Koji Makino, Kensuke Koyama, Hirotaka Haro, and Hidetsugu Terada. Evaluation method of rheumatoid arthritis by the x-ray photograph using deep learning. In2021 IEEE 3rd Global Conference on Life Sciences and Technologies (LifeTech), pages 444–447. IEEE, 2021

  21. [21]

    Toru Hirano, Masayuki Nishide, Naoki Nonaka, Jun Seita, Kosuke Ebina, Kazuhiro Sakurada, and Atsushi Kumanogoh. Development and validation of a deep-learning model for scoring of radiographic finger joint destruction in rheumatoid arthritis.Rheumatology advances in practice, 3(2):rkz047, 2019

  22. [22]

    Densely connected convolutional networks

    Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 4700–4708, 2017

  23. [23]

    nnu-net: a self-configuring method for deep learning-based biomedical image segmentation.Nature methods, 18(2):203–211, 2021

    Fabian Isensee, Paul F Jaeger, Simon AA Kohl, Jens Petersen, and Klaus H Maier-Hein. nnu-net: a self-configuring method for deep learning-based biomedical image segmentation.Nature methods, 18(2):203–211, 2021

  24. [24]

    Predictors of radiographic joint damage in patients with early rheumatoid arthritis

    LMA Jansen, IE Van der Horst-Bruinsma, D Van Schaardenburg, PD Bezemer, and BAC Dijkmans. Predictors of radiographic joint damage in patients with early rheumatoid arthritis. Annals of the rheumatic diseases, 60(10):924–927, 2001

  25. [25]

    Automatic segmentation for favourable delineation of ten wrist bones on wrist radiographs using convolutional neural network.Journal of Personalized Medicine, 12(5):776, 2022

    Bo-kyeong Kang, Yelin Han, Jaehoon Oh, Jongwoo Lim, Jongbin Ryu, Myeong Seong Yoon, Juncheol Lee, and Soorack Ryu. Automatic segmentation for favourable delineation of ten wrist bones on wrist radiographs using convolutional neural network.Journal of Personalized Medicine, 12(5):776, 2022

  26. [26]

    Segmentation of radiographs of hands with joint damage using customized active appearance models

    JA Kauffman, Cornelis H Slump, and HJ Bernelot Moens. Segmentation of radiographs of hands with joint damage using customized active appearance models. In15th Annual Workshop on Circuits, Systems and Signal Processing, ProRisc 2004, 2004

  27. [27]

    An introduction to machine learning and analysis of its use in rheumatic diseases.Nature Reviews Rheumatology, 17(12):710–730, 2021

    Kathryn M Kingsmore, Christopher E Puglisi, Amrie C Grammer, and Peter E Lipsky. An introduction to machine learning and analysis of its use in rheumatic diseases.Nature Reviews Rheumatology, 17(12):710–730, 2021

  28. [28]

    Segment anything

    Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C Berg, Wan-Yen Lo, et al. Segment anything. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4015–4026, 2023. 11

  29. [29]

    Mechanisms of joint destruction in rheumatoid arthritis—immune cell–fibroblast–bone interactions.Nature Reviews Rheumatology, 18(7):415– 429, 2022

    Noriko Komatsu and Hiroshi Takayanagi. Mechanisms of joint destruction in rheumatoid arthritis—immune cell–fibroblast–bone interactions.Nature Reviews Rheumatology, 18(7):415– 429, 2022

  30. [30]

    Model-based erosion spotting and visualization in rheumatoid arthritis.Academic radiology, 14(10):1179–1188, 2007

    Georg Langs, Philipp Peloschek, Horst Bischof, and Franz Kainberger. Model-based erosion spotting and visualization in rheumatoid arthritis.Academic radiology, 14(10):1179–1188, 2007

  31. [31]

    Automatic quantification of joint space narrowing and erosions in rheumatoid arthritis.IEEE transactions on medical imaging, 28(1):151–164, 2008

    Georg Langs, Philipp Peloschek, Horst Bischof, and Franz Kainberger. Automatic quantification of joint space narrowing and erosions in rheumatoid arthritis.IEEE transactions on medical imaging, 28(1):151–164, 2008

  32. [32]

    Osteoporosis prediction from hand and wrist x-rays using image segmentation and self-supervised learning

    Hyungeun Lee, Ung Hwang, Seungwon Yu, Chang-Hun Lee, and Kijung Yoon. Osteoporosis prediction from hand and wrist x-rays using image segmentation and self-supervised learning. arXiv preprint arXiv:2311.06834, 2023

  33. [33]

    Efficientformer: Vision transformers at mobilenet speed.Advances in Neural Information Processing Systems, 35:12934–12949, 2022

    Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, and Jian Ren. Efficientformer: Vision transformers at mobilenet speed.Advances in Neural Information Processing Systems, 35:12934–12949, 2022

  34. [34]

    Chung-Yueh Lien, Hao-Jan Wang, Cheng-Kai Lu, Tzu-Hsuan Hsu, Woei-Chyn Chu, and Chien- Chih Lai. Deep learning with an attention mechanism for enhancing automated modified total sharp/van der heijde scoring of hand x-ray images in rheumatoid arthritis.Journal of Medical and Biological Engineering, pages 1–9, 2025

  35. [35]

    Precision medicine: the precision gap in rheumatic disease.Nature Reviews Rheumatology, 18(12):725–733, 2022

    Chung MA Lin, Faye AH Cooles, and John D Isaacs. Precision medicine: the precision gap in rheumatic disease.Nature Reviews Rheumatology, 18(12):725–733, 2022

  36. [36]

    Focal loss for dense object detection

    Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. InProceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017

  37. [37]

    Swin-umamba: Mamba-based unet with imagenet-based pretraining

    Jiarun Liu, Hao Yang, Hong-Yu Zhou, Yan Xi, Lequan Yu, Cheng Li, Yong Liang, Guangming Shi, Yizhou Yu, Shaoting Zhang, et al. Swin-umamba: Mamba-based unet with imagenet-based pretraining. InInternational Conference on Medical Image Computing and Computer-Assisted Intervention, pages 615–625. Springer, 2024

  38. [38]

    Segment anything in medical images.Nature Communications, 15:654, 2024

    Jun Ma, Yuting He, Feifei Li, Lin Han, Chenyu You, and Bo Wang. Segment anything in medical images.Nature Communications, 15:654, 2024

  39. [39]

    U-mamba: Enhancing long-range dependency for biomedical image segmentation

    Jun Ma, Feifei Li, and Bo Wang. U-mamba: Enhancing long-range dependency for biomedical image segmentation.arXiv preprint arXiv:2401.04722, 2024

  40. [40]

    Deep learning for rheumatoid arthritis: Joint detection and damage scoring in x-rays.arXiv preprint arXiv:2104.13915, 2021

    Krzysztof Maziarz, Anna Krason, and Zbigniew Wojna. Deep learning for rheumatoid arthritis: Joint detection and damage scoring in x-rays.arXiv preprint arXiv:2104.13915, 2021

  41. [41]

    Mobilevit: Light- weight, general-purpose, and mobile-friendly vision trans- former

    Sachin Mehta and Mohammad Rastegari. Mobilevit: light-weight, general-purpose, and mobile- friendly vision transformer.arXiv preprint arXiv:2110.02178, 2021

  42. [42]

    Imaging in inflammatory arthritis: progress towards precision medicine.Nature Reviews Rheumatology, 19(10):650–665, 2023

    Ioanna Minopoulou, Arnd Kleyer, Melek Yalcin-Mutlu, Filippo Fagni, Stefan Kemenes, Chris- tian Schmidkonz, Armin Atzinger, Milena Pachowsky, Klaus Engel, Lukas Folle, et al. Imaging in inflammatory arthritis: progress towards precision medicine.Nature Reviews Rheumatology, 19(10):650–665, 2023

  43. [43]

    Deep learning-based automatic-bone-destruction-evaluation system using contextual information from other joints

    Kazuki Miyama, Ryoma Bise, Satoshi Ikemura, Kazuhiro Kai, Masaya Kanahori, Shinkichi Arisumi, Taisuke Uchida, Yasuharu Nakashima, and Seiichi Uchida. Deep learning-based automatic-bone-destruction-evaluation system using contextual information from other joints. Arthritis Research & Therapy, 24(1):227, 2022

  44. [44]

    Seiichi Murakami, Kazuhiro Hatano, JooKooi Tan, Hyoungseop Kim, and Takatoshi Aoki. Automatic identification of bone erosions in rheumatoid arthritis from hand radiographs based on deep convolutional neural network.Multimedia tools and applications, 77(9):10921–10937, 2018. 12

  45. [45]

    A pediatric wrist trauma x-ray dataset (grazpedwri-dx) for machine learning.Scientific data, 9(1):222, 2022

    Eszter Nagy, Michael Janisch, Franko Hrži ´c, Erich Sorantin, and Sebastian Tschauner. A pediatric wrist trauma x-ray dataset (grazpedwri-dx) for machine learning.Scientific data, 9(1):222, 2022

  46. [46]

    A sub-pixel accurate quantifi- cation of joint space narrowing progression in rheumatoid arthritis.IEEE Journal of Biomedical and Health Informatics, 27(1):53–64, 2022

    Yafei Ou, Prasoon Ambalathankandy, Ryunosuke Furuya, Seiya Kawada, Tianyu Zeng, Yujie An, Tamotsu Kamishima, Kenichi Tamura, and Masayuki Ikebe. A sub-pixel accurate quantifi- cation of joint space narrowing progression in rheumatoid arthritis.IEEE Journal of Biomedical and Health Informatics, 27(1):53–64, 2022

  47. [47]

    Automatic radiographic quantification of joint space narrowing progression in rheumatoid arthritis using poc

    Yafei Ou, Prasoon Ambalathankandy, Takeshi Shimada, Tamotsu Kamishima, and Masayuki Ikebe. Automatic radiographic quantification of joint space narrowing progression in rheumatoid arthritis using poc. In 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), pages 1183–1187. IEEE, 2019

  48. [48]

    Computer-aided diagnosis and monitoring of rheumatoid arthritis in conventional radiography: Advancements and future opportunities

    Yafei Ou, Wahyu Rahmaniar, Dichao Liu, Hiroko Oshibe, Ze Jin, Tamotsu Kamishima, and Kenji Suzuki. Computer-aided diagnosis and monitoring of rheumatoid arthritis in conventional radiography: Advancements and future opportunities. In Artificial Intelligence in Diagnostics and Imaging Technologies in Healthcare: In honour of Professor Dr. George A. Tsihrint...

  49. [49]

    Machine learning in rheumatology approaches the clinic. Nature Reviews Rheumatology, 16(2):69–70, 2020

    Aridaman Pandit and Timothy RDJ Radstake. Machine learning in rheumatology approaches the clinic. Nature Reviews Rheumatology, 16(2):69–70, 2020

  50. [50]

    Raj Ponnusamy, Ming Zhang, Zhiheng Chang, Yue Wang, Carmine Guida, Samantha Kuang, Xinyue Sun, Jordan Blackadar, Jeffrey B Driban, Timothy McAlindon, et al. Automatic measuring of finger joint space width on hand radiograph using deep learning and conventional computer vision methods. Biomedical signal processing and control, 84:104713, 2023

  51. [51]

    Mura: Large dataset for abnormality detection in musculoskeletal radiographs. arXiv preprint arXiv:1712.06957, 2017

    Pranav Rajpurkar, Jeremy Irvin, Aarti Bagul, Daisy Ding, Tony Duan, Hershel Mehta, Brandon Yang, Kaylie Zhu, Dillon Laird, Robyn L Ball, et al. Mura: Large dataset for abnormality detection in musculoskeletal radiographs. arXiv preprint arXiv:1712.06957, 2017

  52. [52]

    Bone erosion scoring for rheumatoid arthritis with deep convolutional neural networks. Computers & Electrical Engineering, 78:472–481, 2019

    Janick Rohrbach, Tobias Reinhard, Beate Sick, and Oliver Dürr. Bone erosion scoring for rheumatoid arthritis with deep convolutional neural networks. Computers & Electrical Engineering, 78:472–481, 2019

  53. [53]

    U-net: Convolutional networks for biomedical image segmentation

    Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In Medical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, pages 234–241. Springer, 2015

  54. [54]

    Bone erosion in rheumatoid arthritis: mechanisms, diagnosis and treatment. Nature Reviews Rheumatology, 8(11):656–664, 2012

    Georg Schett and Ellen Gravallese. Bone erosion in rheumatoid arthritis: mechanisms, diagnosis and treatment. Nature Reviews Rheumatology, 8(11):656–664, 2012

  55. [55]

    Rheumatoid arthritis in review: Clinical, anatomical, cellular and molecular points of view. Clinical Anatomy, 31(2):216–223, 2018

    Kassem Sharif, Alaa Sharif, Fareed Jumah, Rod Oskouian, and R Shane Tubbs. Rheumatoid arthritis in review: Clinical, anatomical, cellular and molecular points of view. Clinical Anatomy, 31(2):216–223, 2018

  56. [56]

    Variability of precision in scoring radiographic abnormalities in rheumatoid arthritis by experienced readers. The Journal of rheumatology, 31(6):1062–1072, 2004

    John T Sharp, Frederick Wolfe, Marissa Lassere, Maarten Boers, Désirée Van Der Heijde, Arvi Larsen, Harold Paulus, Rolf Rau, and Vibeke Strand. Variability of precision in scoring radiographic abnormalities in rheumatoid arthritis by experienced readers. The Journal of rheumatology, 31(6):1062–1072, 2004

  57. [57]

    John T Sharp, Donald Y Young, Gilbert B Bluhm, Andrew Brook, Anne C Brower, Mary Corbett, John L Decker, Harry K Genant, J Philip Gofton, Neal Goodman, et al. How many joints in the hands and wrists should be included in a score of radiologic abnormalities used to assess rheumatoid arthritis? Arthritis & Rheumatism: Official Journal of the American College...

  58. [58]

    Rheumatoid arthritis therapy reappraisal: strategies, opportunities and challenges. Nature Reviews Rheumatology, 11(5):276–289, 2015

    Josef S Smolen and Daniel Aletaha. Rheumatoid arthritis therapy reappraisal: strategies, opportunities and challenges. Nature Reviews Rheumatology, 11(5):276–289, 2015

  59. [59]

    Rheumatoid arthritis. Nature Reviews Disease Primers, 4(1):18001, 2018

    Josef S. Smolen, Daniel Aletaha, Anne Barton, Gerd R. Burmester, Paul Emery, Gary S. Firestein, Arthur Kavanaugh, Iain B. McInnes, Daniel H. Solomon, Vibeke Strand, and Kazuhiko Yamamoto. Rheumatoid arthritis. Nature Reviews Disease Primers, 4(1):18001, 2018

  60. [60]

    Deep learning in rheumatological image interpretation. Nature Reviews Rheumatology, 20(3):182–195, 2024

    Berend C Stoel, Marius Staring, Monique Reijnierse, and Annette HM van der Helm-van Mil. Deep learning in rheumatological image interpretation. Nature Reviews Rheumatology, 20(3):182–195, 2024

  61. [61]

    Rheumavit: Transformer-based model for automated scoring of hand joints in rheumatoid arthritis

    Alexander Stolpovsky, Elizaveta Dakhova, Polina Druzhinina, Polina Postnikova, Daniil Kudinsky, Alexander Smirnov, Anastasia Sukhinina, Alexander Lila, and Anvar Kurmukov. Rheumavit: Transformer-based model for automated scoring of hand joints in rheumatoid arthritis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2522–2531, 2023

  62. [62]

    A crowdsourcing approach to develop machine learning models to quantify radiographic joint damage in rheumatoid arthritis. JAMA network open, 5(8):e2227423–e2227423, 2022

    Dongmei Sun, Thanh M Nguyen, Robert J Allaway, Jelai Wang, Verena Chung, Thomas V Yu, Michael Mason, Isaac Dimitrovsky, Lars Ericson, Hongyang Li, et al. A crowdsourcing approach to develop machine learning models to quantify radiographic joint damage in rheumatoid arthritis. JAMA network open, 5(8):e2227423–e2227423, 2022

  63. [63]

    Efficientnetv2: Smaller models and faster training

    Mingxing Tan and Quoc Le. Efficientnetv2: Smaller models and faster training. In International conference on machine learning, pages 10096–10106. PMLR, 2021

  64. [64]

    Hands-on experience with active appearance models

    Hans Henrik Thodberg. Hands-on experience with active appearance models. In Medical Imaging 2002: Image Processing, volume 4684, pages 495–506. SPIE, 2002

  65. [65]

    Plain x-rays in rheumatoid arthritis: overview of scoring methods, their reliability and applicability. Bailliere’s clinical rheumatology, 10(3):435–453, 1996

    Desiree MFM Van der Heijde. Plain x-rays in rheumatoid arthritis: overview of scoring methods, their reliability and applicability. Bailliere’s clinical rheumatology, 10(3):435–453, 1996

  66. [66]

    How to read radiographs according to the sharp/van der heijde method

    DMFM Van der Heijde. How to read radiographs according to the sharp/van der heijde method. The Journal of rheumatology, 27(1):261–263, 2000

  67. [67]

    Richard J Wakefield, Wayne W Gibbon, Philip G Conaghan, Philip O’Connor, Dennis McGonagle, Colin Pease, Michael J Green, Douglas J Veale, John D Isaacs, and Paul Emery. The value of sonography in the detection of bone erosions in patients with rheumatoid arthritis: a comparison with conventional radiography. Arthritis & Rheumatism, 43(12):2762–2770, 2000

  68. [68]

    Deep learning-based computer-aided diagnosis of rheumatoid arthritis with hand x-ray images conforming to modified total sharp/van der heijde score. Biomedicines, 10(6):1355, 2022

    Hao-Jan Wang, Chi-Ping Su, Chien-Chih Lai, Wun-Rong Chen, Chi Chen, Liang-Ying Ho, Woei-Chyn Chu, and Chung-Yueh Lien. Deep learning-based computer-aided diagnosis of rheumatoid arthritis with hand x-ray images conforming to modified total sharp/van der heijde score. Biomedicines, 10(6):1355, 2022

  69. [69]

    Bls-gan: A deep layer separation framework for eliminating bone overlap in conventional radiographs

    Haolin Wang, Yafei Ou, Prasoon Ambalathankandy, Gen Ota, Pengyu Dai, Masayuki Ikebe, Kenji Suzuki, and Tamotsu Kamishima. Bls-gan: A deep layer separation framework for eliminating bone overlap in conventional radiographs. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 7674–7681, 2025

  70. [70]

    Layer separation: Towards adjustable joint space width images synthesis

    Haolin Wang, Yafei Ou, Prasoon Ambalathankandy, Gen Ota, Pengyu Dai, Masayuki Ikebe, Kenji Suzuki, and Tamotsu Kamishima. Layer separation: Towards adjustable joint space width images synthesis. In Proceedings of the 33rd ACM International Conference on Multimedia, pages 8273–8282, 2025

  71. [71]

    A deep registration method for accurate quantification of joint space narrowing progression in rheumatoid arthritis

    Haolin Wang, Yafei Ou, Wanxuan Fang, Prasoon Ambalathankandy, Naoto Goto, Gen Ota, Taichi Okino, Jun Fukae, Kenneth Sutherland, Masayuki Ikebe, et al. A deep registration method for accurate quantification of joint space narrowing progression in rheumatoid arthritis. Computerized Medical Imaging and Graphics, 108:102273, 2023

  72. [72]

    Convnext v2: Co-designing and scaling convnets with masked autoencoders

    Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, and Saining Xie. Convnext v2: Co-designing and scaling convnets with masked autoencoders. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16133–16142, 2023

  73. [73]

    Thasia G Woodworth, Olga Morgacheva, Olga L Pimienta, Orrin M Troum, Veena K Ranganath, and Daniel E Furst. Examining the validity of the rheumatoid arthritis magnetic resonance imaging score according to the omeract filter—a systematic literature review. Rheumatology, 56(7):1177–1188, 2017

  74. [74]

    Segformer: Simple and efficient design for semantic segmentation with transformers. Advances in neural information processing systems, 34:12077–12090, 2021

    Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. Advances in neural information processing systems, 34:12077–12090, 2021

  75. [75]

    Deep learning approach for automatic segmentation of ulna and radius in dual-energy x-ray imaging. Insights into Imaging, 12:1–9, 2021

    Fan Yang, Xin Weng, Yuehong Miao, Yuhui Wu, Hong Xie, and Pinggui Lei. Deep learning approach for automatic segmentation of ulna and radius in dual-energy x-ray imaging. Insights into Imaging, 12:1–9, 2021

  76. [76]

    Ram-w600: A multi-task wrist dataset and benchmark for rheumatoid arthritis

    Songxiao Yang, Haolin Wang, Yao Fu, Ye Tian, Tamotsu Kamishima, Masayuki Ikebe, Yafei Ou, and Masatoshi Okutomi. Ram-w600: A multi-task wrist dataset and benchmark for rheumatoid arthritis. In Advances in Neural Information Processing Systems 38 (NeurIPS 2025), 2025

  77. [77]

    Ap-dpm: A dual-path merging network via adversarial anatomical prior guidance for wrist bone segmentation

    Songxiao Yang, Haolin Wang, Masayuki Ikebe, Tamotsu Kamishima, Yafei Ou, and Masatoshi Okutomi. Ap-dpm: A dual-path merging network via adversarial anatomical prior guidance for wrist bone segmentation. In 2025 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pages 4335–4340. IEEE, 2025

  78. [78]

    Medmamba: Vision mamba for medical image classification

    Yubiao Yue and Zhenzhang Li. Medmamba: Vision mamba for medical image classification. arXiv preprint arXiv:2403.03849, 2024

  79. [79]

    Unet++: A nested u-net architecture for medical image segmentation

    Zongwei Zhou, Md Mahfuzur Rahman Siddiquee, Nima Tajbakhsh, and Jianming Liang. Unet++: A nested u-net architecture for medical image segmentation. In Deep learning in medical image analysis and multimodal learning for clinical decision support: 4th international workshop, DLMIA 2018, and 8th international workshop, ML-CDS 2018, held in conjunction with MI...