
arxiv: 2607.00057 · v1 · pith:QMAHI4IWnew · submitted 2026-06-30 · 💻 cs.CV · cs.AI

Enhancing Oracle Bone Inscription Recognition via Multi-Scale Layer Attention

Pith reviewed 2026-07-02 20:07 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords Oracle Bone Inscriptions · Multi-Scale Layer Attention · Attention Mechanisms · Image Recognition · Deep Learning · Ancient Scripts · Computer Vision · Feature Interactions

The pith

Multi-Scale Layer Attention improves Oracle Bone Inscription recognition by modeling fine-grained details across scales and layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that standard layer attention falls short on the irregular and degraded forms of ancient Chinese oracle bone inscriptions, but explicitly adding multi-scale feature modeling overcomes this gap. It introduces MSLA to combine multi-scale and cross-layer interactions so that representations capture subtle variations more effectively. Experiments on large OBIs datasets show consistent gains in accuracy while keeping computation low. A sympathetic reader would care because better automated recognition could reduce reliance on slow expert manual analysis for studying early Chinese culture. The central mechanism is the enrichment of features through simultaneous scale and layer attention.

Core claim

The authors propose Multi-Scale Layer Attention (MSLA), a novel paradigm that explicitly models both multi-scale and cross-layer feature interactions. By enriching the representation with fine-grained details across multiple spatial scales, MSLA enables more accurate and robust OBIs recognition than existing layer attention techniques, which only yield marginal gains.

What carries the argument

Multi-Scale Layer Attention (MSLA), a module that jointly models multi-scale spatial features and cross-layer interactions to enrich fine-grained representations for irregular image patterns.
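
As a rough illustration of what jointly attending over scales and layers could look like, the sketch below pools each layer's feature map into one global token plus several local patch tokens, then lets the newest layer attend over all of them. It is not the authors' implementation. The function names, the average pooling, the parameter-free dot-product scoring, and the assumption that every layer shares the same channel count are illustrative choices, since only an abstract-level description of MSLA is available here.

```python
import math

def avg_pool_tokens(fmap, patch):
    """Average-pool a C x H x W feature map (nested lists) into one
    C-dimensional token per patch x patch window."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            area = len(rows) * len(cols)
            tokens.append([
                sum(fmap[c][i][j] for i in rows for j in cols) / area
                for c in range(channels)
            ])
    return tokens

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multi_scale_layer_attention(layer_fmaps, patch_sizes):
    """Hypothetical sketch: the last layer queries tokens pooled from
    every layer at every scale. No learned projections are used."""
    current = layer_fmaps[-1]
    full = max(len(current[0]), len(current[0][0]))
    query = avg_pool_tokens(current, full)[0]  # global descriptor of the last layer
    tokens = []
    for fmap in layer_fmaps:          # cross-layer axis
        for patch in patch_sizes:     # multi-scale axis
            tokens.extend(avg_pool_tokens(fmap, patch))
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, tok)) / scale for tok in tokens]
    weights = softmax(scores)
    context = [sum(w * tok[c] for w, tok in zip(weights, tokens))
               for c in range(len(query))]
    return context, weights

# Two toy layers, each 2 channels x 4 x 4; the values are arbitrary.
layer1 = [[[float(c + i + j) for j in range(4)] for i in range(4)] for c in range(2)]
layer2 = [[[float(c * i - j) for j in range(4)] for i in range(4)] for c in range(2)]
context, weights = multi_scale_layer_attention([layer1, layer2], patch_sizes=[4, 2])
print(len(weights))  # 2 layers x (1 global + 4 local tokens) = 10
```

The point of the sketch is only that the attention weights range over (layer, scale, position) triples rather than over layers alone; a real module would add learned query, key, and value projections and feed the context back into the network.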

If this is right

  • MSLA achieves higher recognition accuracy than prior attention mechanisms on OBIs data.
  • The method preserves computational efficiency during inference and training.
  • It handles complex irregular and degraded shapes more robustly by capturing subtle variations.
  • Multi-scale enrichment leads to more reliable automated analysis of historical inscriptions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same multi-scale cross-layer approach might apply to other degraded script or artifact recognition tasks beyond OBIs.
  • Integration with additional domain priors such as stroke order could further reduce error rates on edge cases.
  • The efficiency claim suggests MSLA could scale to real-time processing pipelines for cultural heritage digitization.
  • Testing on non-Chinese ancient scripts with similar irregularity would clarify how general the multi-scale benefit is.

Load-bearing premise

Existing layer attention methods show only marginal gains on OBIs, so adding explicit multi-scale modeling will produce substantially better results.

What would settle it

Re-running the same large-scale OBIs experiments and finding either no accuracy improvement over standard layer attention or a loss of efficiency would falsify the central claim.
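
One way to make "no accuracy improvement" testable is a paired comparison on the same test images. The sketch below is an exact McNemar test over per-sample correctness flags. The function name and the flag lists are hypothetical, and the toy flags are synthetic placeholders, not results from the paper.

```python
from math import comb

def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired per-sample correctness flags.

    Returns (samples only A gets right, samples only B gets right, p-value).
    """
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    n = only_a + only_b
    if n == 0:
        return only_a, only_b, 1.0  # the two models never disagree
    k = min(only_a, only_b)
    # Binomial tail under the null that disagreements split 50/50.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2 * tail)

# Synthetic placeholder flags (1 = correct) for a baseline and a candidate model.
baseline = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
candidate = [1, 1, 1, 1, 1, 0, 1, 1, 1, 1]
print(mcnemar_exact(baseline, candidate))  # (0, 3, 0.25)
```

With only three disagreements the test cannot separate the two models, which is why the comparison needs the full test set rather than a handful of samples.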

Figures

Figures reproduced from arXiv: 2607.00057 by Chaowen Yan, Jianlong Xiong, Kaishen Wang, Tao He, Yong Wang.

Figure 1: (a) Samples of real-world OBIs. (b) Deciphering OBI characters into modern Chinese characters via traditional recognition method.
Figure 2: Deciphering oracle bone script inscriptions via Convolutional Neural Networks (CNNs).
… local cross-channel interaction mechanism without dimensionality reduction, achieving improved performance with minimal computational overhead. Spatial attention [63, 79, 75, 5] focuses on identifying salient spatial regions within feature maps, enabling the model to capture long-range dependencies and contextual informa…
Figure 3: Architecture-level comparison among vanilla, layer aggregation, and layer attention. Skip connections are omitted for clarity.
3.2. Preliminary. We formulate existing layer interaction methods into two patterns: layer aggregation and layer attention. Layer Aggregation: Layer aggregation enhances inter-layer interaction through various novel modules applied along the network depth, which can be abstracted int…
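
The two patterns named in this excerpt can be contrasted on a toy example. The sketch below is a generic illustration, not the paper's formulation, which is truncated in the excerpt. Here `layer_aggregation` uses a fixed mixing weight as a stand-in for a learned aggregation module, and `layer_attention` uses plain scaled dot-product scores without learned projections; both names and both simplifications are assumptions.

```python
import math

def layer_aggregation(features, mix=0.5):
    """Recurrent pattern: a running state is updated once per layer.
    The fixed `mix` weight is a placeholder for a learned module."""
    state = list(features[0])
    for feat in features[1:]:
        state = [mix * s + (1.0 - mix) * f for s, f in zip(state, feat)]
    return state

def layer_attention(features):
    """Attention pattern: the last layer queries every layer, itself included."""
    query = features[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, feat)) / scale for feat in features]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    mixed = [sum(w * feat[c] for w, feat in zip(weights, features))
             for c in range(len(query))]
    return mixed, weights

# One pooled descriptor per layer; the values are arbitrary.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aggregated = layer_aggregation(features)
attended, weights = layer_attention(features)
print(aggregated)  # [0.75, 0.75]
```

In the aggregation pattern an early layer's influence decays with depth through the repeated update, while in the attention pattern every layer is weighted directly by its similarity to the query.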
Figure 4: Training and validation loss curves on the Oracle-MNIST dataset for various layer attention methods with ResNet-20 as the backbone.
… 400 epochs. We evaluated the performance using ResNet-50 and ResNet-101 backbones, and compared them against multiple other models. All experiments were conducted on a single NVIDIA RTX 4090 GPU. Experimental Results. As shown in …
Figure 5: Ablation studies: (a) Evaluation on different patch size 𝑃 and (b) Evaluation on local tokens.

Original abstract

Oracle Bone Inscriptions (OBIs) recognition plays a crucial role in understanding ancient Chinese culture. However, accurately recognizing OBIs remains highly challenging due to their complex, irregular, and often degraded shapes. Traditional methods rely on expert knowledge and manual analysis, which are time-consuming and error-prone. Although deep learning has greatly advanced general image recognition, existing methods struggle to capture the fine-grained details and subtle variations inherent in OBIs, resulting in limited performance. Even most recent and effective layer attention techniques are designed to capture fine-grained dependencies through enhanced inter-layer interactions, yet they still exhibit only marginal improvements in OBIs recognition. To address these limitations, we propose Multi-Scale Layer Attention (MSLA), a novel paradigm that explicitly models both multi-scale and cross-layer feature interactions. By enriching the representation with fine-grained details across multiple spatial scales, MSLA enables more accurate and robust OBIs recognition. Extensive experiments on large-scale OBIs datasets demonstrate that MSLA consistently outperforms existing attention mechanisms while maintaining computational efficiency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes Multi-Scale Layer Attention (MSLA), a novel attention paradigm that explicitly models both multi-scale spatial feature interactions and cross-layer dependencies to improve recognition of Oracle Bone Inscriptions (OBIs), which are challenging due to irregular, degraded shapes. It claims that prior layer attention methods yield only marginal gains on OBIs and that adding explicit multi-scale modeling overcomes this, with extensive experiments on large-scale OBIs datasets showing consistent outperformance over existing attention mechanisms while preserving computational efficiency.

Significance. If the reported gains hold under rigorous evaluation, MSLA could provide a practical improvement for fine-grained recognition tasks on degraded historical imagery, with potential applications in cultural heritage digitization. The work builds on existing layer attention ideas by adding multi-scale modeling, but its impact depends on whether the gains are reproducible and larger than those from standard multi-scale backbones or attention variants.

major comments (1)
  1. [Abstract] Abstract: the central claim that MSLA 'consistently outperforms existing attention mechanisms' on large-scale OBIs datasets is unsupported by any quantitative metrics, baselines, error bars, dataset sizes, or experimental protocol. This absence makes the primary performance assertion impossible to evaluate and is load-bearing for the paper's contribution.
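
The error bars this comment asks for are commonly reported as the mean and sample standard deviation of accuracy over repeated training runs. A minimal sketch follows; the helper name is hypothetical and the accuracies are placeholder numbers, not values from the paper.

```python
from statistics import mean, stdev

def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) over repeated training runs."""
    if len(accuracies) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return mean(accuracies), stdev(accuracies)

# Placeholder accuracies from three hypothetical seeds.
avg, spread = summarize_runs([0.90, 0.92, 0.91])
print(f"{avg:.3f} ± {spread:.3f}")  # 0.910 ± 0.010
```

Two methods whose mean ± spread intervals overlap heavily cannot be ranked on accuracy alone, which is the gap the referee is pointing at.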

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the concern regarding the abstract below and will make the requested changes to strengthen the presentation of our results.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that MSLA 'consistently outperforms existing attention mechanisms' on large-scale OBIs datasets is unsupported by any quantitative metrics, baselines, error bars, dataset sizes, or experimental protocol. This absence makes the primary performance assertion impossible to evaluate and is load-bearing for the paper's contribution.

    Authors: We agree that the abstract, as currently written, summarizes the experimental outcomes at a high level without embedding specific quantitative details. The full manuscript (Sections 4 and 5) contains the complete experimental protocol, dataset sizes, baselines (including prior layer attention methods), accuracy metrics, efficiency comparisons, and error bars. To directly address the concern and make the central claim evaluable from the abstract alone, we will revise the abstract to incorporate representative quantitative results while preserving conciseness.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper introduces MSLA as an architectural extension to layer attention for OBI recognition and supports its claims solely through empirical experiments on datasets. No equations, derivations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text. The central claim rests on comparative performance results rather than any reduction to prior inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities are described.


Reference graph

Works this paper leans on

  [1] Anzhu, L., et al., 2020. Oracle-bone inscriptions and cultural memory. Front. Art. Res 2, 63–73
  [2] Bastidas, A.A., Tang, H., 2019. Channel attention networks, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp. 0–0
  [3] Boltz, W.G., 1986. Early chinese writing. World Archaeology 17, 420–436
  [4] Cao, Y., He, Q., Wang, K., Xiong, J., Yi, Z., He, T., 2026. Enhancing feature fusion of u-like networks with dynamic skip connections. Medical Image Analysis, 104010
  [5] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., Zagoruyko, S., 2020. End-to-end object detection with transformers, in: European conference on computer vision, Springer. pp. 213–229
  [6] Chen, D., Li, H., Xiao, T., Yi, S., Wang, X., 2018. Video person re-identification with competitive snippet-similarity aggregation and co-attentive snippet embedding, in: CVPR, pp. 1169–1178
  [7] Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B., 2016. The cityscapes dataset for semantic urban scene understanding, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3213–3223
  [8] Cui, J., Liu, S., Tian, Z., Zhong, Z., Jia, J., 2022. Reslt: Residual learning for long-tailed recognition. IEEE transactions on pattern analysis and machine intelligence 45, 3695–3706
  [9] Cui, J., Zhong, Z., Liu, S., Yu, B., Jia, J., 2021. Parametric contrastive learning, in: Proceedings of the IEEE/CVF international conference on computer vision, pp. 715–724
  [10] Dalal, N., Triggs, B., 2005. Histograms of oriented gradients for human detection, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), IEEE. pp. 886–893
  [11] Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L., 2009. Imagenet: A large-scale hierarchical image database, in: 2009 IEEE conference on computer vision and pattern recognition, IEEE. pp. 248–255
  [12] Dosovitskiy, A., 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929
  [13] Fang, Y., Cai, Y., Chen, J., Zhao, J., Tian, G., Li, G., 2023. Cross-layer retrospective retrieving via layer attention. arXiv preprint arXiv:2302.03985
  [14] Flad, R.K., 2008. Divination and power: a multiregional view of the development of oracle bone divination in early china. Current Anthropology 49, 403–437
  [15] Fu, X., Yang, Z., Zeng, Z., Zhang, Y., Zhou, Q., 2022. Improvement of oracle bone inscription recognition accuracy: A deep learning perspective. ISPRS International Journal of Geo-Information 11, 45
  [16] Fujikawa, Y., Li, H., Yue, X., Aravinda, C., Prabhu, G.A., Meng, L., Recognition of oracle bone inscriptions by using two deep learning models. International Journal of Digital Humanities 5, 65–79
  [17] Gao, J., Liang, X., 2020. Distinguishing oracle variants based on the isomorphism and symmetry invariances of oracle-bone inscriptions. IEEE access 8, 152258–152275
  [18] Guan, H., Wan, J., Liu, Y., Wang, P., Zhang, K., Kuang, Z., Wang, X., Bai, X., Jin, L., 2024. An open dataset for the evolution of oracle bone characters: Evobc. arXiv preprint arXiv:2401.12467
  [19] He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778
  [20] He, Q., Yao, X., Wu, J., Yi, Z., He, T., 2024a. A lightweight u-like network utilizing neural memory ordinary differential equations for slimming the decoder, in: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, pp. 821–829
  [21] He, T., Guo, J., Tang, W., Zeng, W., He, P., Zeng, F., Yi, Z., 2023. Cascade-refine model for cephalometric landmark detection in high-resolution orthodontic images. Knowledge-Based Systems 265, 110332
  [22] He, T., Xu, G., Cui, L., Tang, W., Long, J., Guo, J., 2024b. Anchor ball regression model for large-scale 3d skull landmark detection. Neurocomputing 567, 127051
  [23] Hu, J., Shen, L., Sun, G., 2018. Squeeze-and-excitation networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7132–7141
  [24] Huang, H., Yang, D., Dai, G., Han, Z., Wang, Y., Lam, K.M., Yang, F., Huang, S., Liu, Y., He, M., 2022. Agtgan: Unpaired image translation for photographic ancient character generation, in: Proceedings of the 30th ACM international conference on multimedia, pp. 5456–5467
  [25] Huang, S., Wang, H., Liu, Y., Shi, X., Jin, L., 2019. Obc306: A large-scale oracle bone character recognition dataset, in: 2019 International Conference on Document Analysis and Recognition (ICDAR), IEEE. pp. 681–688
  [26] Huang, Z., Liang, S., Liang, M., Yang, H., 2020. Dianet: Dense-and-implicit attention network, in: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 4206–4214
  [27] Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K., 2016. Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5mb model size. arXiv preprint arXiv:1602.07360
  [28] Kalinowski, M., 2009. Diviners and astrologers under the eastern zhou: Transmitted texts and recent archaeological discoveries, in: Early Chinese Religion, Part One: Shang through Han (1250 BC-220 AD) (2 vols.). Brill, pp. 341–396
  [29] Kang, B., Xie, S., Rohrbach, M., Yan, Z., Gordo, A., Feng, J., Kalantidis, Y., 2019. Decoupling representation and classifier for long-tailed recognition. arXiv preprint arXiv:1910.09217
  [30] Keightley, D.N., 1979. The shang state as seen in the oracle-bone inscriptions. Early China 5, 25–34
  [31] Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25
  [32] Li, B., Zhang, J., Yu, N., Zhang, Z., Liu, Y., Han, Y., 2024. Oracle character prototype-guided cyclic disentanglement for oracle bone inscriptions detection, in: International Conference on Pattern Recognition and Artificial Intelligence, Springer. pp. 212–226
  [33] Li, H., Huang, X., 2025. Enhancing layer attention efficiency through pruning redundant retrievals. arXiv preprint arXiv:2503.06473
  [34] Li, J., Wang, Q.F., Huang, K., Yang, X., Zhang, R., Goulermas, J.Y., 2023a. Towards better long-tailed oracle character recognition with adversarial data augmentation. Pattern Recognition 140, 109534
  [35] Li, J., Wang, Q.F., Wang, S., Zhang, R., Huang, K., Cambria, E., 2023b. Diff-oracle: Deciphering oracle bone scripts with controllable diffusion model. arXiv preprint arXiv:2312.13631
  [36] Li, X., Wang, W., Hu, X., Yang, J., 2019. Selective kernel networks, in: CVPR, pp. 510–519
  [37] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L., 2014. Microsoft coco: Common objects in context, in: European conference on computer vision, Springer. pp. 740–755
  [38] Lin, X., Chen, S., Zhao, F., Qiu, X., 2022. Radical-based extract and recognition networks for oracle character recognition. International Journal on Document Analysis and Recognition (IJDAR) 25, 219–235
  [39] Liu, M., Liu, G., Liu, Y., Jiao, Q., 2020. Oracle bone inscriptions recognition based on deep convolutional neural network. Journal of image and graphics 8, 114–119
  [40] Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. International journal of computer vision 60, 91–110
  [41] Mai, C., Penava, P., Buettner, R., 2024. Oracle bone inscription character recognition based on a novel convolutional neural network architecture. IEEE Access
  [42] Meng, L., 2017. Recognition of oracle bone inscriptions by extracting line features on image processing, in: ICPRAM, pp. 606–611
  [43] Meng, L., Han, S., Song, X., Li, Y., 2016. Recognition of oracular bone inscriptions using template matching, in: 2016 International Journal of Computer Theory and Engineering (IJCTE), IJCTE. pp. 6–10
  [44] Meng, L., Kamitoku, N., Yamazaki, K., 2018. Recognition of oracle bone inscriptions using deep learning based on data augmentation, in: 2018 metrology for archaeology and cultural heritage (MetroArchaeo), IEEE. pp. 33–38
  [45] Meng, X., Pu, H., Meng, F., 2024. Automatic segmentation of oracle bone inscriptions using yolov8. Procedia Computer Science 242, 1074–1081
  [46] Menon, A.K., Jayasumana, S., Rawat, A.S., Jain, H., Veit, A., Kumar, S., 2020. Long-tail learning via logit adjustment. arXiv preprint arXiv:2007.07314
  [47] Ojala, T., Pietikäinen, M., Mäenpää, T., 2002. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 971–987
  [48] Qin, Z., Zhang, P., Wu, F., Li, X., 2021. Fcanet: Frequency channel attention networks, in: Proceedings of the IEEE/CVF international conference on computer vision, pp. 783–792
  [49] Raphals, L., 2013. Divination and prediction in early China and ancient Greece. Cambridge University Press
  [50] Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
  [51] Srivastava, R.K., Greff, K., Schmidhuber, J., 2015. Training very deep networks. NeurIPS
  [52] Stough II, M., 2011. Chinese oracle bone inscriptions, holy mountains, and the garden of god
  [53] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z., 2016. Rethinking the inception architecture for computer vision, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2818–2826
  [54] Takashima, K., 2000. Towards a more rigorous methodology of deciphering oracle-bone inscriptions. T’oung Pao 86, 363–399
  [55] Vaswani, A., 2017. Attention is all you need. Advances in Neural Information Processing Systems
  [56] Wang, K., Xia, X., Liu, J., Yi, Z., He, T., 2024a. Strengthening layer interaction via dynamic layer attention. arXiv preprint arXiv:2406.13392
  [57] Wang, M., Deng, W., 2022. Oracle-mnist: a realistic image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:2205.09442
  [58] Wang, M., Deng, W., 2024. A dataset of oracle characters for benchmarking machine learning algorithms. Scientific Data 11, 87
  [59] Wang, M., Deng, W., Liu, C.L., 2022. Unsupervised structure-texture separation network for oracle character recognition. IEEE Transactions on Image Processing 31, 3137–3150
  [60] Wang, M., Deng, W., Su, S., 2024b. Oracle character recognition using unsupervised discriminative consistency network. Pattern Recognition 148, 110180
  [61] Wang, P., Zhang, K., Wang, X., Han, S., Liu, Y., Wan, J., Guan, H., Kuang, Z., Jin, L., Bai, X., et al., 2024c. An open dataset for oracle bone character recognition and decipherment. Scientific Data 11, 976
  [62] Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., Hu, Q., 2020. Eca-net: Efficient channel attention for deep convolutional neural networks, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 11534–11542
  [63] Wang, X., Girshick, R., Gupta, A., He, K., 2018. Non-local neural networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7794–7803
  [64] Woo, S., Park, J., Lee, J.Y., Kweon, I.S., 2018. Cbam: Convolutional block attention module, in: Proceedings of the European conference on computer vision (ECCV), pp. 3–19
  [65] Xiao, L., Liu, J., Li, Y., Li, X., Yin, Z., 2025. Faa-yolo: frequency-augmented attention mechanism for oracle bone inscription detection, in: Fourth International Conference on Machine Vision, Automatic Identification, and Detection (MVAID 2025), SPIE. pp. 255–260
  [66] Xing, J., Liu, G., Xiong, J., 2019. Oracle bone inscription detection: a survey of oracle bone inscription detection based on deep learning algorithm, in: Proceedings of the international conference on artificial intelligence, information processing and cloud computing, pp. 1–8
  [67] Xu, S., Cheng, Y., Gu, K., Yang, Y., Chang, S., Zhou, P., 2017. Jointly attentive spatial-temporal pooling networks for video-based person re-identification, in: ICCV, pp. 4733–4742
  [68] Xu, T., Zhu, Y., He, Q., Cao, Y., Wang, K., Yi, Z., He, T., 2026. Cnm-unet: Continuous ordinary differential equations for medical image segmentation, in: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 11406–11414
  [69] Yan, C., Wang, Y., Chang, L., Zhang, Q., He, T., 2024. A novel masking model for buddhist literature understanding by using generative adversarial networks. Expert Systems with Applications 258, 125241
  [70] Yu, L., Wu, J., Gou, B., Min, X., Zhang, L., Yi, Z., He, T., 2026. Mobileode: An extra lightweight network. Advances in Neural Information Processing Systems 38, 120931–120956
  [71] Zhang, G., Liu, D., Smyth, B., Dong, R., 2021. Deciphering ancient chinese oracle bone inscriptions using case-based reasoning, in: International Conference on Case-Based Reasoning, Springer. pp. 309–324
  [72] Zhang, L., Rao, A., Agrawala, M., 2023. Adding conditional control to text-to-image diffusion models, in: Proceedings of the IEEE/CVF international conference on computer vision, pp. 3836–3847
  [73] Zhang, W., Han, F., Wang, B., Zhao, Y., 2024. Rubbing oracle bone character recognition based on improved yolov8 network, in: 2024 IEEE International Conference on Mechatronics and Automation (ICMA), IEEE. pp. 531–536
  [74] Zhang, Y., Hooi, B., Hong, L., Feng, J., 2022. Self-supervised aggregation of diverse experts for test-agnostic long-tailed recognition. Advances in neural information processing systems 35, 34077–34090
  [75] Zhao, H., Zhang, Y., Liu, S., Shi, J., Loy, C.C., Lin, D., Jia, J., 2018. Psanet: Point-wise spatial attention network for scene parsing, in: Proceedings of the European conference on computer vision (ECCV), pp. 267–283
  [76] Zhao, J., Fang, Y., Li, G., 2021. Recurrence along depth: Deep convolutional neural networks with recurrent layer aggregation. Advances in Neural Information Processing Systems 34, 10627–10640
  [77] Zhao, Y., Chen, J., Zhang, Z., Zhang, R., 2022. Ba-net: Bridge attention for deep convolutional neural networks, in: European Conference on Computer Vision, Springer. pp. 297–312
  [78] Zhen-Tao, X., Stephenson, F., Yao-Tiao, J., 1995. Astronomy on oracle bone inscriptions. Quarterly Journal of the Royal Astronomical Society 36, 397
  [79] Zhu, X., Cheng, D., Zhang, Z., Lin, S., Dai, J., 2019. An empirical study of spatial attention mechanisms in deep networks, in: Proceedings of the IEEE/CVF international conference on computer vision, pp. 6688–6697