Recoverable Identifier
advisory
doi_compliance
recoverable_identifier
The DOI in the printed bibliography is fragmented by whitespace or line breaks: it appears as "10.1162/tacl a 00708". A longer candidate (10.1162/tacla) was reconstructed from the surrounding text, but it could not be confirmed against doi.org in its printed form.
Evidence text
doi: 10.1162/tacl a 00708. URL https: //aclanthology.org/2024.tacl-1.74/. Chen, S., Sheen, H., Wang, T., and Yang, Z. Unveiling induction heads: Provable training dynamics and feature learning in transformers. In Globerson, A., Mackey, L., Belgrave, D., Fan, A., Paquet, U., Tomczak, J., and Zhang, C. (eds.),Advances in Neural Information Processing Sys- tems, volume 37, pp. 66479–66567. Curran Associates, Inc., 2024a. doi: 10.52202/079017-2127. URL https: //proceedings.neurips.cc/paper_files /paper/2024/file/7aae9e3ec211249e05b d07271a6b1441-Paper-Conference.pdf. Chen, Z.-A. and Luo, T. From condensation to rank col- lapse: A two-stage analysis of transformer training dy- namics. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https: //openreview.net/forum?id=gm5mkiTGOy. Chen, Z.-A., Li, Y ., Luo, T., Zhou, Z., and Xu, Z.-Q. J. Phase diagram of initial condensation for two-layer neural networks.CSIAM Transactions on Applied Mathematics, 5(3):448–514, 2024b. ISSN 2708-0579. doi: https: 9 Submission and Formatting Instructions for ICML 2026 //doi.org/10.4208/csiam-am.SO-2023-0016. URL https://global-sci.com/article/91025 /phase-diagram-of-initial-condensatio n-for-two-layer-neural-networks. Chen, Z.-A., Luo, T., and Wang, G. On multi-stage loss dynamics in neural networks: Mechanisms of plateau and descent stages.arXiv preprint arXiv:2410.20119, 2024c. Eldan, R. and Li, Y . Tinystories: How small can language models be and still speak co
Evidence payload
{
"printed_excerpt": "doi: 10.1162/tacl a 00708. URL https: //aclanthology.org/2024.tacl-1.74/. Chen, S., Sheen, H., Wang, T., and Yang, Z. Unveiling induction heads: Provable training dynamics and feature learning in transformers. In Globerson, A., Mackey, L., ",
"reconstructed_doi": "10.1162/tacla",
"ref_index": 1,
"resolved_title": null,
"verdict_class": "incontrovertible"
}
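As a rough illustration of how a whitespace-fragmented DOI like the one in printed_excerpt might be turned into candidates for checking, the sketch below joins the fragments that follow the DOI prefix, either directly or with an underscore. It is not the logic that produced reconstructed_doi above: the function name candidate_dois, the max_fragments limit, and the guess that PDF extraction may have replaced underscores with spaces are all assumptions made for this example. No candidate is confirmed here; each would still have to be resolved against doi.org.

```python
import re
from itertools import product

# Hypothetical helper for illustration only; not the report tool's own code.
# Matches the DOI prefix plus whatever non-space text immediately follows it.
DOI_HEAD = re.compile(r"10\.\d{4,9}/\S+")


def candidate_dois(excerpt: str, max_fragments: int = 3, joiners=("", "_")) -> list[str]:
    """Build candidate DOIs from a printed DOI that was split by whitespace."""
    match = DOI_HEAD.search(excerpt)
    if match is None:
        return []

    head = match.group(0)
    # A trailing period means the DOI ended inside the first fragment.
    if head.endswith((".", ",", ";")):
        return [head.rstrip(".,;")]

    # Collect the following tokens until one closes the sentence.
    fragments = []
    for token in excerpt[match.end():].split()[: max_fragments - 1]:
        fragments.append(token.rstrip(".,;"))
        if token.endswith("."):
            break

    # Try every prefix of the fragment list with every joiner combination.
    # Assumption: lost characters are either nothing or an underscore.
    candidates = []
    for n in range(len(fragments) + 1):
        for seps in product(joiners, repeat=n):
            doi = head
            for sep, frag in zip(seps, fragments[:n]):
                doi += sep + frag
            if doi not in candidates:
                candidates.append(doi)
    return candidates


if __name__ == "__main__":
    printed = "doi: 10.1162/tacl a 00708. URL https: //aclanthology.org/2024.tacl-1.74/."
    for doi in candidate_dois(printed):
        print(doi)
```

The candidate list includes the shorter form recorded in reconstructed_doi as well as longer joins that use all three printed fragments. Which of them, if any, resolves is left to a separate lookup step.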