A large-scale foundation model enables simulation-to-real adaptation for nuclear magnetic resonance-based molecular structure analysis
Pith reviewed 2026-06-26 15:29 UTC · model grok-4.3
The pith
UltraNMR pre-trained on 158 million simulated NMR spectra adapts to experimental data for state-of-the-art molecular structure analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
UltraNMR is trained on 158 million paired simulated 1H and 13C NMR spectra using domain-specific pre-training objectives that capture intra- and inter-spectral dependencies. Adaptation of this model to molecular structure analysis tasks on real experimental NMR spectra produces state-of-the-art results that surpass those from models trained directly on the downstream experimental data. The model further enables encoding of simulated spectra into a library covering 94 million molecules for structure-aware retrieval and has been applied to elucidate structures of previously unknown natural products.
What carries the argument
UltraNMR, a foundation model pre-trained on simulated NMR spectra with objectives designed to capture spectral dependencies for simulation-to-real transfer.
Load-bearing premise
Simulated NMR spectra capture enough of the statistical properties and noise characteristics of real experimental spectra for pre-trained representations to transfer without major corrections.
What would settle it
The central claim would be falsified if a model trained solely on real experimental NMR data achieved higher accuracy than the simulation-pre-trained UltraNMR on the same set of molecular structure analysis tasks.
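A minimal sketch of how such a head-to-head comparison could be scored, assuming both models are evaluated on the same experimental test examples and each prediction is marked correct (1) or incorrect (0). The function name, the paired bootstrap itself, and the toy outcome lists are illustrative assumptions; the paper's actual evaluation protocol is not given in this summary.

```python
import random


def paired_bootstrap_accuracy_gap(correct_a, correct_b, n_resamples=2000, seed=0):
    """Bootstrap the accuracy difference (model A minus model B).

    correct_a and correct_b hold per-example 0/1 outcomes for two models
    scored on the SAME test examples, in the same order.
    Returns (observed_gap, lower_95, upper_95).
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both models must be scored on the same examples")
    if not correct_a:
        raise ValueError("need at least one test example")
    n = len(correct_a)
    rng = random.Random(seed)
    observed = (sum(correct_a) - sum(correct_b)) / n

    gaps = []
    for _ in range(n_resamples):
        # Resample test examples with replacement, keeping the pairing intact.
        idx = [rng.randrange(n) for _ in range(n)]
        gaps.append(sum(correct_a[i] - correct_b[i] for i in idx) / n)

    gaps.sort()
    lower = gaps[int(0.025 * n_resamples)]
    upper = gaps[int(0.975 * n_resamples) - 1]
    return observed, lower, upper


# Hypothetical outcomes, invented for illustration only.
pretrained = [1] * 80 + [0] * 20   # stand-in for the simulation-pre-trained model
direct = [1] * 60 + [0] * 40       # stand-in for a model trained only on real data
print(paired_bootstrap_accuracy_gap(pretrained, direct))
```

If the resulting interval for the gap excluded zero in favor of the directly trained model, that would count as the falsifying observation described above; an interval that straddles zero would leave the question open.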
Original abstract
Nuclear Magnetic Resonance (NMR) spectroscopy is a powerful tool for molecular structure analysis, and spectral artificial intelligence offers great potential for its rapid and automated interpretation. However, the scarcity of experimental NMR datasets has constrained deep learning in this domain to narrow, task-specific applications that lack broad generalization. Here, we introduce UltraNMR, a large-scale foundation model for NMR that leverages the intrinsic properties of NMR spectra to learn generalizable spectral representations. We collected 158 million paired simulated $^{1}$H and $^{13}$C NMR spectra to train UltraNMR, employing multiple domain-specific pre-training objectives. UltraNMR captures both intra-spectral and inter-spectral dependencies, enabling seamless simulation-to-real adaptation. We demonstrate that adapting UltraNMR to a range of molecular structure analysis tasks on experimental NMR spectra consistently yields state-of-the-art performance and clearly outperforms UltraNMR variants trained directly on downstream data without simulation pre-training. We also construct a large-scale NMR spectral vector library by encoding simulated NMR spectra using UltraNMR, covering 94 million unique molecules and enabling effective structure-aware retrieval. In real-world applications, UltraNMR facilitates the structural elucidation of two previously unknown natural products from Chinese herbal medicines recorded in the Chinese Pharmacopoeia. These results suggest that large-scale simulation pre-training can effectively bridge the simulation-to-real gap, enabling robust and generalizable molecular structure analysis of real-world NMR spectra.
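The retrieval step over an embedded spectral library can be pictured with a small sketch. It assumes cosine similarity between a query embedding and pre-normalized library embeddings; the abstract does not state which similarity measure or index UltraNMR uses, and the molecule identifiers and three-dimensional vectors below are made up for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_library(embeddings):
    """Normalize every library vector once, ahead of query time."""
    return {mol_id: l2_normalize(vec) for mol_id, vec in embeddings.items()}


def retrieve(query, library, top_k=3):
    """Return the top_k (molecule_id, cosine_similarity) pairs for a query embedding."""
    q = l2_normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), mol_id)
        for mol_id, vec in library.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(mol_id, score) for score, mol_id in scored[:top_k]]


# Hypothetical embeddings; a real library would hold model-produced vectors.
library = build_library({
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.0, 1.0, 0.0],
    "mol_c": [1.0, 1.0, 0.0],
})
print(retrieve([2.0, 0.1, 0.0], library, top_k=2))
```

A brute-force scan like this is only practical for small libraries. At the scale of 94 million molecules an approximate nearest-neighbor index would be the more likely choice, though that is an assumption rather than something the abstract confirms.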
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces UltraNMR, a large-scale foundation model pre-trained on 158 million simulated paired 1H and 13C NMR spectra using multiple domain-specific pre-training objectives to capture intra- and inter-spectral dependencies. It claims that fine-tuning/adapting this model to experimental NMR spectra yields state-of-the-art performance across molecular structure analysis tasks, clearly outperforming UltraNMR variants trained directly on the downstream experimental data without simulation pre-training. The work also constructs a vector library encoding 94 million unique molecules for structure-aware retrieval and applies the model to elucidate structures of two previously unknown natural products.
Significance. If the simulation-to-real transfer results hold under rigorous validation, the work would be significant for NMR-based molecular analysis: it shows that large-scale simulation pre-training can address experimental data scarcity and enable generalizable representations, with potential for broad impact in automated spectral interpretation and retrieval in chemistry.
major comments (2)
- [Abstract] Abstract (and central claim): the assertion that simulation pre-training 'clearly outperforms' direct-training variants and achieves consistent SOTA requires quantitative support (e.g., specific accuracy/F1 metrics, dataset sizes for downstream tasks, error bars, and statistical tests) to establish that gains arise from the pre-training rather than model scale or optimization choices; without these, the simulation-to-real adaptation cannot be verified as the load-bearing factor.
- [Results / Methods] The weakest assumption (simulated spectra reproducing real experimental joint statistics in shifts, couplings, noise, solvent effects, impurities, and artifacts) is load-bearing for all transfer claims; the manuscript must include explicit validation (e.g., distributional comparisons or ablation on simulator fidelity) in the results or methods sections, as omission of these effects could explain observed gains independently of the pre-training strategy.
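One way to make the requested distributional comparison concrete is a two-sample Kolmogorov-Smirnov statistic on chemical shifts: the largest gap between the empirical cumulative distributions of simulated and experimental values. The sketch below is an illustration under stated assumptions, not the authors' validation procedure, and the shift values in ppm are invented.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two empirical CDFs:
    0.0 for identical samples, 1.0 for samples that do not overlap at all.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap


# Hypothetical 13C shifts in ppm, invented for illustration only.
simulated_shifts = [14.1, 22.7, 29.4, 31.9, 128.3, 170.2]
experimental_shifts = [14.3, 22.9, 29.8, 32.1, 128.9, 171.0]
print(ks_statistic(simulated_shifts, experimental_shifts))
```

A statistic near zero would suggest the simulator reproduces the marginal shift distribution. It would say nothing about couplings, noise, solvent effects, or impurities, which would each need their own comparison.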
minor comments (2)
- [Abstract] The phrase 'seamless simulation-to-real adaptation' is imprecise; the adaptation procedure (e.g., fine-tuning protocol, any domain-adversarial components, or retrieval method) should be defined with concrete steps and hyperparameters.
- [Abstract] Clarify the exact number and nature of the 'range of molecular structure analysis tasks' and the composition of the experimental test sets to allow reproducibility assessment.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments highlight important points for strengthening the presentation of our results on simulation-to-real transfer. We address each major comment below and will incorporate revisions as noted.
Point-by-point responses
- Referee: [Abstract] Abstract (and central claim): the assertion that simulation pre-training 'clearly outperforms' direct-training variants and achieves consistent SOTA requires quantitative support (e.g., specific accuracy/F1 metrics, dataset sizes for downstream tasks, error bars, and statistical tests) to establish that gains arise from the pre-training rather than model scale or optimization choices; without these, the simulation-to-real adaptation cannot be verified as the load-bearing factor.
  Authors: We agree that the abstract would be strengthened by including explicit quantitative metrics. The full manuscript already reports detailed performance numbers (accuracy, F1, dataset sizes) with comparisons to direct-training baselines across multiple tasks, including error bars from multiple runs. In the revision we will add a concise summary of these key metrics, dataset sizes, and statistical significance indicators directly into the abstract to make the central claims self-contained and verifiable.
  Revision: yes
- Referee: [Results / Methods] The weakest assumption (simulated spectra reproducing real experimental joint statistics in shifts, couplings, noise, solvent effects, impurities, and artifacts) is load-bearing for all transfer claims; the manuscript must include explicit validation (e.g., distributional comparisons or ablation on simulator fidelity) in the results or methods sections, as omission of these effects could explain observed gains independently of the pre-training strategy.
  Authors: We acknowledge that explicit validation of simulator fidelity is important for supporting the transfer claims. The current manuscript demonstrates successful downstream transfer on experimental data but does not contain the requested distributional comparisons or simulator ablations. We will add these analyses (e.g., shift/coupling distribution overlays and fidelity ablations) to the Methods and Results sections in the revision to directly address this point.
  Revision: yes
Circularity Check
No circularity: empirical ML pipeline with no derivations or self-referential reductions.
Full rationale
The paper describes a standard simulation-pretrain-then-adapt pipeline on 158M simulated NMR pairs followed by fine-tuning and evaluation on experimental spectra. No equations, derivations, or parameter-fitting steps are presented that could reduce a claimed prediction to its own inputs by construction. All performance claims are external empirical comparisons (SOTA on real data, outperforming direct-training baselines), with no load-bearing self-citations or ansatzes invoked. This is a self-contained empirical result whose validity rests on data and benchmarks outside the model itself.