pith. machine review for the scientific record.

arxiv: 2605.02003 · v2 · submitted 2026-05-03 · 💻 cs.LG · cs.AI

Recognition: unknown

RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 17:19 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords Raman spectroscopy · machine learning benchmark · tabular foundation models · spectral classification · generalization gap · time series models · reproducible evaluation

The pith

RamanBench unifies 74 datasets to show that tabular foundation models beat domain-specific methods on Raman spectra, yet no approach generalizes across datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces RamanBench as a reproducible collection of 74 Raman spectroscopy datasets totaling 325,668 spectra to fix the problem of scattered data and inconsistent testing in this field. It runs a fixed protocol on 28 models spanning classical regression, Raman-specific networks, tabular foundation models, and time-series classifiers for both classification and regression under varied conditions. The results establish that tabular foundation models deliver the strongest average performance while time-series methods stay close behind, yet every method drops sharply on some datasets. This pattern points to an open gap in building models that handle the full variability of spectral measurements. The setup includes open data access and a live leaderboard meant to let others add new datasets or models.

Core claim

RamanBench assembles 74 datasets (16 newly released) across four domains and applies one evaluation protocol to 28 models; under that protocol tabular foundation models such as TabPFN outperform both Raman-specific networks and gradient-boosting baselines while time-series models like ROCKET remain competitive, but every tested method fails to maintain performance when moved to a different dataset.

What carries the argument

RamanBench, the unified dataset collection plus fixed preprocessing, train-test splits, and scoring code that turns fragmented Raman spectra into directly comparable supervised learning tasks.
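The protocol that carries the argument — one fixed split and one scoring function applied identically to every model on every dataset — can be sketched as a simple loop. This is an illustrative sketch only: the dataset names, model pool, and split parameters below are placeholders, not RamanBench's actual loaders or API.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-ins for RamanBench datasets: (spectra, labels) pairs.
datasets = {
    "demo_a": make_classification(n_samples=120, n_features=500, random_state=0),
    "demo_b": make_classification(n_samples=120, n_features=500, random_state=1),
}
# A placeholder model pool; the real benchmark spans 28 models.
models = {
    "random_forest": RandomForestClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
}

scores = {}
for ds_name, (X, y) in datasets.items():
    # One fixed split per dataset, so every model sees identical train/test data.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y
    )
    for m_name, model in models.items():
        model.fit(X_tr, y_tr)
        # One fixed metric (macro F1) scores every model the same way.
        scores[(ds_name, m_name)] = f1_score(y_te, model.predict(X_te), average="macro")
```

The point of the fixed split and fixed metric is that any two entries of `scores` are directly comparable, which is what turns fragmented spectra collections into a benchmark.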

If this is right

  • Tabular foundation models become the default reference point for new Raman spectroscopy tasks.
  • Effort shifts toward architectures that explicitly address cross-dataset shifts in spectral features.
  • Time-series methods receive renewed attention as low-cost alternatives for certain spectral patterns.
  • The living leaderboard structure lets the community add datasets and close the observed gap over time.
  • Reproducible baselines now exist for applications that rely on Raman data such as material identification.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the generalization gap persists, future work may test hybrid models that combine foundation-model pretraining with Raman-specific feature extractors.
  • The same benchmark protocol could be reused for related spectroscopic techniques such as infrared or fluorescence data.
  • Practical systems may need dataset-specific adaptation layers rather than single universal models.
  • Adding more datasets with extreme experimental conditions would test whether the current gap is an artifact of the initial collection.

Load-bearing premise

The 74 chosen datasets and the single preprocessing-plus-split protocol capture enough of the real experimental variability in Raman measurements to support claims of consistent model rankings and a universal generalization gap.

What would settle it

A model that reaches high accuracy on every one of the 74 datasets under the published splits, or a newly collected Raman dataset on which all current top models collapse while a simple baseline succeeds.

Figures

Figures reproduced from arXiv: 2605.02003 by Christoph Lange, Erik Rodner, Felix Biessmann, Mariano N. Cruz Bournazou, Mario Koddenbrock, Martin Jäger, Martin Kögler, Peter Neubauer, Robin Legner.

Figure 1
Figure 1. Figure 1: RamanBench: High dimensional, low sample ML. Left: Sample count vs. feature count for RamanBench (pink) and four reference benchmark collections; RamanBench occupies a distinct high-dimensional, low-sample regime. TabArena [9] and TALENT [10]: tabular ML; UCR [11] and UEA [12]: Time Series Classification (TSC). Right: Model performance (Elo) vs. release year. RamanBench enables a retrospective view: Partia… view at source ↗
Figure 2
Figure 2. Figure 2: Representative Raman spectra from the four application domains in RamanBench. Each panel shows spectra from one domain, colored by class (classification) or by the target analyte value (regression, gradient from low to high). The thick line is the mean spectrum; shaded bands show ±1 standard deviation. Spectral ranges, sample sizes, noise levels, and analytical tasks differ substantially across domains, il… view at source ↗
Figure 3
Figure 3. Figure 3: Benchmark composition overview: domain distribution (left two donuts), task distribution view at source ↗
Figure 4
Figure 4. Figure 4: RamanBench: 74 datasets, 163 targets and 325,668 spectra across 4 application domains. The overview shows per-dataset characteristics sorted by size (largest top) and split into two halves. Each half shows four panels: Instances (spectrum count, log scale), Features (number of wavenumber points), Spectral Range (cm−1 ), and Targets (regression targets, or 1 for classification). The colors of the dataset na… view at source ↗
Figure 5
Figure 5. Figure 5: RamanBench-v0.1 Leaderboard. The top tier is dominated by Tabular Foundation Models (TFMs). The only models that can keep up are the time-series classifiers Arsenal and ROCKET (*=only evaluated on classification tasks). Over the full benchmark, ReZeroNet is the highest-ranking Raman-specific architecture and the first to challenge the TFM block. Elo ratings are anchored at Random Forest = 1,000, with 95% … view at source ↗
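The leaderboard's Elo scale (Figure 5, anchored at Random Forest = 1,000) can be illustrated with the standard sequential Elo update; the leaderboard's exact fitting procedure is not specified here, and the model names and comparison stream below are toy placeholders.

```python
def elo_update(ratings, winner, loser, k=32.0):
    """One standard Elo update from a single pairwise win (a generic
    scheme; the leaderboard's exact fitting procedure is not shown here)."""
    ra, rb = ratings[winner], ratings[loser]
    expected = 1.0 / (1.0 + 10.0 ** ((rb - ra) / 400.0))  # P(winner beats loser)
    ratings[winner] = ra + k * (1.0 - expected)
    ratings[loser] = rb - k * (1.0 - expected)

# Toy comparison stream: a hypothetical model repeatedly beats the baseline.
ratings = {"model_a": 1000.0, "random_forest": 1000.0}
for _ in range(10):
    elo_update(ratings, winner="model_a", loser="random_forest")

# Anchor the scale so Random Forest sits at exactly 1,000, as in Figure 5.
offset = 1000.0 - ratings["random_forest"]
anchored = {name: r + offset for name, r in ratings.items()}
```

Anchoring matters because Elo is only defined up to an additive constant; pinning the baseline makes ratings comparable across leaderboard updates.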
Figure 6
Figure 6. Figure 6: TFM define the high-performance end of the Pareto frontier; ReZeroNet is the only non-TFM contender, while KNN qualifies through speed alone. Normalized F1 (classification, left) and normalized RMSE (regression, right) vs. mean total runtime (train + predict, log scale). Metrics normalized per dataset following Salinas and Erickson [68]: best = 1, median = 0, clipped at 0. Runtime excludes TFM model pretra… view at source ↗
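The per-dataset normalization the Figure 6 caption describes (best = 1, median = 0, clipped at 0, following Salinas and Erickson) can be written as a small helper. This is a reading of the caption's description, not the benchmark's published code; the model names and metric values are illustrative.

```python
import statistics

def normalize_scores(raw, higher_is_better=True):
    """Per-dataset normalization as described: best model -> 1, median
    model -> 0, scores below the median clipped to 0. The same formula
    covers lower-is-better metrics such as RMSE via the flag."""
    vals = list(raw.values())
    best = max(vals) if higher_is_better else min(vals)
    med = statistics.median(vals)
    if best == med:  # degenerate case: no model beats the median
        return {m: 0.0 for m in raw}
    return {m: max(0.0, (v - med) / (best - med)) for m, v in raw.items()}

# Toy per-dataset metrics (illustrative values, not the paper's numbers).
norm_f1 = normalize_scores({"model_a": 0.9, "model_b": 0.7, "model_c": 0.5})
norm_rmse = normalize_scores(
    {"model_a": 0.2, "model_b": 0.4, "model_c": 0.8}, higher_is_better=False
)
```

Because each dataset is rescaled to its own best and median, averaging `normalize_scores` outputs across datasets weights every dataset equally regardless of its raw metric range.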
Figure 7
Figure 7. Figure 7: Top-ranked models win broadly across the benchmark; lower-ranked models show consistent losses against most competitors. Pairwise win counts across all 163 prediction targets. Each cell shows the number of targets on which the y-axis model outperforms the x-axis model (ties count as 0.5). Cell color encodes the win rate: green cells indicate a high win rate for the row model; red cells indicate a low win r… view at source ↗
Figure 8
Figure 8. Figure 8: Foundation models achieve the best average rank and dominate first-place finishes; tree-based models achieve at most one first-place finish each. Combined model ranking across all regression and classification targets. Metrics are averaged over seeds per (target, model) before ranking, so each prediction target counts as exactly one win. Left: average rank pooled over all targets (rank 1 = best); regressio… view at source ↗
Figure 9
Figure 9. Figure 9: Arsenal and RealMLP are the slowest models by training time; XGBoost and tree-based methods offer the lowest inference latency. Computational efficiency of all evaluated models across three dimensions. Training time (left, log scale): total wall-clock time for fitting on the training split. Peak memory (center): maximum RAM/VRAM footprint during training in GB. Inference latency (right, log scale): mean pr… view at source ↗
Figure 10
Figure 10. Figure 10: TFM anchor the low-improvability end of the Pareto frontier; ReZeroNet is the only Raman-specific model near it, while KNN qualifies through speed alone. Mean improvability (%) vs. mean total time (train + predict, s) on a log scale, shown separately for classification (left) and regression (right). Improvability of 0% indicates optimal performance within the evaluated model pool; higher values indicate l… view at source ↗
Figure 11
Figure 11. Figure 11: Regression: both TabPFN variants, AutoGluon, and TabICL v2 form a statistically indistinguishable leading group of four; no model is significantly superior to this group. CD diagram for RMSE across all regression targets (lower rank is better). Generated via Friedman test and Nemenyi post-hoc test (α = 0.05) using AutoRank [73]. Models connected by a horizontal bar are not significantly different. 29 28 2… view at source ↗
Figure 12
Figure 12. Figure 12: Classification: seven models form a statistically indistinguishable leading group, remarkably, both time-series classifiers (Arsenal, ROCKET) as well as ReZeroNet rank within it alongside the three top performing TFM. CD diagram for macro-averaged F1 across all classification targets (lower rank is better). Generated via Friedman test and Nemenyi post-hoc test (α = 0.05) using AutoRank [73]. Models conne… view at source ↗
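The CD-diagram construction the Figure 11 and 12 captions describe — a Friedman test over per-target model scores, then a Nemenyi post-hoc test at α = 0.05 — can be sketched with SciPy on a toy score matrix. The numbers and model names here are illustrative, not the paper's results; the q constant is the standard tabulated value for three models.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Toy score matrix: rows = prediction targets, columns = models
# (illustrative numbers only, not the paper's results).
rng = np.random.default_rng(0)
scores = np.column_stack([
    rng.normal(0.85, 0.02, 10),  # "model_a"
    rng.normal(0.80, 0.02, 10),  # "model_b"
    rng.normal(0.60, 0.02, 10),  # "model_c"
])

# Friedman test over the three models' per-target scores.
stat, p = friedmanchisquare(scores[:, 0], scores[:, 1], scores[:, 2])

# Average ranks per model (rank 1 = best), the axis of a CD diagram.
ranks = rankdata(-scores, axis=1)  # negate so higher score -> better rank
avg_ranks = ranks.mean(axis=0)

# Nemenyi critical difference at alpha = 0.05; q = 2.343 is the standard
# tabulated constant for k = 3 models (Demsar, 2006). Models whose average
# ranks differ by less than cd are joined by a horizontal bar.
k, n = scores.shape[1], scores.shape[0]
cd = 2.343 * np.sqrt(k * (k + 1) / (6.0 * n))
```

Pairs of models within `cd` of each other in average rank form the "statistically indistinguishable leading group" the captions refer to.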
Figure 13
Figure 13. Figure 13: Representative Raman spectra from the Kaiser E. coli datasets, 5 random samples each. view at source ↗
Figure 14
Figure 14. Figure 14: Representative Raman spectra from the Time-Gated E. coli datasets, 5 random samples view at source ↗
Figure 15
Figure 15. Figure 15: Representative Raman spectra from the Streptococcus Thermophilus datasets, 5 random view at source ↗
Figure 16
Figure 16. Figure 16: Representative Raman spectra from the E. coli Metabolites datasets, 5 random samples view at source ↗
Figure 17
Figure 17. Figure 17: Representative Raman spectra from the Bio-Catalysis AXP dataset showing 5 random view at source ↗
Figure 18
Figure 18. Figure 18: Representative Raman spectra from the Yeast Fermentation dataset showing 5 random view at source ↗
Figure 19
Figure 19. Figure 19: Representative Raman spectra from the R. eutropha Copolymer Fermentations dataset showing 5 random samples. view at source ↗
Figure 20
Figure 20. Figure 20: Representative Raman spectra from the Gasoline Properties (Benchtop) dataset showing 5 view at source ↗
Figure 21
Figure 21. Figure 21: Representative Raman spectra from the Gasoline Properties (Handheld) dataset showing 5 view at source ↗
Figure 22
Figure 22. Figure 22: Representative SERS spectra from the Adenine (Colloidal) dataset, 5 random samples per view at source ↗
Figure 23
Figure 23. Figure 23: Representative SERS spectra from the Adenine (Solid) dataset, 5 random samples per view at source ↗
Figure 24
Figure 24. Figure 24: Representative Raman spectra from MLROD showing 5 random samples. view at source ↗
Figure 25
Figure 25. Figure 25: Representative Raman spectra from the RRUFF Database (raw subset) showing 5 random view at source ↗
Figure 26
Figure 26. Figure 26: Representative Raman spectra from the SOP Spectral Library (raw subset) showing 5 view at source ↗
Figure 27
Figure 27. Figure 27: Representative Raman spectra from the Weathered Microplastics dataset showing 5 view at source ↗
Figure 28
Figure 28. Figure 28: Representative Raman spectra from the Bioprocess Analytes dataset across all 8 spectrom view at source ↗
Figure 29
Figure 29. Figure 29: Representative Raman spectra from the Bioprocess Monitoring dataset showing 5 random view at source ↗
Figure 30
Figure 30. Figure 30: Representative SERS spectra from the Cancer Cell dataset, 5 random samples per view at source ↗
Figure 31
Figure 31. Figure 31: Representative Raman spectra from the E. coli Fermentation dataset showing 5 random view at source ↗
Figure 32
Figure 32. Figure 32: Representative Raman spectra from the Mutant Wheat dataset showing 5 random leaf view at source ↗
Figure 33
Figure 33. Figure 33: Representative SERS spectra from the Alzheimer’s Serum dataset showing 5 random view at source ↗
Figure 34
Figure 34. Figure 34: Representative Raman spectra from the Diabetes Skin dataset, 5 random samples per view at source ↗
Figure 35
Figure 35. Figure 35: Representative Raman spectra from the Head & Neck Cancer dataset showing 5 random view at source ↗
Figure 36
Figure 36. Figure 36: Representative SERS spectra from the Pathogenic Bacteria dataset showing 5 random view at source ↗
Figure 37
Figure 37. Figure 37: Representative Raman spectra from the Pharmaceutical Ingredients dataset showing 5 view at source ↗
Figure 38
Figure 38. Figure 38: Representative SERS spectra from the Prostate Cancer Serum dataset showing 5 random view at source ↗
Figure 39
Figure 39. Figure 39: Representative Raman spectra from the Saliva COVID-19 dataset showing 5 random view at source ↗
Figure 40
Figure 40. Figure 40: Representative Raman spectra from the Saliva Alzheimer dataset showing 5 random samples. view at source ↗
Figure 41
Figure 41. Figure 41: Representative Raman spectra from the Saliva Parkinson dataset showing 5 random view at source ↗
Figure 42
Figure 42. Figure 42: Representative SERS spectra from the Stroke Serum dataset showing 5 random samples. view at source ↗
Figure 43
Figure 43. Figure 43: Representative Raman spectra from the Acetic Concentration dataset showing 5 random view at source ↗
Figure 44
Figure 44. Figure 44: Representative Raman spectra from the Amino Acid LC dataset, 5 random samples per view at source ↗
Figure 45
Figure 45. Figure 45: Representative Raman spectra from the Citric Concentration dataset showing 5 random view at source ↗
Figure 46
Figure 46. Figure 46: Representative Raman spectra from the Formic Concentration dataset showing 5 random view at source ↗
Figure 47
Figure 47. Figure 47: Representative SERS spectra from the Hair Dyes dataset showing 5 random samples. view at source ↗
Figure 48
Figure 48. Figure 48: Representative Raman spectra from the Itaconic Concentration dataset showing 5 random view at source ↗
Figure 49
Figure 49. Figure 49: Representative Raman spectra from the Levulinic Concentration dataset showing 5 random view at source ↗
Figure 50
Figure 50. Figure 50: Representative Raman spectra from the Microgel Size dataset for four representative view at source ↗
Figure 51
Figure 51. Figure 51: Representative Raman spectra from the Microgel Synthesis Flow vs. Batch dataset showing view at source ↗
Figure 52
Figure 52. Figure 52: Representative Raman spectra from the Microgel Synthesis in Flow dataset showing 5 view at source ↗
Figure 53
Figure 53. Figure 53: Representative Raman spectra from the Succinic Concentration dataset showing 5 random view at source ↗
Figure 54
Figure 54. Figure 54: Representative Raman spectra from the Sugar Mixtures dataset, 5 random samples per view at source ↗
read the original abstract

Machine Learning (ML) has transformed many scientific fields, yet key applications still lack standardized benchmarks. Raman spectroscopy, a widely used technique for non-invasive molecular analysis, is one such field where progress is limited by fragmented datasets, inconsistent evaluation, and models that fail to capture the structure of spectral data. We introduce RamanBench, the first large-scale, fully reproducible benchmark for ML on Raman spectroscopy, consisting of streamlined data access, evaluation protocols and code, as well as a live leaderboard. It unifies 74 datasets (including 16 first released with this benchmark) across four domains, comprising 325,668 spectra and spanning classification and regression tasks under diverse experimental conditions. We benchmark 28 models under a standardized protocol, including classical methods (e.g., PLS), Raman-specific (e.g., RamanNet), Tabular Foundation Model (TFM) (e.g., TabPFN), and time-series approaches (e.g., ROCKET). TFM consistently outperform domain-specific and gradient boosting baselines, while time-series models remain competitive. However, no method generalizes across datasets, revealing a fundamental gap. Therefore, we invite the community to contribute new approaches to our living benchmark, with the potential to accelerate advances in critical applications such as medical diagnostics, biological research, and materials science.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces RamanBench, the first large-scale reproducible benchmark for machine learning on Raman spectroscopy. It unifies 74 datasets (16 newly released) totaling 325,668 spectra across classification and regression tasks in four domains. Under a single standardized protocol, 28 models are evaluated, including classical (PLS), Raman-specific (RamanNet), Tabular Foundation Model (TFM, e.g. TabPFN), and time-series (ROCKET) approaches. The central claims are that TFM consistently outperform domain-specific and gradient-boosting baselines, time-series models remain competitive, and no method generalizes across datasets, revealing a fundamental gap.

Significance. If the protocol and rankings prove robust, RamanBench supplies a much-needed standardized, living evaluation platform with code and leaderboard for a fragmented field. The reported generalization failure, if not an artifact of preprocessing or splitting choices, would be a substantive finding that motivates new inductive biases for spectral data in medical, biological, and materials applications. The release of 16 new datasets and full reproducibility artifacts are clear strengths.

major comments (3)
  1. [Methods / Evaluation Protocol] The single fixed preprocessing pipeline (baseline correction, normalization, peak alignment, train/test splits) is applied uniformly to all 74 datasets with no reported ablation or sensitivity analysis against alternative domain-standard pipelines. Because the headline claims of TFM outperformance and a fundamental cross-dataset generalization gap rest on these rankings, the absence of such checks leaves open the possibility that the observed ordering is protocol-dependent rather than intrinsic.
  2. [Results] Performance tables and the abstract state clear rankings and a generalization gap, yet no statistical significance tests, error bars, or variance across random seeds / multiple runs are described. Without these, it is impossible to determine whether the reported TFM superiority over baselines is robust or could be explained by implementation or split variability.
  3. [Evaluation Protocol / Results] Cross-dataset evaluation: The claim that 'no method generalizes across datasets' is central, but the manuscript does not specify the exact cross-dataset protocol (e.g., leave-one-dataset-out, meta-learning splits, or per-task train/test ratios) or the quantitative threshold used to declare failure of generalization. This detail is load-bearing for the 'fundamental gap' conclusion.
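The sensitivity check the first comment asks for amounts to re-running the benchmark under alternative preprocessing pipelines. A minimal sketch of two such variants follows; the polynomial baseline subtraction and L2 normalization here are common generic choices, not the paper's actual pipeline, and the pipeline names are hypothetical.

```python
import numpy as np

def polynomial_baseline(spectrum, degree=3):
    """Subtract a fitted polynomial baseline (one common, simple choice;
    the paper's actual preprocessing steps are not reproduced here)."""
    x = np.linspace(-1.0, 1.0, spectrum.size)  # scaled axis for conditioning
    coeffs = np.polyfit(x, spectrum, degree)
    return spectrum - np.polyval(coeffs, x)

def l2_normalize(spectrum):
    return spectrum / np.linalg.norm(spectrum)

# Two hypothetical pipeline variants a sensitivity analysis would compare:
# same normalization, different baseline flexibility.
pipelines = {
    "poly3_l2": lambda s: l2_normalize(polynomial_baseline(s, degree=3)),
    "poly5_l2": lambda s: l2_normalize(polynomial_baseline(s, degree=5)),
}

# Toy spectrum: oscillatory signal plus a slow baseline drift.
raw = np.sin(np.linspace(0.0, 6.0, 800)) + 0.001 * np.arange(800)
processed = {name: fn(raw) for name, fn in pipelines.items()}
```

If model rankings stay stable when `processed` is fed through the evaluation loop for each variant, the ordering is protocol-independent in the sense the referee wants established.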
minor comments (2)
  1. [Abstract / §1] The total number of models (28) and the breakdown into categories (classical, Raman-specific, TFM, time-series) should be stated consistently; minor mismatches between the abstract and main text reduce clarity.
  2. [Dataset description] Ensure every one of the 74 datasets has a clear citation or permanent link in the benchmark release; a few entries appear to rely only on internal identifiers.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed review, which highlights both the strengths of RamanBench and areas where additional clarity and analysis would strengthen the work. We address each major comment below and outline the revisions we will make.

read point-by-point responses
  1. Referee: The single fixed preprocessing pipeline (baseline correction, normalization, peak alignment, train/test splits) is applied uniformly to all 74 datasets with no reported ablation or sensitivity analysis to alternative domain-standard pipelines. Because the headline claims of TFM outperformance and the fundamental cross-dataset generalization gap rest on these rankings, the absence of such checks leaves open the possibility that the observed ordering is protocol-dependent rather than intrinsic.

    Authors: We selected the preprocessing pipeline to reflect a consensus approach commonly used in Raman spectroscopy studies, thereby enabling reproducible and comparable results across the 74 datasets. We agree that sensitivity to alternative pipelines is a relevant consideration. In the revised manuscript we will expand the methods section with an explicit rationale for the chosen pipeline and add a limited sensitivity analysis on a representative subset of datasets using two alternative domain-standard pipelines. revision: partial

  2. Referee: Performance tables and the abstract state clear rankings and a generalization gap, yet no statistical significance tests, error bars, or variance across random seeds / multiple runs are described. Without these, it is impossible to determine whether the reported TFM superiority over baselines is robust or could be explained by implementation or split variability.

    Authors: We acknowledge that reporting variability and statistical tests would improve the robustness assessment of the reported rankings. In the revised version we will recompute all results with multiple random seeds, include error bars (standard deviation) in the performance tables, and add paired statistical significance tests (e.g., Wilcoxon signed-rank) between the leading models and baselines. revision: yes

  3. Referee: The claim that 'no method generalizes across datasets' is central, but the manuscript does not specify the exact cross-dataset protocol (e.g., leave-one-dataset-out, meta-learning splits, or per-task train/test ratios) or the quantitative threshold used to declare failure of generalization. This detail is load-bearing for the 'fundamental gap' conclusion.

    Authors: We will add a precise description of the cross-dataset evaluation protocol, including the training and testing splits across datasets and the quantitative criteria used to identify generalization failure, to the evaluation protocol subsection of the revised manuscript. revision: yes
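The paired significance testing promised in the second response (e.g. a Wilcoxon signed-rank test between a leading model and a baseline across prediction targets) is a few lines with SciPy. The scores below are synthetic stand-ins, not benchmark results; the real comparison would pair scores across the benchmark's 163 targets.

```python
import numpy as np
from scipy.stats import wilcoxon

# Synthetic paired per-target scores for two models (illustrative only).
rng = np.random.default_rng(1)
baseline_scores = rng.uniform(0.5, 0.9, 30)
tfm_scores = baseline_scores + rng.normal(0.05, 0.01, 30)  # consistent edge

# Paired Wilcoxon signed-rank test over per-target score differences.
stat, p = wilcoxon(tfm_scores, baseline_scores)
significant = p < 0.05
```

The test is paired by target, so it asks whether one model beats the other on the same tasks, which is the robustness question the referee raised.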

Circularity Check

0 steps flagged

No circularity: empirical benchmark with direct held-out measurements

full rationale

This is a pure empirical benchmarking paper that unifies 74 datasets, applies one fixed preprocessing and evaluation protocol, and reports model rankings on held-out splits. The claims (TFM outperformance, competitive time-series models, and absence of cross-dataset generalization) are direct experimental outcomes measured against independent test data rather than any derivation, equation, or self-referential fit. No self-citation is used to justify a uniqueness theorem or ansatz, and no prediction is constructed from parameters fitted to the same quantities being reported. The protocol choices are explicit and falsifiable by re-running on alternative pipelines, satisfying the criteria for a self-contained empirical result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical benchmark paper. No new theoretical derivations, fitted constants, or postulated entities are introduced; the claims rest on data collection and standardized evaluation rather than new axioms or parameters.

pith-pipeline@v0.9.0 · 5563 in / 1105 out tokens · 41012 ms · 2026-05-09T17:19:04.393379+00:00 · methodology


Reference graph

Works this paper leans on

96 extracted references · 23 canonical work pages · 1 internal anchor

  1. [1] Miia Marika Jansson, Martin Kögler, Sohvi Hörkkö, Tero Ala-Kokko, and Lassi Rieppo. Vibrational spectroscopy and its future applications in microbiology. Applied Spectroscopy Reviews, 58(2):132–158, 2023.

  2. [2] Barbara Lafuente, Robert T Downs, Hexiong Yang, and Nate Stone. The power of databases: The RRUFF project. In Highlights in Mineralogical Crystallography, pages 1–30. De Gruyter (O), 2015.

  3. [3] Karen A Esmonde-White, Maryann Cuellar, and Ian R Lewis. The role of Raman spectroscopy in biopharmaceuticals from development to manufacturing. Analytical and Bioanalytical Chemistry, 414(2):969–991, 2022.

  4. [4] Chi-Sing Ho, Neal Jean, Catherine A Hogan, Lena Blackmon, Stefanie S Jeffrey, Mark Holodniy, Niaz Banaei, Amr AE Saleh, Stefano Ermon, and Jennifer Dionne. Rapid identification of pathogenic bacteria using Raman spectroscopy and deep learning. Nature Communications, 10(1):4927, 2019.

  5. [5] Dario Bertazioli, Marco Piazza, Cristiano Carlomagno, Alice Gualerzi, Marzia Bedoni, and Enza Messina. An integrated computational pipeline for machine learning-driven diagnosis based on Raman spectra of saliva samples. Computers in Biology and Medicine, 171:108028, 2024.

  6. [6] Aaron R Flanagan and Frank G Glavin. Open-source Raman spectra of chemical compounds for active pharmaceutical ingredient development. Scientific Data, 12(1):498, 2025.

  7. [7] Alexander Echtermeyer, Caroline Marks, Alexander Mitsos, and Jörn Viell. Inline Raman spectroscopy and indirect hard modeling for concentration monitoring of dissociated acid species. Applied Spectroscopy, 75(5):506–519, 2021.

  8. [8] Ruihao Luo, Juergen Popp, and Thomas Bocklitz. Deep learning for Raman spectroscopy: A review. Analytica, 3(3):287–301, 2022.

  9. [9] Nick Erickson, Lennart Purucker, Andrej Tschalzev, David Holzmüller, Prateek Mutalik Desai, David Salinas, and Frank Hutter. TabArena: A living benchmark for machine learning on tabular data. In Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS), 2025. URL https://arxiv.org/abs/2506.16791.

  10. [10] Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou, Huai-Hong Yin, Tao Zhou, Jun-Peng Jiang, and Han-Jia Ye. TALENT: A tabular analytics and learning toolbox. Journal of Machine Learning Research, 26(226):1–16, 2025.

  11. [11] Hoang Anh Dau, Anthony Bagnall, Kaveh Kamgar, Chin-Chia Michael Yeh, Yan Zhu, Shaghayegh Gharghabi, Chotirat Ann Ratanamahatana, and Eamonn Keogh. The UCR time series archive. IEEE/CAA Journal of Automatica Sinica, 6(6):1293–1305, 2019.

  12. [12] Anthony Bagnall, Hoang Anh Dau, Jason Lines, Michael Flynn, James Large, Aaron Bostrom, Paul Southam, and Eamonn Keogh. The UEA multivariate time series classification archive, 2018. arXiv preprint arXiv:1811.00075, 2018.

  13. [13] Nicolas Coca-Lopez, Victor Alcolea-Rodriguez, Miguel A Bañares, Sandor Brockhauser, Julien Gorenflot, Alex Henderson, Ron Hildebrandt, Nina Jeliazkova, Nikolay Kochev, Enrique Lozano Diz, et al. Artificial intelligence-powered Raman spectroscopy through open science and FAIR principles. ACS Nano, 19(44):38189–38218, 2025.

  14. [14] Mark D Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, Jan-Willem Boiten, Luiz Bonino da Silva Santos, Philip E Bourne, et al. The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3(1):1–9, 2016.

  15. [15] Qiaohao Liang, Shyam Dwaraknath, and Kristin A. Persson. High-throughput computation and evaluation of Raman spectra. Scientific Data, 6:135, 2019. doi: 10.1038/s41597-019-0138-y.

  16. [16] Genesis Berlanga, Quentin Williams, and Nathan Temiquel. Convolutional neural networks as a tool for Raman spectral mineral classification under low signal, dusty Mars conditions. Earth and Space Science, 9(10):e2021EA002125, 2022.

  17. [17] J. Schuetzke, N. J. Szymanski, and M. Reischl. Validating neural networks for spectroscopic classification on a universal synthetic dataset. npj Computational Materials, 9:100, 2023. doi: 10.1038/s41524-023-01055-y.

  18. [18] Songlin Lu, Yuanfang Huang, Wan Xiang Shen, Yu Lin Cao, Mengna Cai, Yan Chen, Ying Tan, Yu Yang Jiang, and Yu Zong Chen. Raman spectroscopic deep learning with signal aggregated representations for enhanced cell phenotype and signature identification. PNAS Nexus, 3(8):pgae268, 2024. doi: 10.1093/pnasnexus/pgae268.

  19. [19] Dimitar Georgiev, Simon Vilms Pedersen, Ruoxiao Xie, Álvaro Fernández-Galiana, Molly M Stevens, and Mauricio Barahona. RamanSPy: An open-source Python package for integrative Raman spectroscopy data analysis. Analytical Chemistry, 96(21):8492–8500, 2024.

  20. [20] Jaume Béjar-Grimalt, Ángel Sánchez-Illana, Guillermo Quintás, Hugh J. Byrne, and David Pérez-Guaita. Monte Carlo peaks: Simulated datasets to benchmark machine learning algorithms for clinical spectroscopy. Chemometrics and Intelligent Laboratory Systems, 2025. doi: 10.1016/j.chemolab.2025.105548.

  21. [21]

    Christoph Lange, Madeline Altmann, Daniel Stors, Simon Seidel, Kyle Moynahan, Linda Cai, Stefan Born, Peter Neubauer, and M Nicolas Cruz Bournazou. Deep learning for Raman spectroscopy: Benchmarking models for upstream bioprocess monitoring. Measurement, page 118884, 2025

  22. [22]

    Christoph Lange, Maxim Borisyak, Martin Kögler, Stefan Born, Andreas Ziehe, Peter Neubauer, and M. Nicolas Cruz Bournazou. Comparing machine learning methods on Raman spectra from eight different spectrometers. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 334:125861, 2025. doi: 10.1016/j.saa.2025.125861

  23. [23]

    J. Lilek et al. Machine learning of Raman spectroscopic data: Comparison of different validation strategies. Journal of Raman Spectroscopy, 56(9):867–877, 2025. doi: 10.1002/jrs.6842

  24. [24]

    Zhuo Yang, Jiaqing Xie, Shuaike Shen, Daolang Wang, Yeyun Chen, Ben Gao, Shuzhou Sun, Biqing Qi, Dongzhan Zhou, Lei Bai, et al. SpectrumWorld: Artificial intelligence foundation for spectroscopy. arXiv preprint arXiv:2508.01188, 2025

  25. [25]

    Bingsen Xue, Xinyuan Bi, Zheyi Dong, Yunzhe Xu, Minghui Liang, Xin Fang, Yizhe Yuan, Ruoxi Wang, Shuyu Liu, Rushi Jiao, et al. Deep spectral component filtering as a foundation model for spectral analysis demonstrated in metabolic profiling. Nature Machine Intelligence, 7(5):743–757, 2025

  26. [26]

    Adithya Sineesh and Akshita Kamsali. Benchmarking deep learning models for Raman spectroscopy across open-source datasets. arXiv preprint arXiv:2601.16107, 2026

  27. [27]

    Jeppe Hagedorn, Guilherme Ramos, Miguel Ressurreição, Ernst Broberg Hansen, Michael Sokolov, Carlos Casado Vázquez, and Christos Panos. Raman-enabled predictions of protein content and metabolites in biopharmaceutical Saccharomyces cerevisiae fermentations. Engineering in Life Sciences, 24(12):e202400045, 2024

  28. [28]

    Fabian Feidl, Simone Garbellini, Martin F Luna, Sebastian Vogg, Jonathan Souquet, Hervé Broly, Massimo Morbidelli, and Alessandro Butté. Combining mechanistic modeling and Raman spectroscopy for monitoring antibody chromatographic purification. Processes, 7(10):683, 2019

  29. [29]

    Herman Wold. Soft modeling: the basic design and some extensions. Systems under indirect observation, Part II, 2:36–37, 1982

  30. [30]

    Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995

  31. [31]

    Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009

  32. [32]

    Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel Bowman. GLUE: A multi-task benchmark and analysis platform for natural language understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 353–355, 2018

  33. [33]

    Léo Grinsztajn, Edouard Oyallon, and Gaël Varoquaux. Why do tree-based models still outperform deep learning on typical tabular data? Advances in Neural Information Processing Systems, 35:507–520, 2022

  34. [34]

    Günter G Hoffmann. Infrared and Raman Spectroscopy: Principles and Applications. Walter de Gruyter GmbH & Co KG, 2023

  35. [35]

    Myung Jun Kim, Léo Grinsztajn, and Gaël Varoquaux. CARTE: Pretraining and transfer for tabular learning. arXiv preprint arXiv:2402.16785, 2024

  36. [36]

    Myung Jun Kim, Félix Lefebvre, Gaëtan Brison, Alexandre Perez-Lebel, and Gaël Varoquaux. Table foundation models: on knowledge pre-training for tabular learning. arXiv preprint arXiv:2505.14415, 2025

  37. [37]

    Katarína Rebrošová, Martin Šiler, Ota Samek, Filip Růžička, Silvie Bernatová, Veronika Holá, Jan Ježek, Pavel Zemánek, Jana Sokolová, and Petr Petráš. Rapid identification of staphylococci by Raman spectroscopy. Scientific Reports, 7(1):14846, 2017

  38. [38]

    Thomas Bocklitz, Angela Walter, Katharina Hartmann, Petra Rösch, and Jürgen Popp. How to pre-process Raman spectra for reliable and stable models? Analytica Chimica Acta, 704(1-2):47–56, 2011

  39. [39]

    Jinchao Liu, Margarita Osadchy, Lorna Ashton, Michael Foster, Christopher J Solomon, and Stuart J Gibson. Deep convolutional neural networks for Raman spectrum recognition: a unified solution. Analyst, 142(21):4067–4074, 2017

  40. [40]

    Lin Deng, Yuzhong Zhong, Maoning Wang, Xiujuan Zheng, and Jianwei Zhang. Scale-adaptive deep model for bacterial Raman spectra identification. IEEE Journal of Biomedical and Health Informatics, 26(1):369–378, 2021

  41. [41]

    Nabil Ibtehaz, Muhammad EH Chowdhury, Amith Khandakar, Serkan Kiranyaz, M Sohel Rahman, and Susu M Zughaier. RamanNet: a generalized neural network architecture for Raman spectrum analysis. Neural Computing and Applications, 35(25):18719–18735, 2023

  42. [42]

    Onur Can Koyun, Reyhan Kevser Keser, Safa Onur Sahin, Damla Bulut, Mustafa Yorulmaz, Veysel Yucesoy, and Behcet Ugur Toreyin. RamanFormer: A transformer-based quantification approach for Raman mixture components. ACS Omega, 9(22):23241–23251, 2024

  43. [43]

    Zilong Wang, Yunfeng Li, Jinglei Zhai, Siwei Yang, Biao Sun, and Pei Liang. Deep learning-based Raman spectroscopy qualitative analysis algorithm: A convolutional neural network and transformer approach. Talanta, 275:126138, 2024

  44. [44]

    Pengju Ren, Ri-gui Zhou, and Yaochong Li. A self-supervised learning method for Raman spectroscopy based on masked autoencoders. Expert Systems with Applications, page 128576, 2025

  45. [45]

    Noah Hollmann, Samuel Müller, Katharina Eggensperger, and Frank Hutter. TabPFN: A transformer that solves small tabular classification problems in a second. In International Conference on Learning Representations, 2023

  46. [46]

    Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. Accurate predictions on small data with a tabular foundation model. Nature, 637(8045):319–326, 2025. doi: 10.1038/s41586-024-08328-6

  47. [47]

    Jingang Qu, David Holzmüller, Gaël Varoquaux, and Marine Le Morvan. TabICL: A tabular foundation model for in-context learning on large data. In International Conference on Machine Learning, 2025

  48. [48]

    Jingang Qu, David Holzmüller, Gaël Varoquaux, and Marine Le Morvan. TabICLv2: A better, faster, scalable, and open tabular foundation model. arXiv preprint arXiv:2602.11139, 2026

  49. [49]

    Angus Dempster, François Petitjean, and Geoffrey I Webb. ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge Discovery, 34(5):1454–1495, 2020

  50. [50]

    Matthew Middlehurst, James Large, Michael Flynn, Jason Lines, Aaron Bostrom, and Anthony Bagnall. HIVE-COTE 2.0: a new meta ensemble for time series classification. Machine Learning, 110(11):3211–3243, 2021

  51. [51]

    Roman Bushuiev, Anton Bushuiev, Niek F de Jonge, Adamo Young, Fleming Kretschmer, Raman Samusevich, Janne Heirman, Fei Wang, Luke Zhang, Kai Dührkop, et al. MassSpecGym: A benchmark for the discovery and identification of molecules. Advances in Neural Information Processing Systems, 37:110010–110027, 2024

  52. [52]

    Fanjie Xu, Wentao Guo, Feng Wang, Lin Yao, Hongshuai Wang, Fujie Tang, Zhifeng Gao, Linfeng Zhang, Weinan E, Zhong-Qun Tian, et al. Toward a unified benchmark and framework for deep learning-based prediction of nuclear magnetic resonance chemical shifts. Nature Computational Science, 5(4):292–300, 2025

  53. [53]

    Luise F Kaven, Hanna JM Wolff, Lukas Wille, Matthias Wessling, Alexander Mitsos, and Joern Viell. In-line monitoring of microgel synthesis: flow versus batch reactor. Organic Process Research & Development, 25(9):2039–2051, 2021

  54. [54]

    Martin Kögler, Andrea Paul, Emmanuel Anane, Mario Birkholz, Alex Bunker, Tapani Viitala, Michael Maiwald, Stefan Junne, and Peter Neubauer. Comparison of time-gated surface-enhanced Raman spectroscopy (TG-SERS) and classical SERS based monitoring of Escherichia coli cultivation samples. Biotechnology Progress, 34(6):1533–1542, 2018

  55. [55]

    Marcelo Terán, José Javier Ruiz, Pablo Loza-Alvarez, David Masip, and David Merino. Open Raman spectral library for biomolecule identification. Chemometrics and Intelligent Laboratory Systems, 264:105476, 2025

  56. [56]

    Serena Rizzo, Yannick Weesepoel, Sara Erasmus, Joost Sinkeldam, Anna Lisa Piccinelli, and Saskia van Ruth. Dataset of Raman and surface-enhanced Raman spectroscopy spectra of illicit adulterants added to dietary supplements. 2023

  57. [57]

    Rui Zhang, Huimin Xie, Shuning Cai, Yong Hu, Guo-kun Liu, Wenjing Hong, and Zhong-qun Tian. Transfer-learning-based Raman spectra identification. Journal of Raman Spectroscopy, 51(1):176–186, 2020

  58. [58]

    Ricardo Knauer, Marvin Grimm, and Erik Rodner. PMLBmini: A tabular classification benchmark suite for data-scarce applications. arXiv preprint arXiv:2409.01635, 2024

  59. [59]

    Yury Gorishniy, Ivan Rubachev, Valentin Khrulkov, and Artem Babenko. Revisiting deep learning models for tabular data. Advances in Neural Information Processing Systems, 34:18932–18943, 2021

  60. [60]

    Thomas Bachlechner, Bodhisattwa Prasad Majumder, Henry Mao, Gary Cottrell, and Julian McAuley. ReZero is all you need: Fast convergence at large depth. In Uncertainty in Artificial Intelligence, pages 1352–1361. PMLR, 2021

  61. [61]

    Guri Zabërgja, Arlind Kadra, Christian MM Frey, and Josif Grabocka. Tabular data: Is deep learning all you need? arXiv preprint arXiv:2402.03970, 2024

  62. [62]

    Zihang Dai, Hanxiao Liu, Quoc V Le, and Mingxing Tan. CoAtNet: Marrying convolution and attention for all data sizes. Advances in Neural Information Processing Systems, 34:3965–3977, 2021

  63. [63]

    Xiyuan Zhang, Danielle C Maddix, Junming Yin, Nick Erickson, Abdul Fatir Ansari, Boran Han, Shuai Zhang, Leman Akoglu, Christos Faloutsos, Michael W Mahoney, et al. Mitra: Mixed synthetic priors for enhancing tabular foundation models. arXiv preprint arXiv:2510.21204, 2025

  64. [64]

    Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh, Alex Labach, Hamidreza Kamkari, Jesse C Cresswell, Keyvan Golestan, Guangwei Yu, Anthony L Caterini, and Maksims Volkovs. TabDPT: Scaling tabular foundation models on real data. arXiv preprint arXiv:2410.18164, 2024

  65. [65]

    Yury Gorishniy, Akim Kotelnikov, and Artem Babenko. TabM: Advancing tabular deep learning with parameter-efficient ensembling. arXiv preprint arXiv:2410.24210, 2025. URL http://arxiv.org/abs/2410.24210

  66. [66]

    Bo Liu, Kunxiang Liu, Xiaoqing Qi, Weijia Zhang, and Bei Li. Classification of deep-sea cold seep bacteria by transformer combined with Raman spectroscopy. Scientific Reports, 13(1):3240, 2023

  67. [67]

    Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, and Alexander Smola. AutoGluon-Tabular: Robust and accurate AutoML for structured data. arXiv preprint arXiv:2003.06505, 2020

  68. [68]

    David Salinas and Nick Erickson. TabRepo: A large scale repository of tabular model evaluations and its AutoML applications. In AutoML Conference 2024 (ABCD Track), 2024

  69. [69]

    Janez Demšar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7(Jan):1–30, 2006

  70. [70]

    Sara Mostafapour, Thomas Dörfer, Ralf Heinke, Petra Rösch, Jürgen Popp, and Thomas Bocklitz. Investigating the effect of different pre-treatment methods on Raman spectra recorded with different excitation wavelengths. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 302:123100, 2023

  71. [71]

    Amy Brand, Liz Allen, Micah Altman, Marjorie Hlava, and Jo Scott. Beyond authorship: attribution, contribution, collaboration, and credit. Learned Publishing, 28(2):151–155, 2015. doi: 10.1087/20150211. URL https://onlinelibrary.wiley.com/doi/abs/10.1087/20150211

  72. [72]

    Arpad E Elo. The proposed USCF rating system, its development, theory, and applications. Chess Life, 22(8):242–247, 1967

  73. [73]

    Steffen Herbold. Autorank: A Python package for automated ranking of classifiers. Journal of Open Source Software, 5(48):2173, 2020. doi: 10.21105/joss.02173

  74. [74]

    François Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1251–1258, 2017

  75. [75]

    Mengxin Sun, Kunhong Liu, Qingqi Hong, and Beizhan Wang. A new ECOC algorithm for multiclass microarray data classification. In 2018 24th International Conference on Pattern Recognition (ICPR), pages 454–458. IEEE, 2018

  76. [76]

    Christoph Lange, Simon Seidel, Madeline Altmann, Daniel Stors, Annina Kemmer, Linda Cai, Stefan Born, Peter Neubauer, and M Nicolas Cruz Bournazou. A setup for automatic Raman measurements in high-throughput experimentation. Biotechnology and Bioengineering, 122(10):2751–2769, 2025

  77. [77]

    Robin Legner, Alexander Wirtz, Tim Koza, Till Tetzlaff, Anna Nickisch-Hartfiel, and Martin Jaeger. Application of green analytical chemistry to a green chemistry process: Magnetic resonance and Raman spectroscopic process monitoring of continuous ethanolic fermentation. Biotechnology and Bioengineering, 116(11):2874–2883, 2019

  78. [78]

    Christoph Lange, Isabel Thiele, Lara Santolin, Sebastian L Riedel, Maxim Borisyak, Peter Neubauer, and Mariano Nicolas Cruz-Bournazou. Data augmentation scheme for Raman spectra with highly correlated annotations. In Computer Aided Chemical Engineering, volume 53, pages 3055–3060. Elsevier, 2024

  79. [79]

    Melanie Voigt, Robin Legner, Simon Haefner, Anatoli Friesen, Alexander Wirtz, and Martin Jaeger. Using fieldable spectrometers and chemometric methods to determine RON of gasoline from petrol stations: A comparison of low-field 1H NMR @ 80 MHz, handheld Raman and benchtop NIR. Fuel, 236:829–835, 2019

  80. [80]

    Robin Legner, Melanie Voigt, Alexander Wirtz, Anatoli Friesen, Simon Haefner, and Martin Jaeger. Using compact proton nuclear magnetic resonance at 80 MHz and vibrational spectroscopies and data fusion for research octane number and gasoline additive determination. Energy & Fuels, 34(1):103–110, 2019

Showing first 80 references.