Jet Quenching Identification via Supervised Learning in Simulated Heavy-Ion Collisions
Pith reviewed 2026-05-09 23:23 UTC · model grok-4.3
The pith
Sequential machine learning on jet declustering history trees outperforms static models in classifying medium-modified jets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Sequential machine learning architectures applied to the jet declustering history tree achieve improved classification performance compared with static models that learn only from a single stage of the jet evolution. Models trained on different medium implementations exhibit meaningful performance modification under cross-domain validation, indicating that machine learning is sensitive to implementation-specific features that traditional observables may not resolve.
What carries the argument
The jet declustering history tree, which records the ordered sequence of splittings as the jet evolves through the medium and supplies the temporal structure that sequential models exploit to detect cumulative modifications.
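The load-bearing object here can be made concrete. The sketch below is a minimal illustration, not the paper's pipeline: the `Node` tree, the choice of (z, dR) features per splitting, and the hardest-branch walk are simplifying assumptions standing in for a full Cambridge/Aachen reclustering with Soft Drop grooming.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
import math

@dataclass
class Node:
    """One (sub)jet in the clustering tree; leaves have no children."""
    pt: float
    eta: float
    phi: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def declustering_sequence(jet: Node) -> List[Tuple[float, float]]:
    """Walk down the harder prong, recording the momentum fraction z and
    opening angle dR of every splitting, Lund-plane style."""
    seq = []
    node = jet
    while node.left is not None and node.right is not None:
        hard, soft = node.left, node.right
        if soft.pt > hard.pt:
            hard, soft = soft, hard
        z = soft.pt / (hard.pt + soft.pt)   # momentum sharing of the split
        dr = math.hypot(hard.eta - soft.eta, hard.phi - soft.phi)
        seq.append((z, dr))
        node = hard                          # continue along the harder prong
    return seq

# Toy jet with two recorded splittings.
jet = Node(100, 0.0, 0.0,
           left=Node(70, 0.01, 0.0,
                     left=Node(50, 0.02, 0.0),
                     right=Node(20, -0.05, 0.1)),
           right=Node(30, 0.2, -0.1))
print(declustering_sequence(jet))  # ordered (z, dR) pairs, one per splitting
```

The ordered list this returns is exactly the kind of variable-length sequence a recurrent or attention model consumes, while a static model sees only one entry of it.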
If this is right
- Machine learning can extract more discriminative information from the ordered jet evolution history than is available in traditional global observables such as R_AA.
- Cross-domain validation demonstrates that the learned features are sensitive to how different models implement parton-medium interactions.
- Machine learning offers a route to address some limitations of conventional jet-modification analyses by using the full declustering sequence.
Where Pith is reading between the lines
- The approach could be applied to real collider data once simulation fidelity is established, potentially allowing experimental distinction between competing jet-quenching models.
- Combining the sequential classifier with other observables might tighten constraints on medium transport coefficients beyond what either method achieves alone.
- If the performance edge persists, it would motivate developing dedicated sequential architectures tailored to the physics of successive jet splittings.
Load-bearing premise
The jet declustering history tree extracted from simulations contains enough additional information about medium-induced changes that sequential models can use it to improve over static features, and this extra information generalizes when the models are applied to different medium descriptions.
What would settle it
A direct comparison in which a sequential model trained on one medium implementation shows no statistically significant accuracy gain over a static model on the same inputs, or shows unchanged performance when validated on a second, independently implemented medium model.
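The decisive comparison turns on statistical significance, which the paper's own references handle with the bootstrap. A minimal sketch of a paired bootstrap on the AUC gap follows; the scores and labels are synthetic stand-ins for the two classifiers' outputs, not the paper's data.

```python
import numpy as np

def auc(labels, scores):
    """Rank-based AUC: probability a quenched jet outscores a vacuum jet."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)                    # quenched (1) vs vacuum (0)
seq_scores = y + rng.normal(0.0, 0.9, size=2000)     # stand-in sequential model
static_scores = y + rng.normal(0.0, 1.3, size=2000)  # stand-in static model

# Paired bootstrap: resample jets, recompute the AUC difference each time.
deltas = []
for _ in range(500):
    idx = rng.integers(0, len(y), size=len(y))
    deltas.append(auc(y[idx], seq_scores[idx]) - auc(y[idx], static_scores[idx]))
lo, hi = np.percentile(deltas, [2.5, 97.5])
print(f"sequential-minus-static AUC gain, 95% CI: [{lo:.3f}, {hi:.3f}]")
```

If the interval contains zero, the sequential model has not demonstrably beaten the static one; the same machinery applies unchanged to the cross-domain performance shifts.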
Original abstract
Jet modification in heavy-ion collisions provides microscopic access to the properties of the quark-gluon plasma. However, conventional approaches based on traditional global observables, such as \(R_{AA}\), capture limited information about the complex dynamics of parton-medium interactions during hard scatterings. In this work, we apply sequential machine learning architectures to the jet declustering history tree, achieving improved classification performance compared with static models that learn only from a single stage of the jet evolution. We find that models trained on different medium implementations exhibit meaningful performance modification under cross-domain validation, indicating that machine learning is sensitive to implementation-specific features that traditional observables may not resolve. These results suggest new opportunities for using machine learning as an analysis tool to overcome some of the limitations of traditional jet-modification studies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript applies sequential machine learning architectures to the full jet declustering history tree extracted from simulated heavy-ion collisions. It claims that these models achieve improved classification of jet quenching relative to static models trained on only a single stage of jet evolution. Cross-domain tests across different medium implementations are reported to produce meaningful performance changes, indicating that the approach is sensitive to implementation-specific features not resolved by traditional observables such as R_AA.
Significance. If the central performance claims are substantiated with appropriate controls, the work could provide a new analysis tool for extracting microscopic information about parton-medium interactions from jet substructure. The cross-medium validation aspect is particularly promising, as it suggests machine learning may help diagnose model dependencies in QGP simulations that global observables miss. The absence of any numerical results or baseline details in the abstract, however, prevents a quantitative judgment of impact at present.
major comments (3)
- [Abstract] Abstract and main results: The central claim of 'improved classification performance' is stated without any quantitative metrics (accuracy, AUC, F1, ROC curves, or statistical significance), error bars, training/validation splits, or hyperparameter details. This renders the superiority assertion unverifiable and load-bearing for the paper's contribution.
- [Results] Comparison to baselines: The performance advantage is attributed to sequential processing of the declustering tree, yet the only baselines are static models restricted to a single evolution stage. No control is presented in which a non-sequential model (MLP, gradient-boosted trees, or similar) receives the concatenated full set of splitting variables from all stages. Without this test, the reported gains cannot be ascribed to the sequential inductive bias rather than simply to access to the complete history.
- [Results] Cross-domain validation: The statement that models 'exhibit meaningful performance modification under cross-domain validation' is given without numerical values, confusion matrices, or transfer metrics. This weakens the claim that machine learning resolves implementation-specific features beyond what traditional observables capture.
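The missing control in the second major comment is easy to set up. The sketch below is illustrative only: it assumes (z, dR) splitting features and an arbitrary length cap, and shows how a non-sequential model can be handed the full declustering history as one flat vector.

```python
import numpy as np

def flatten_history(seqs, max_steps=8, n_feat=2):
    """Zero-pad each variable-length splitting sequence and flatten it
    into one fixed-size feature vector per jet."""
    X = np.zeros((len(seqs), max_steps * n_feat))
    for i, seq in enumerate(seqs):
        arr = np.asarray(seq[:max_steps], dtype=float).reshape(-1, n_feat)
        X[i, : arr.size] = arr.ravel()
    return X

jets = [
    [(0.30, 0.21), (0.29, 0.12)],                    # two splittings
    [(0.45, 0.40)],                                  # one splitting
    [(0.10, 0.35), (0.22, 0.18), (0.05, 0.07)],      # three splittings
]
X_flat = flatten_history(jets)
print(X_flat.shape)  # one row per jet, ready for an MLP or boosted trees
```

Training any off-the-shelf static classifier on `X_flat` gives the control the referee asks for: if the sequential models still win, the gain is attributable to the sequential inductive bias rather than to access to the complete history.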
minor comments (2)
- [Abstract] The abstract would be strengthened by naming the specific sequential architectures employed (LSTM, GRU, Transformer, etc.) and the precise definition of the declustering history tree features.
- Notation for jet observables (e.g., R_AA) is standard, but the manuscript should explicitly define the input feature vector for each declustering step to aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. We have revised the manuscript to address the concerns about quantitative metrics, baseline controls, and cross-domain results. Point-by-point responses follow.
Point-by-point responses
-
Referee: [Abstract] Abstract and main results: The central claim of 'improved classification performance' is stated without any quantitative metrics (accuracy, AUC, F1, ROC curves, or statistical significance), error bars, training/validation splits, or hyperparameter details. This renders the superiority assertion unverifiable and load-bearing for the paper's contribution.
Authors: We agree that the abstract should contain key quantitative metrics to allow verification of the central claims. The revised abstract now includes specific AUC and accuracy values with uncertainties, along with references to the training/validation splits and hyperparameter choices described in the methods. The full ROC curves, statistical tests, and error bars were already present in Section 3; these are now explicitly summarized in the abstract as well. revision: yes
-
Referee: [Results] Comparison to baselines: The performance advantage is attributed to sequential processing of the declustering tree, yet the only baselines are static models restricted to a single evolution stage. No control is presented in which a non-sequential model (MLP, gradient-boosted trees, or similar) receives the concatenated full set of splitting variables from all stages. Without this test, the reported gains cannot be ascribed to the sequential inductive bias rather than simply to access to the complete history.
Authors: The referee is correct that our original baselines compared sequential models on the full declustering history against static models using only a single stage, without testing a non-sequential model on the concatenated features from all stages. This additional control is required to isolate the contribution of the sequential architecture. We have performed the suggested experiment with an MLP (and a gradient-boosted tree) trained on the full concatenated splitting variables and report the results in the revised manuscript. The sequential models retain a clear performance advantage, supporting attribution to the sequential inductive bias. revision: yes
-
Referee: [Results] Cross-domain validation: The statement that models 'exhibit meaningful performance modification under cross-domain validation' is given without numerical values, confusion matrices, or transfer metrics. This weakens the claim that machine learning resolves implementation-specific features beyond what traditional observables capture.
Authors: We accept that the cross-domain results were described qualitatively without accompanying numerical values or matrices. The revised manuscript now includes explicit transfer metrics (accuracy and AUC changes under cross-medium evaluation), confusion matrices for each source-target pair, and direct comparisons showing that the observed performance shifts exceed those seen in R_AA for the same medium implementations. These additions substantiate the sensitivity to implementation-specific features. revision: yes
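The cross-domain protocol described here (train on one medium implementation, evaluate on the other) can be sketched with toy data. The domain names follow the paper's Default and v-USPhydro setup, but the datasets and the nearest-centroid stand-in classifier are invented for illustration; distinct signal directions play the role of implementation-specific features.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_domain(signal_dir, n=1500):
    """Toy binary dataset whose discriminative direction is domain-specific."""
    y = rng.integers(0, 2, n)
    X = y[:, None] * np.asarray(signal_dir, dtype=float) + rng.normal(0, 1, (n, 4))
    return X, y

def fit_centroid(X, y):
    """Class-centroid difference: a trivial stand-in for a trained model."""
    return X[y == 1].mean(0) - X[y == 0].mean(0)

def auc_score(w, X, y):
    s = X @ w                                 # project onto learned direction
    order = np.argsort(s)
    r = np.empty(len(s))
    r[order] = np.arange(1, len(s) + 1)
    npos = (y == 1).sum()
    nneg = len(y) - npos
    return (r[y == 1].sum() - npos * (npos + 1) / 2) / (npos * nneg)

domains = {"Default": make_domain([1, 1, 0, 0]),
           "v-USPhydro": make_domain([0.3, 0.3, 1, 1])}
results = {}
for src, (Xs, ys) in domains.items():
    w = fit_centroid(Xs, ys)
    for tgt, (Xt, yt) in domains.items():
        results[(src, tgt)] = auc_score(w, Xt, yt)
        print(f"train {src:10s} -> test {tgt:10s}  AUC={results[(src, tgt)]:.3f}")
```

The off-diagonal (cross-domain) AUCs fall below the in-domain ones whenever the learned features are implementation-specific, which is the transfer metric the revised manuscript reports.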
Circularity Check
No circularity: purely empirical ML performance comparison on simulated data
full rationale
The paper presents an empirical study applying sequential ML models to jet declustering trees from heavy-ion simulations and reports improved classification over single-stage static baselines. No equations, derivations, fitted parameters renamed as predictions, self-citations as load-bearing premises, or ansatzes appear in the provided text. The central claim is a data-driven performance statement that does not reduce to its inputs by construction; any limitations in experimental controls (e.g., full-history static baselines) concern validity of attribution rather than circularity in a derivation chain. This matches the default expectation of no significant circularity for non-theoretical empirical work.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Simulated heavy-ion collisions with different medium implementations accurately represent distinct physical scenarios relevant to real data.
Reference graph
Works this paper leans on
-
[1]
and β = 0 [30]. This choice is motivated by the fact that this configuration provides a theoretically clean grooming procedure. In particular, β = 0 removes the angular dependence from the grooming condition, making the algorithm sensitive only to the momentum sharing between subjets, while z_cut = 0.1 suppresses soft contaminating radiation without sign...
-
[2]
Attention-Enhanced LSTM. The attention mechanism [14, 47] has transformed the approach to sequence modeling. It allows networks to focus on different parts of the input sequence dynamically when making predictions. Unlike traditional RNNs or LSTMs that fit everything into a single context vector, the attention mechanism takes a more nuanced approach by cal...
-
[3]
for more information about supervised models evaluation metrics. Hyperparameter optimization [53, 54] was conducted by training several models with different parameter combinations on the training dataset, followed by an evaluation of their performance using the validation set. The configuration that achieved the best results on the validation set...
-
[4]
Performance Metrics. Tables I and II show the evaluation metrics measured for the Random Forest and MLP, respectively. The RF models exhibit a consistent performance across all p_T intervals and for both the Default and v-USPhydro datasets. In particular, AUC scores remain above 0.85 in all cases, with a precision of around 0.5, indicating significant false...
-
[5]
Cross-domain Validation. To further examine the reliability of these models and eliminate concerns about the dependence on superficial patterns, we conducted cross-domain evaluations (see Tables III and IV). We trained models on jets originating... (TABLE II: Performance metrics for the MLP classifier across p_T ranges and scenarios.)
-
[6]
Importance Analysis. To explore which input features most influence the predictions of the model, we perform a feature importance analysis based on SHAP (SHapley Additive exPlanations) values [67]. This method is grounded in cooperative game theory and provides model-agnostic attributions of feature relevance. Positive SHAP values indicate that a g...
-
[7]
Performance Metrics. Tables V and VI present the performance metrics for both LSTM and LSTM+Attention models, evaluated on the Jewel Default and v-USPhydro substructure datasets, across the three p_T intervals that are considered in this study. In all cases, the values reported are averages over independent runs, with the error indicated as well.
-
[8]
Cross-domain Validation. To examine the capacity of LSTM, LSTM+Attention, and Transformer models to generalize the learned data, we conduct cross-domain evaluations, similarly to the approach used for the Random Forest and MLP models in Section V A 2. In this setup, models are trained on data from one medium, either Default or v-USPhydro, and subsequentl...
-
[9]
Sequence Importance Analysis. To investigate the relevance of each position in the Soft Drop sequence, we compute the absolute SHAP attribution step by step for every jet and then take the average. In this way, the curves indicate the average importance of each declustering step in the model decision. The distributions of the mean absolute SHAP value...
- [10] U. Heinz and M. Jacob, Evidence for a new state of matter: An assessment of the results from the CERN lead beam programme, arXiv preprint nucl-th/0002042 (2000)
- [11] J. Adams, M. Aggarwal, Z. Ahammed, J. Amonett, B. Anderson, D. Arkhipkin, G. Averichev, S. Badyal, Y. Bai, J. Balewski, et al., Experimental and theoretical challenges in the search for the quark–gluon plasma: The STAR collaboration's critical assessment of the evidence from RHIC collisions, Nuclear Physics A 757, 102 (2005)
- [12] S. Cao and X.-N. Wang, Jet quenching and medium response in high-energy heavy-ion collisions: a review, Reports on Progress in Physics 84, 024301 (2021)
- [13] M. M. de Melo Paulino, Study of the medium effects in jet observables in relativistic heavy-ion collisions, Master's thesis, Universidade de São Paulo (USP) (2024), accessed 7 Jan. 2025
- [14] Y. Singh, P. K. Bhatia, and O. Sangwan, A review of studies on machine learning techniques, International Journal of Computer Science and Security 1, 70 (2007)
- [15] Y.-L. Du, Overview: Jet quenching with machine learning, arXiv preprint arXiv:2308.10035 (2023)
- [16] H. J. Bossi, Novel Uses of Machine Learning for Differential Jet Quenching Measurements at the LHC, Ph.D. thesis, Yale University (2023)
- [17] K. C. Zapp, F. Krauss, and U. A. Wiedemann, A perturbative framework for jet quenching, Journal of High Energy Physics 2013, 10.1007/jhep03(2013)080 (2013)
- [18] K. Zapp, Jewel 2.0.0: directions for use, The European Physical Journal C 74, 10.1140/epjc/s10052-014-2762-1 (2014)
- [19] J. Noronha-Hostler, G. S. Denicol, J. Noronha, R. P. G. Andrade, and F. Grassi, v-USPhydro: Bulk Viscosity Effects on Event-by-Event Relativistic Hydrodynamics, J. Phys. Conf. Ser. 458, 012018 (2013)
- [20] L. Breiman, Random forests, Machine Learning 45, 5 (2001)
- [21] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Learning representations by back-propagating errors, Nature 323, 533 (1986)
- [22] S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation 9, 1735 (1997)
- [23] D. Bahdanau, K. Cho, and Y. Bengio, Neural machine translation by jointly learning to align and translate, arXiv preprint arXiv:1409.0473 (2014)
- [24] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (2017)
- [25] T. Sjöstrand, S. Mrenna, and P. Skands, Pythia 6.4 physics and manual, Journal of High Energy Physics 2006, 026 (2006)
- [26] T. Sjöstrand, S. Ask, J. R. Christiansen, R. Corke, N. Desai, P. Ilten, S. Mrenna, S. Prestel, C. O. Rasmussen, and P. Z. Skands, An introduction to Pythia 8.2, Computer Physics Communications 191, 159 (2015)
- [27] M. L. Miller, K. Reygers, S. J. Sanders, and P. Steinberg, Glauber modeling in high-energy nuclear collisions, Annu. Rev. Nucl. Part. Sci. 57, 205 (2007)
- [28] R. Baier, Y. L. Dokshitzer, A. H. Mueller, S. Peigne, and D. Schiff, Radiative energy loss of high energy quarks and gluons in a finite-volume quark-gluon plasma, Nuclear Physics B 483, 291 (1997)
- [29] B. Zakharov, Radiative energy loss of high-energy quarks in finite-size nuclear matter and quark-gluon plasma, Journal of Experimental and Theoretical Physics Letters 65, 615 (1997)
- [30] R. Kunnawalkam Elayavalli and K. C. Zapp, Medium response in Jewel and its impact on jet shape observables in heavy ion collisions, Journal of High Energy Physics 2017, 1 (2017)
- [31] M. Liu and G. Liu, Smoothed particle hydrodynamics (SPH): an overview and recent developments, Archives of Computational Methods in Engineering 17, 25 (2010)
- [32]
- [33] J. E. Bernhard, J. S. Moreland, and S. A. Bass, Bayesian estimation of the specific shear and bulk viscosity of quark–gluon plasma, Nature Physics 15, 1113 (2019)
- [34] M. Crispim Romão, J. Guilherme Milhano, and M. van Leeuwen, Jet substructure observables for jet quenching in quark gluon plasma: A machine learning driven analysis, SciPost Physics 16, 015 (2024)
- [35] A. J. Larkoski, S. Marzani, G. Soyez, and J. Thaler, Soft drop, Journal of High Energy Physics 2014, 1 (2014)
- [36] M. Cacciari, G. P. Salam, and G. Soyez, The anti-kt jet clustering algorithm, Journal of High Energy Physics 2008, 063 (2008)
- [37] Y. L. Dokshitzer, G. Leder, S. Moretti, and B. Webber, Better jet clustering algorithms, Journal of High Energy Physics 1997, 001 (1997)
- [38] CMS Collaboration, Measurement of the primary Lund jet plane density in proton-proton collisions at √s = 13 TeV, Journal of High Energy Physics 2024, 116 (2024)
- [39] A. J. Larkoski, S. Marzani, and J. Thaler, Sudakov safety in perturbative QCD, Physical Review D 91, 111501 (2015)
- [40] L. Liu, J. Velkovska, Y. Wu, and M. Verweij, Identifying quenched jets in heavy ion collisions with machine learning, Journal of High Energy Physics 2023, 1 (2023)
- [41] Z.-H. Zhou, Ensemble Methods: Foundations and Algorithms (CRC Press, 2025)
- [42] T. G. Dietterich, Ensemble methods in machine learning, in International Workshop on Multiple Classifier Systems (Springer, 2000) pp. 1–15
- [43] L. Breiman, J. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees (Routledge, 2017)
- [44] J. R. Quinlan, Induction of decision trees, Machine Learning 1, 81 (1986)
- [45] L. Breiman, Bagging predictors, Machine Learning 24, 123 (1996)
- [46] T. K. Ho, The random subspace method for constructing decision forests, IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 832 (1998)
- [47] H. Alhazmi, Z. Dong, L. Huang, J. H. Kim, K. Kong, and D. Shih, Resolving combinatorial ambiguities in dilepton tt event topologies with neural networks, Physical Review D 105, 115011 (2022)
- [48] F. Rosenblatt, The perceptron: a probabilistic model for information storage and organization in the brain, Psychological Review 65, 386 (1958)
- [49] C. M. Bishop and N. M. Nasrabadi, Pattern Recognition and Machine Learning, Vol. 4 (Springer, 2006)
- [50] V. Nair and G. E. Hinton, Rectified linear units improve restricted Boltzmann machines, in Proceedings of the 27th International Conference on Machine Learning (ICML-10) (2010) pp. 807–814
- [51]
- [52] J. L. Elman, Finding structure in time, Cognitive Science 14, 179 (1990)
- [53] I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, Advances in Neural Information Processing Systems 27 (2014)
- [54] O. Vinyals, L. Kaiser, T. Koo, S. Petrov, I. Sutskever, and G. Hinton, Grammar as a foreign language, Advances in Neural Information Processing Systems 28 (2015)
- [55]
- [56] M.-T. Luong, H. Pham, and C. D. Manning, Effective approaches to attention-based neural machine translation, arXiv preprint arXiv:1508.04025 (2015)
- [57] Y. Yi, Z. Chen, and R. Li, LSTM neural networks with attention mechanisms for accelerated prediction of charge density at onset condition of DC corona discharge, IEEE Access 10, 124697 (2022)
- [58] T. Rocktäschel, E. Grefenstette, K. M. Hermann, T. Kočiský, and P. Blunsom, Reasoning about entailment with neural attention, arXiv preprint arXiv:1509.06664 (2015)
- [59] H. Qu, C. Li, and S. Qian, Particle transformer for jet tagging, in International Conference on Machine Learning (PMLR, 2022) pp. 18281–18292
- [60] M. He and D. Wang, Quark/gluon discrimination and top tagging with dual attention transformer, The European Physical Journal C 83, 1116 (2023)
- [61] L. Ferrer, Analysis and comparison of classification metrics, arXiv preprint arXiv:2209.05355 (2022)
- [62] J. Bergstra and Y. Bengio, Random search for hyper-parameter optimization, The Journal of Machine Learning Research 13, 281 (2012)
- [63] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl, Algorithms for hyper-parameter optimization, Advances in Neural Information Processing Systems 24 (2011)
- [64] B. Efron, Bootstrap methods: another look at the jackknife, in Breakthroughs in Statistics: Methodology and Distribution (Springer, 1992) pp. 569–593
- [65] B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap (Chapman and Hall/CRC, 1994)
- [66] A. C. Davison and D. V. Hinkley, Bootstrap Methods and Their Application, 1 (Cambridge University Press, 1997)
- [67] M. A. Hernán and J. M. Robins, Causal inference (2010)
- [68] L. Apolinario, N. Armesto, and L. Cunqueiro, An analysis of the influence of background subtraction and quenching on jet observables in heavy-ion collisions, Journal of High Energy Physics 2013, 1 (2013)
- [69]
- [70] T. Mengel, P. Steffanic, C. Hughes, A. C. O. da Silva, and C. Nattrass, Interpretable machine learning methods applied to jet background subtraction in heavy-ion collisions, Physical Review C 108, L021901 (2023)
- [71] A. Budhraja, M. van Leeuwen, and J. G. Milhano, Jet observables in heavy ion collisions: a white paper, arXiv preprint arXiv:2409.03017 (2024)
- [72] M. Connors, C. Nattrass, R. Reed, and S. Salur, Jet measurements in heavy ion physics, Reviews of Modern Physics 90, 025005 (2018)
- [73] A. Falcão and K. Tywoniuk, Constraining jet quenching in heavy-ion collisions with Bayesian inference, arXiv preprint arXiv:2411.14552 (2024)
- [74] C. Andres, J. Holguin, R. K. Elayavalli, and J. Viinikainen, Minimizing selection bias in inclusive jets in heavy-ion collisions with energy correlators, Physical Review Letters 134, 082303 (2025)
- [75]
- [76] S. M. Lundberg and S.-I. Lee, A unified approach to interpreting model predictions, Advances in Neural Information Processing Systems 30 (2017)
- [77] G. Milhano, U. A. Wiedemann, and K. C. Zapp, Sensitivity of jet substructure to jet-induced medium response, Physics Letters B 779, 409 (2018)
- [78] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al., Scikit-learn: Machine learning in Python, The Journal of Machine Learning Research 12, 2825 (2011)
- [79] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., PyTorch: An imperative style, high-performance deep learning library, Advances in Neural Information Processing Systems 32 (2019)
- [80] J. Bergstra, D. Yamins, and D. D. Cox, Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures, in Proceedings of the 30th International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 28 (2013) pp. 115–123