
arxiv: 2605.22820 · v1 · pith:JTVQC7L4new · submitted 2026-05-21 · 💻 cs.LG

Integrable Elasticity via Neural Demand Potentials

Pith reviewed 2026-05-22 06:28 UTC · model grok-4.3

classification 💻 cs.LG
keywords demand estimation · neural networks · price elasticities · retail demand · cross-price effects · integrable models · multiproduct pricing · scanner data

The pith

Neural network learns smooth log-demand to derive exact, stable elasticities

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes modeling multiproduct retail demand by training a neural network to output log-demand as a smooth function of log-prices, where the function is also conditioned on context. Elasticities are then obtained by taking derivatives of this single learned surface, which enforces consistency across own-price and cross-price responses. On the Dominick's beer dataset the resulting model produces better predictions on held-out observations than a standard directed log-log regression and returns more stable estimates for cross-price effects that are hard to identify from data alone. A reader would care because retailers rely on these numbers for pricing decisions that account for substitution between products.

Core claim

The Integrable Context-Dependent Demand Network learns log-demand as a smooth, context-conditioned function of log-prices, allowing elasticities to be derived exactly from the learned demand surface. On the Dominick's beer dataset, ICDN improves out-of-sample generalization over a directed log-log benchmark and yields more stable, economically plausible elasticity estimates, especially for weakly identified cross-price effects.
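
In symbols, the elasticities referred to here are the entries of the Jacobian of the log-demand surface with respect to log-prices. The notation is an assumption made for illustration and is not taken from the paper: q_i is demand for product i, p_j is the price of product j, c is the context, and f_i is the learned log-demand function.

```latex
\log q_i = f_i(\log p_1, \dots, \log p_n;\, c),
\qquad
\varepsilon_{ij}(p, c)
  = \frac{\partial \log q_i}{\partial \log p_j}
  = \frac{\partial f_i}{\partial \log p_j}.
```

Own-price elasticities are the diagonal entries (i = j) and cross-price elasticities are the off-diagonal ones.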

What carries the argument

The Integrable Context-Dependent Demand Network (ICDN), a neural model that represents log-demand directly as a function of log-prices and context so that all elasticities follow from differentiation of one consistent surface.
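
A minimal sketch of what "elasticities by differentiation of one surface" can look like in practice. It assumes a hypothetical two-product quadratic log-demand function (`log_demand`, with made-up coefficients) in place of the paper's network, and a small forward-mode dual-number class (`Dual`) in place of a real automatic-differentiation library. Neither is the authors' implementation.

```python
class Dual:
    """Number val + der*eps with eps**2 == 0; der carries an exact derivative."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (a*b)' = a*b' + a'*b
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def log_demand(log_p, context):
    """Hypothetical smooth surface (assumption): returns [log q_0, log q_1]."""
    x0, x1 = log_p
    return [
        context + (-2.0) * x0 + 0.5 * x1 + (-0.1) * x0 * x0,
        context + 0.3 * x0 + (-1.5) * x1 + (-0.2) * x1 * x1,
    ]


def elasticities(log_p, context):
    """Jacobian E[i][j] = d log q_i / d log p_j, one forward pass per price."""
    n = len(log_p)
    E = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Seed a unit derivative on log-price j only.
        seeded = [Dual(x, 1.0 if k == j else 0.0) for k, x in enumerate(log_p)]
        out = log_demand(seeded, context)
        for i in range(n):
            E[i][j] = out[i].der
    return E


E = elasticities([1.0, 0.5], context=3.0)
```

Every entry of `E` comes from the same function, so own-price and cross-price responses cannot disagree with the fitted surface. Separately estimated regressions carry no such guarantee.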

If this is right

  • Out-of-sample demand predictions improve relative to directed log-log models.
  • Cross-price elasticity estimates gain stability when identification from data is weak.
  • Derived elasticities align more closely with economic expectations for substitution patterns.
  • A single demand surface supplies all own-price and cross-price responses without separate regressions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same architecture could be applied to scanner data from other product categories to check whether stability gains hold beyond beer.
  • If the integrability property scales, neural demand surfaces might replace parametric systems in larger industrial-organization datasets.
  • Adding richer context such as promotions or seasonal indicators would test whether the smoothness constraint still yields usable elasticities.

Load-bearing premise

The neural network can learn a sufficiently accurate and smooth log-demand surface from the available data such that the derived elasticities remain stable and economically meaningful rather than reflecting model artifacts or overfitting.

What would settle it

The performance claims would be falsified if, on the Dominick's beer dataset, ICDN showed neither better out-of-sample prediction accuracy nor lower instability in cross-price elasticity estimates than the directed log-log benchmark.

Figures

Figures reproduced from arXiv: 2605.22820 by Carlos Heredia, Daniel Roncel.

Figure 1: Comparison between a global quadratic price effect and a spline-based nonlinear price repre […]
Figure 2: ICDN forward pass. Context tokens are encoded into one latent representation per SKU, which generate own-price response terms, sparse attention-weighted cross-price response terms, and the parameters of the structured demand potential. Spline bases and analytic derivatives of log-prices are combined with these parameters to produce log-demand predictions and elasticities by exact differentiation.
Figure 3: Generalization comparison between ICDN and the benchmark. Left: fold-level […]
Figure 4: Own-price elasticity stability diagnostics. Left: bootstrap confidence interval width for ICDN […]
Figure 5: Cross-price elasticity diagnostics. Top left: distribution of bootstrap mean cross-price elasticities […]
Figure 6: Own-price resampling-based uncertainty diagnostics for bootstrap confidence intervals. Left: […]

Original abstract

We propose the Integrable Context-Dependent Demand Network (ICDN), a demand-first neural model for multiproduct retail demand. The model learns log-demand as a smooth, context-conditioned function of log-prices, allowing elasticities to be derived exactly from the learned demand surface. On the Dominick's beer dataset, ICDN improves out-of-sample generalization over a directed log-log benchmark and yields more stable, economically plausible elasticity estimates, especially for weakly identified cross-price effects.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes the Integrable Context-Dependent Demand Network (ICDN), a neural model that learns log-demand as a smooth, context-conditioned function of log-prices for multiproduct retail settings. Elasticities are obtained exactly via differentiation of the learned surface. On the Dominick's beer dataset, ICDN is claimed to improve out-of-sample generalization relative to a directed log-log benchmark while producing more stable and economically plausible elasticity estimates, especially for weakly identified cross-price effects.

Significance. If the central claims hold after verification, the work provides a practical route to enforcing integrability in flexible neural demand models, which could improve elasticity estimation in retail data where cross-price identification is challenging. The approach leverages demand potentials to maintain theoretical consistency while retaining neural flexibility.
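
For intuition on why a potential helps, consider the generic mechanism. This is an assumption offered for illustration, since the summary does not give ICDN's exact functional form. If a response vector g(x) is the gradient of a twice continuously differentiable scalar potential Φ(x), then its Jacobian is the Hessian of Φ and is therefore symmetric:

```latex
g_i(x) = \frac{\partial \Phi}{\partial x_i}(x)
\quad\Longrightarrow\quad
\frac{\partial g_i}{\partial x_j}
  = \frac{\partial^2 \Phi}{\partial x_i \,\partial x_j}
  = \frac{\partial g_j}{\partial x_i}.
```

Conversely, on a simply connected domain a smooth g with a symmetric Jacobian is the gradient of some potential. This is the usual sense of "integrable" in demand theory.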

major comments (2)
  1. [§4] §4 (Empirical results on Dominick's beer): No empirical check is reported that the derived own- and cross-price elasticities satisfy Slutsky symmetry (or negative semi-definiteness of the Slutsky matrix) on held-out price vectors. This verification is load-bearing for the claim that integrability yields more plausible elasticities rather than artifacts of architecture or regularization.
  2. [§3] §3 (Model description): The architecture, loss function, regularization, and training procedure for the neural log-demand surface are described only at a high level. Without these details it is impossible to assess whether the smoothness required for stable differentiation is reliably achieved or whether finite-sample training preserves the integrability property.
minor comments (2)
  1. The abstract and introduction would benefit from an explicit comparison to existing integrable demand systems in the econometrics literature (e.g., those based on indirect utility or expenditure functions).
  2. [Results tables] Results tables should report standard errors or confidence intervals for the reported gains in out-of-sample fit and elasticity stability to allow assessment of statistical significance.
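
The uncertainty reporting requested in the second minor comment could, for example, take the form of a percentile bootstrap over fold-level differences in out-of-sample error. The sketch below assumes a hypothetical helper `bootstrap_ci` and placeholder numbers in `fold_diffs`. The numbers are not results from the paper, and the paper's own resampling scheme may differ.

```python
import random


def bootstrap_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of paired differences."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    means = []
    for _ in range(n_boot):
        # Resample the differences with replacement and record the mean.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Placeholder fold-level error differences (benchmark minus ICDN).
# These are made-up values for illustration only.
fold_diffs = [0.04, 0.01, 0.03, -0.01, 0.05, 0.02, 0.00, 0.03]
lo, hi = bootstrap_ci(fold_diffs)
```

An interval that excludes zero would indicate that the reported gain is unlikely to be a resampling artifact. With only a handful of folds the interval is itself rough.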

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important aspects for strengthening the empirical validation and reproducibility of the ICDN model. We address each major comment below and commit to revisions that directly respond to the concerns raised.

Point-by-point responses
  1. Referee: [§4] §4 (Empirical results on Dominick's beer): No empirical check is reported that the derived own- and cross-price elasticities satisfy Slutsky symmetry (or negative semi-definiteness of the Slutsky matrix) on held-out price vectors. This verification is load-bearing for the claim that integrability yields more plausible elasticities rather than artifacts of architecture or regularization.

    Authors: We agree that verifying Slutsky symmetry and negative semi-definiteness on held-out price vectors is a valuable addition to substantiate the benefits of the integrability constraint. In the revised manuscript, we will add this analysis by sampling a set of held-out price vectors from the Dominick's beer dataset, deriving the corresponding Slutsky matrices via automatic differentiation of the learned demand surface, and reporting quantitative metrics such as the mean absolute deviation from symmetry across off-diagonal elements and the largest eigenvalue of each matrix, which must be non-positive under negative semi-definiteness. This will provide direct evidence on whether the observed improvements in elasticity stability are tied to the theoretical properties rather than incidental effects of the architecture. revision: yes

  2. Referee: [§3] §3 (Model description): The architecture, loss function, regularization, and training procedure for the neural log-demand surface are described only at a high level. Without these details it is impossible to assess whether the smoothness required for stable differentiation is reliably achieved or whether finite-sample training preserves the integrability property.

    Authors: We accept that the current presentation in §3 is insufficiently detailed for full reproducibility and assessment. The revised manuscript will expand §3 with complete specifications, including the precise neural network architecture (layer counts, hidden dimensions, and activation functions chosen to promote smoothness), the full loss function (including the primary demand-fitting term and any explicit regularization components such as gradient penalties or Hessian regularization to enforce smoothness), and the training details (optimizer, learning-rate schedule, epoch count, batch size, and any mechanisms such as constrained optimization or post-training projection used to maintain the integrability property in finite samples). These additions will enable readers to evaluate the reliability of the differentiation step and the preservation of theoretical consistency. revision: yes
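
The first response names two kinds of diagnostic: deviation from symmetry and an eigenvalue check for negative semi-definiteness. A minimal sketch follows. It assumes the Slutsky matrix at one held-out price vector is already available as a square list of lists `S`. How to build it from ICDN's elasticities, including any income terms, is not specified in the source. The function names are hypothetical, and a shifted power iteration stands in for a library eigen-solver.

```python
import math


def mean_abs_asymmetry(S):
    """Mean of |S[i][j] - S[j][i]| over off-diagonal pairs i < j."""
    n = len(S)
    diffs = [abs(S[i][j] - S[j][i]) for i in range(n) for j in range(i + 1, n)]
    return sum(diffs) / len(diffs)


def largest_eigenvalue(S, iters=500):
    """Largest eigenvalue of the symmetric part of S via shifted power iteration."""
    n = len(S)
    A = [[0.5 * (S[i][j] + S[j][i]) for j in range(n)] for i in range(n)]
    # Gershgorin bound: A + shift*I has no negative eigenvalues.
    shift = max(sum(abs(x) for x in row) for row in A)
    if shift == 0.0:
        return 0.0
    v = [1.0 / (i + 1) for i in range(n)]  # deterministic, non-uniform start
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unshifted matrix at the converged direction.
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Av[i] for i in range(n))


# Toy matrix for illustration only (not from the paper).
S = [[-2.0, 1.0],
     [1.2, -2.0]]
asym = mean_abs_asymmetry(S)
lam_max = largest_eigenvalue(S)
is_nsd = lam_max <= 1e-8  # the tolerance is an arbitrary choice
```

The fixed iteration budget is a simplification. Power iteration converges slowly when the two largest eigenvalues are close, so a production check would more likely rely on a symmetric eigen-solver from a numerical library.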

Circularity Check

0 steps flagged

No significant circularity; elasticities derived from data-fitted neural surface


The paper's core construction fits a neural network to learn a smooth log-demand surface from observed data on the Dominick's dataset, then obtains elasticities by exact differentiation of that surface. This is a standard supervised learning pipeline with no reduction of the claimed predictions to the inputs by construction. Integrability is imposed architecturally via demand potentials rather than being asserted as an empirical outcome that is then smuggled back in. No self-citation load-bearing steps, fitted-input-as-prediction patterns, or ansatz smuggling appear in the derivation chain. The out-of-sample generalization and stability comparisons to the log-log benchmark constitute independent empirical content.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

Ledger entries are inferred from the abstract description only; full paper may introduce additional fitted elements or background assumptions.

free parameters (1)
  • Neural network weights and biases
    Parameters fitted to the sales data to approximate the log-demand surface.
axioms (1)
  • domain assumption: Log-demand can be represented as a smooth, context-conditioned function of log-prices that a neural network can learn accurately enough for derivative-based elasticities to be reliable.
    This premise is required for the exact-derivation property and the claim of improved stability.
invented entities (1)
  • ICDN (Integrable Context-Dependent Demand Network) · no independent evidence
    purpose: Neural architecture that enforces integrability so elasticities follow directly from the demand surface.
    New model introduced in the work; no independent evidence outside the paper is provided.

pith-pipeline@v0.9.0 · 5590 in / 1397 out tokens · 58863 ms · 2026-05-22T06:28:26.073508+00:00 · methodology


