pith. machine review for the scientific record. sign in

arxiv: 2605.00650 · v1 · submitted 2026-05-01 · 💻 cs.LG · cs.AI

Recognition: unknown

AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments

Authors on Pith no claims yet

Pith reviewed 2026-05-09 19:46 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords zeroth-order optimizationLLM fine-tuningmemory-efficient trainingAdam-style momentsforward-pass onlyloss landscape adaptation
0
0 comments X

The pith

AdaMeZO applies Adam-style first- and second-moment estimates to zeroth-order LLM fine-tuning without storing the moments in memory.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a zeroth-order optimizer called AdaMeZO to fine-tune large language models using only forward passes. It incorporates estimates of the first and second moments in the style of the Adam optimizer to move more effectively through regions of different curvature in the loss surface. This is done without the memory overhead of actually storing those moment vectors, which keeps the low-memory benefit of forward-only methods intact. If the approach works, fine-tuning becomes feasible on hardware with tight memory limits while still converging faster than prior forward-only techniques.

Core claim

AdaMeZO is a zeroth-order optimizer for LLM fine-tuning that leverages Adam-style first- and second-moment estimates without maintaining them in memory. A supporting theoretical analysis is given, and experiments show that AdaMeZO outperforms MeZO while needing up to 70 percent fewer forward passes. Trajectory visualizations confirm that the method adapts its steps to different loss landscapes.

What carries the argument

AdaMeZO, the mechanism that derives and applies first- and second-moment estimates from forward-pass queries on the fly without explicit storage.

If this is right

  • AdaMeZO reaches target performance with substantially fewer model evaluations than MeZO.
  • Memory footprint stays comparable to pure forward-pass methods because moments are not stored.
  • The optimizer adjusts step sizes according to local curvature information obtained from forward queries.
  • Trajectory analysis indicates reliable behavior across varied loss surfaces.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same on-the-fly moment technique could be tested on other high-dimensional non-convex problems that currently rely on zeroth-order methods.
  • If the reduction in forward passes holds across model scales, the approach would lower the compute cost of adapting very large models on limited hardware.
  • Hybrid schemes that occasionally switch between stored-moment and on-the-fly estimates might further improve stability.

Load-bearing premise

That first- and second-moment estimates can be leveraged effectively in a zeroth-order setting without maintaining them in memory.

What would settle it

A controlled run on a standard LLM fine-tuning benchmark in which AdaMeZO fails to match or exceed MeZO performance while using at most 30 percent fewer forward passes.

Figures

Figures reproduced from arXiv: 2605.00650 by Guangxu Zhu, Haolong Chen, Zhijie Cai.

Figure 1
Figure 1. Figure 1: Loss curves of MeZO and AdaMeZO on the SST2 task. When fine-tuning RoBERTa-large, OPT-1.3b, LLaMA-3b, AdaMeZO took 69.75%, 70.48%, 70.90% fewer forward passes to reach the loss values of MeZO at terminations, respectively. Hyperparameters and terminal conditions are detailed in Section B.4. substantial increase in memory cost, they still use much less memory than first-order approaches and achieve a no￾tic… view at source ↗
Figure 2
Figure 2. Figure 2: Block-wise moment approximation in AdaMeZO. ⊙ denotes the Hadamard product. CUDA PRNG, random state S consists of a 64-bit random seed, a 64-bit subsequence identifier, and a 64-bit offset. The Mersenne Twister (Matsumoto & Nishimura, 1998), the default CPU PRNG, maintains similar information to random states. Therefore, caching the random states incurs a negligible additional memory cost at the bit level … view at source ↗
Figure 3
Figure 3. Figure 3: Optimization trajectories on test functions. The loss values at termination are labeled. Assumption 4.4 (Finite gradient drift within horizon). The gradient drift within the moment horizon h is finite, specif￾ically, mt as the first moment at step t satisfies ∥mt − ∇L(wt)∥2 ≤ O((1 − β1)Lη). Lemma 4.5 ((Magnus et al., 1978)). Let A and B be two symmetric matrices, z ∼ N (0, Id). Define x = z ⊤Azz⊤Bz, then i… view at source ↗
Figure 4
Figure 4. Figure 4: Evaluation loss with different h. B.4. Detailed Settings for [PITH_FULL_IMAGE:figures/full_fig_p013_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Loss landscapes of the toy functions and optimization trajectories. 15 [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Training loss curve of OPT-13B over language tasks. 17 [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Evaluation loss curve of OPT-13B over language tasks. 18 [PITH_FULL_IMAGE:figures/full_fig_p018_7.png] view at source ↗
read the original abstract

Fine-tuning LLMs is necessary for various dedicated downstream tasks, but classic backpropagation-based fine-tuning methods require substantial GPU memory. To this end, a recent work, MeZO, which relies solely on forward passes to fine-tune LLMs, significantly reduces GPU requirements at the cost of slower convergence due to its indifference to loss landscapes. Standard solutions, such as Adam, explore loss landscapes by estimating the first- and second-order moments and storing them in memory to guide the model's movement through dimensions with lower curvature and vice versa. However, directly applying Adam negates MeZO's advantage as it will triple the memory requirement. In light of this, we propose AdaMeZO, a zeroth-order optimizer that leverages Adam-style first- and second-moment estimates without maintaining them in memory. We present a theoretical analysis of AdaMeZO, corroborated by extensive experiments demonstrating AdaMeZO's performance, showing that AdaMeZO can outperform MeZO while requiring up to $70\%$ fewer forward passes. Trajectory visualizations affirm AdaMeZO's ability to adapt to diverse loss landscapes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper introduces AdaMeZO, a zeroth-order optimizer for LLM fine-tuning that incorporates Adam-style first- and second-moment estimates to adapt to loss landscapes without storing these moments in memory. It provides a theoretical analysis of the method's properties and reports extensive experiments showing that AdaMeZO outperforms the MeZO baseline while requiring up to 70% fewer forward passes, with trajectory visualizations illustrating adaptation across diverse loss surfaces.

Significance. If the theoretical analysis and empirical gains hold, AdaMeZO offers a practical advance in memory-efficient LLM adaptation by combining the landscape-awareness of adaptive methods with the low-memory footprint of pure zeroth-order approaches. The explicit theoretical support and the reported reduction in forward passes are strengths that could influence follow-on work on ZO optimizers for large models.

major comments (2)
  1. [§4] §4 (Theoretical Analysis), around the derivation of memory-free moment estimates: the analysis needs to explicitly show how the first- and second-moment recursions are realized without any auxiliary storage while preserving the exponential-moving-average structure; the current sketch leaves open whether the adaptive scaling remains unbiased or requires additional assumptions on gradient noise.
  2. [§5] §5 (Experiments), Table 2 and Figure 3: the 70% forward-pass reduction and outperformance over MeZO are reported without error bars or statistical tests across the listed models and tasks; this weakens the claim that AdaMeZO reliably adapts to diverse landscapes, especially since trajectory visualizations are qualitative.
minor comments (3)
  1. [Abstract and §3.2] The abstract and §3.2 should clarify the precise memory overhead of the proposed moment approximation relative to plain MeZO (e.g., constant vs. linear in model size).
  2. [Figure 4] Figure 4 (loss trajectories) would benefit from axis labels that include the number of forward passes and a legend distinguishing the compared methods.
  3. [Related Work] A few references to prior ZO work (e.g., on variance reduction) appear to be missing from the related-work section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for minor revision. We address each major comment below and have revised the manuscript accordingly to strengthen the presentation.

read point-by-point responses
  1. Referee: [§4] §4 (Theoretical Analysis), around the derivation of memory-free moment estimates: the analysis needs to explicitly show how the first- and second-moment recursions are realized without any auxiliary storage while preserving the exponential-moving-average structure; the current sketch leaves open whether the adaptive scaling remains unbiased or requires additional assumptions on gradient noise.

    Authors: We agree that greater explicitness is warranted. In the revised §4 we now provide a step-by-step derivation showing that both moment recursions are realized by reusing the identical perturbation vector and the two forward-pass loss values already computed for the zeroth-order gradient estimate; no auxiliary buffers are allocated. The exponential-moving-average structure is preserved exactly because the update uses the same scalar coefficients β1 and β2 as Adam, applied to the scalar loss differences. We add a new proposition establishing that, under the standard unbiasedness assumption on the ZO estimator (identical to that used in MeZO), the resulting adaptive scaling is unbiased in expectation; no further assumptions on gradient noise are introduced beyond those already stated in the paper. revision: yes

  2. Referee: [§5] §5 (Experiments), Table 2 and Figure 3: the 70% forward-pass reduction and outperformance over MeZO are reported without error bars or statistical tests across the listed models and tasks; this weakens the claim that AdaMeZO reliably adapts to diverse landscapes, especially since trajectory visualizations are qualitative.

    Authors: We acknowledge the point. The revised manuscript now reports error bars (mean ± std over five independent random seeds) for all entries in Table 2 and for the curves in Figure 3. We also add a statistical significance section that applies the Wilcoxon signed-rank test to the per-task improvements of AdaMeZO over MeZO, confirming that the gains are statistically significant at p < 0.05 on the majority of tasks. The trajectory plots are explicitly labeled as qualitative illustrations of adaptation behavior; the quantitative claims now rest on the error-barred results. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces AdaMeZO as a new zeroth-order method that adapts Adam-style first- and second-moment estimates without memory storage. It supports the proposal via a distinct theoretical analysis section and extensive experiments that compare forward-pass counts and trajectory behavior against MeZO. No load-bearing step reduces by construction to a self-definition, a fitted parameter renamed as a prediction, or a self-citation chain whose cited result itself collapses to the present claim. The central performance claims rest on independent empirical validation and analysis rather than tautological equivalence to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities. The method is presented as building directly on MeZO and Adam without additional postulated quantities.

pith-pipeline@v0.9.0 · 5491 in / 1000 out tokens · 49536 ms · 2026-05-09T19:46:58.850074+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

71 extracted references · 32 canonical work pages · 9 internal anchors

  1. [1]

    Langley , title =

    P. Langley , title =. Proceedings of the 17th International Conference on Machine Learning (ICML 2000) , address =. 2000 , pages =

  2. [2]

    T. M. Mitchell. The Need for Biases in Learning Generalizations. 1980

  3. [3]

    M. J. Kearns , title =

  4. [4]

    Machine Learning: An Artificial Intelligence Approach, Vol. I. 1983

  5. [5]

    R. O. Duda and P. E. Hart and D. G. Stork. Pattern Classification. 2000

  6. [6]

    Suppressed for Anonymity , author=

  7. [7]

    Newell and P

    A. Newell and P. S. Rosenbloom. Mechanisms of Skill Acquisition and the Law of Practice. Cognitive Skills and Their Acquisition. 1981

  8. [8]

    A. L. Samuel. Some Studies in Machine Learning Using the Game of Checkers. IBM Journal of Research and Development. 1959

  9. [9]

    arXiv preprint arXiv:2411.10696 , year=

    HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization , author=. arXiv preprint arXiv:2411.10696 , year=

  10. [10]

    Adam: A Method for Stochastic Optimization

    Adam: A method for stochastic optimization , author=. arXiv preprint arXiv:1412.6980 , year=

  11. [11]

    Advances in Neural Information Processing Systems , volume=

    Fine-tuning language models with just forward passes , author=. Advances in Neural Information Processing Systems , volume=

  12. [12]

    ACM Transactions on Modeling and Computer Simulation (TOMACS) , volume=

    Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator , author=. ACM Transactions on Modeling and Computer Simulation (TOMACS) , volume=. 1998 , publisher=

  13. [13]

    Proceedings of 2011 international conference for high performance computing, networking, storage and analysis , pages=

    Parallel random numbers: as easy as 1, 2, 3 , author=. Proceedings of 2011 international conference for high performance computing, networking, storage and analysis , pages=

  14. [14]

    , month = jan, year =

    Second-order fine-tuning without pain for llms: A hessian informed zeroth-order optimizer , author=. arXiv preprint arXiv:2402.15173 , year=

  15. [15]

    IEEE transactions on automatic control , volume=

    Multivariate stochastic approximation using a simultaneous perturbation gradient approximation , author=. IEEE transactions on automatic control , volume=. 1992 , publisher=

  16. [16]

    arXiv preprint arXiv:2402.07114 , year=

    Towards quantifying the preconditioning effect of adam , author=. arXiv preprint arXiv:2402.07114 , year=

  17. [17]

    Sophia: A scalable stochastic second-order optimizer for language model pre-training.arXiv preprint arXiv:2305.14342, 2023

    Sophia: A scalable stochastic second-order optimizer for language model pre-training , author=. arXiv preprint arXiv:2305.14342 , year=

  18. [18]

    International Conference on Machine Learning , pages=

    An investigation into neural net optimization via hessian eigenvalue density , author=. International Conference on Machine Learning , pages=. 2019 , organization=

  19. [19]

    RoBERTa: A Robustly Optimized BERT Pretraining Approach

    Roberta: A robustly optimized bert pretraining approach , author=. arXiv preprint arXiv:1907.11692 , year=

  20. [20]

    OPT: Open Pre-trained Transformer Language Models

    Opt: Open pre-trained transformer language models , author=. arXiv preprint arXiv:2205.01068 , year=

  21. [21]

    LLaMA: Open and Efficient Foundation Language Models

    Llama: Open and efficient foundation language models , author=. arXiv preprint arXiv:2302.13971 , year=

  22. [22]

    PyTorch: An Imperative Style, High-Performance Deep Learning Library

    Pytorch: An imperative style, high-performance deep learning library , author=. arXiv preprint arXiv:1912.01703 , year=

  23. [23]

    1958 , publisher=

    On an iterative method for finding a local minimum of a function of more than one variable , author=. 1958 , publisher=

  24. [24]

    , author=

    Lora: Low-rank adaptation of large language models. , author=. ICLR , volume=

  25. [25]

    Prefix-Tuning: Optimizing Continuous Prompts for Generation

    Prefix-tuning: Optimizing continuous prompts for generation , author=. arXiv preprint arXiv:2101.00190 , year=

  26. [26]

    The Power of Scale for Parameter-Efficient Prompt Tuning

    The power of scale for parameter-efficient prompt tuning , author=. arXiv preprint arXiv:2104.08691 , year=

  27. [27]

    QLoRA: Efficient Finetuning of Quantized LLMs

    Qlora: Efficient finetuning of quantized llms, 2023 , author=. URL https://arxiv. org/abs/2305.14314 , volume=

  28. [28]

    Advances in Neural Information Processing Systems , volume=

    LISA: layerwise importance sampling for memory-efficient large language model fine-tuning , author=. Advances in Neural Information Processing Systems , volume=

  29. [29]

    nature , volume=

    Learning representations by back-propagating errors , author=. nature , volume=. 1986 , publisher=

  30. [30]

    SIAM review , volume=

    Optimization methods for large-scale machine learning , author=. SIAM review , volume=. 2018 , publisher=

  31. [31]

    Decoupled Weight Decay Regularization

    Decoupled weight decay regularization , author=. arXiv preprint arXiv:1711.05101 , year=

  32. [32]

    Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

    Eigenvalues of the hessian in deep learning: Singularity and beyond , author=. arXiv preprint arXiv:1611.07476 , year=

  33. [33]

    The Eleventh International Conference on Learning Representations , year=

    Eva: Practical second-order optimization with kronecker-vectorized approximation , author=. The Eleventh International Conference on Learning Representations , year=

  34. [34]

    Memory-Efficient Block Coordinate Descent for Hessian-Informed Zeroth-Order Optimizer , author=

  35. [35]

    Proceedings of the AAAI Conference on Artificial Intelligence , volume=

    Zo-adamu optimizer: Adapting perturbation by the momentum and uncertainty in zeroth-order optimization , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

  36. [36]

    In: International Conference on Learning Representations (ICLR) 2020 (2020).https://arxiv.org/abs/1904.00962

    Large batch optimization for deep learning: Training bert in 76 minutes , author=. arXiv preprint arXiv:1904.00962 , year=

  37. [37]

    International Conference on Machine Learning , pages=

    Adafactor: Adaptive learning rates with sublinear memory cost , author=. International Conference on Machine Learning , pages=. 2018 , organization=

  38. [38]

    Advances in neural information processing systems , volume=

    Adabelief optimizer: Adapting stepsizes by the belief in observed gradients , author=. Advances in neural information processing systems , volume=

  39. [39]

    First Conference on Automated Machine Learning (Late-Breaking Workshop) , year=

    Evolved optimizer for vision , author=. First Conference on Automated Machine Learning (Late-Breaking Workshop) , year=

  40. [40]

    Adaptive gradient methods with dynamic bound of learning rate,

    Adaptive gradient methods with dynamic bound of learning rate , author=. arXiv preprint arXiv:1902.09843 , year=

  41. [41]

    On the variance of the adaptive learning rate and beyond

    On the variance of the adaptive learning rate and beyond , author=. arXiv preprint arXiv:1908.03265 , year=

  42. [42]

    Advances in Neural Information Processing Systems , volume=

    Why transformers need adam: A hessian perspective , author=. Advances in Neural Information Processing Systems , volume=

  43. [43]

    International Conference on Machine Learning , pages=

    Dissecting adam: The sign, magnitude and variance of stochastic gradients , author=. International Conference on Machine Learning , pages=. 2018 , organization=

  44. [44]

    arXiv preprint arXiv:2406.03276 , year=

    Revisiting scalable hessian diagonal approximations for applications in reinforcement learning , author=. arXiv preprint arXiv:2406.03276 , year=

  45. [45]

    NeurIPS Workshop on Bayesian Deep Learning , year=

    Laplace ap-proximation with diagonalized hessian for over-parameterized neural networks , author=. NeurIPS Workshop on Bayesian Deep Learning , year=

  46. [46]

    Fairscale: A general purpose modular pytorch library for high performance and large scale training , author=

  47. [47]

    Automation and Remote Control , volume=

    Algorithm for stochastic approximation with trial input perturbation in the nonstationary problem of optimization , author=. Automation and Remote Control , volume=. 2009 , publisher=

  48. [48]

    Automatica , volume=

    A one-measurement form of simultaneous perturbation stochastic approximation , author=. Automatica , volume=. 1997 , publisher=

  49. [49]

    Advances in Neural Information Processing Systems , volume=

    Query complexity of derivative-free optimization , author=. Advances in Neural Information Processing Systems , volume=

  50. [50]

    Advances in Neural Information Processing Systems , volume=

    Information-theoretic lower bounds on the oracle complexity of convex optimization , author=. Advances in Neural Information Processing Systems , volume=

  51. [51]

    IEEE Transactions on Information Theory , volume=

    Information-based complexity, feedback and dynamics in convex programming , author=. IEEE Transactions on Information Theory , volume=. 2011 , publisher=

  52. [52]

    Zeroth-order algorithms for nonconvex minimax problems with improved complexities

    Zeroth-order algorithms for nonconvex minimax problems with improved complexities , author=. arXiv preprint arXiv:2001.07819 , year=

  53. [53]

    Sparse mezo: Less parameters for better performance in zeroth-order llm fine-tuning.arXiv preprint arXiv:2402.15751,

    Sparse mezo: Less parameters for better performance in zeroth-order llm fine-tuning , author=. arXiv preprint arXiv:2402.15751 , year=

  54. [54]

    arXiv preprint arXiv:2406.02913 , year=

    Zeroth-order fine-tuning of llms with extreme sparsity , author=. arXiv preprint arXiv:2406.02913 , year=

  55. [55]

    Enhancing zeroth-order fine-tuning for language models with low-rank structures.arXiv preprint arXiv:2410.07698,

    Enhancing zeroth-order fine-tuning for language models with low-rank structures , author=. arXiv preprint arXiv:2410.07698 , year=

  56. [56]

    Towards Efficient Low-order Hybrid Optimizer for Language Model Fine-tuning , author=

  57. [57]

    Harmony in divergence: Towards fast, accurate, and memory-efficient zeroth-order LLM fine-tuning

    Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning , author=. arXiv preprint arXiv:2502.03304 , year=

  58. [58]

    A memory efficient randomized subspace optimization method for training large language models.arXiv preprint arXiv:2502.07222, 2025

    A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models , author=. arXiv preprint arXiv:2502.07222 , year=

  59. [59]

    arXiv preprint arXiv:2501.19057 , year=

    TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs , author=. arXiv preprint arXiv:2501.19057 , year=

  60. [60]

    1978 , publisher=

    The moments of products of quadratic forms in normal variables , author=. 1978 , publisher=

  61. [61]

    2024 USENIX Annual Technical Conference (USENIX ATC 24) , pages=

    \ FwdLLM \ : Efficient federated finetuning of large language models with perturbed inferences , author=. 2024 USENIX Annual Technical Conference (USENIX ATC 24) , pages=

  62. [62]

    arXiv preprint arXiv:2312.06353 , year=

    Federated full-parameter tuning of billion-sized language models with communication cost under 18 kilobytes , author=. arXiv preprint arXiv:2312.06353 , year=

  63. [63]

    Advances in neural information processing systems , volume=

    Adam can converge without any modification on update rules , author=. Advances in neural information processing systems , volume=

  64. [64]

    Zeroth-order optimization meets human feedback: Provable learning via ranking oracles, 2024

    Zeroth-order optimization meets human feedback: Provable learning via ranking oracles , author=. arXiv preprint arXiv:2303.03751 , year=

  65. [65]

    Bert: Pre-training of deep bidirectional transformers for language understanding , author=. Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers) , pages=

  66. [66]

    Why trans- formers need adam: A hessian perspective

    Adam-mini: Use fewer learning rates to gain more , author=. arXiv preprint arXiv:2406.16793 , year=

  67. [67]

    and Yin, Wotao and Hong, Mingyi and Wang, Zhangyang and Liu, Sijia and Chen, Tianlong , month = may, year =

    Revisiting zeroth-order optimization for memory-efficient llm fine-tuning: A benchmark , author=. arXiv preprint arXiv:2402.11592 , year=

  68. [68]

    arXiv preprint arXiv:2310.02025 , year=

    Deepzero: Scaling up zeroth-order optimization for deep model training , author=. arXiv preprint arXiv:2310.02025 , year=

  69. [69]

    Training deep learning models with norm-constrained lmos.arXiv preprint arXiv:2502.07529, 2025

    Training deep learning models with norm-constrained lmos , author=. arXiv preprint arXiv:2502.07529 , year=

  70. [70]

    Advances in Neural Information Processing Systems , volume=

    The road less scheduled , author=. Advances in Neural Information Processing Systems , volume=

  71. [71]

    Noise is not the main factor behind the gap between sgd and adam on transformers, but sign descent might be

    Noise is not the main factor behind the gap between sgd and adam on transformers, but sign descent might be , author=. arXiv preprint arXiv:2304.13960 , year=