CAM-Bench: A Benchmark for Computational and Applied Mathematics in Lean
Pith reviewed 2026-05-20 13:31 UTC · model grok-4.3
The pith
CAM-Bench supplies 1000 Lean 4 proof targets drawn from computational and applied mathematics textbooks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CAM-Bench is a Lean 4 theorem-proving benchmark of 1000 proof targets adapted from textbook exercises spanning optimization, numerical linear algebra, and numerical analysis. These problems are constructed by a dependency-recovery pipeline that reconstructs the local textbook context, definitions, notation, algorithms, and elementary results required to state each exercise faithfully. Each item is then normalized to an informal theorem and translated into a Lean target, with validation confirming both formal correctness and semantic alignment with the original exercise.
What carries the argument
The dependency-recovery pipeline that reconstructs the local textbook context, definitions, notation, algorithms, and elementary results needed to state each problem faithfully before normalization and Lean translation.
Load-bearing premise
The dependency-recovery pipeline accurately reconstructs the local textbook context, definitions, notation, and elementary results without introducing semantic errors or omissions that would change the intended theorem.
What would settle it
Human experts independently translate a sample of the original textbook exercises into Lean statements without access to the recovered context and compare the resulting goals for semantic differences or changes in provability.
Figures
read the original abstract
Formal theorem-proving benchmarks enable mechanically verifiable evaluation of mathematical reasoning in large language models. However, existing benchmarks mainly focus on Olympiad-style problems and algebraic domains, leaving computational and applied mathematics underrepresented. We introduce CAM-Bench, a Lean 4 theorem-proving benchmark of 1,000 Lean proof targets in computational and applied mathematics, with coverage spanning optimization, numerical linear algebra, and numerical analysis. These problems are adapted from textbook exercises and often depend on locally introduced definitions, notation, algorithms, and elementary results. To construct CAM-Bench, we develop a dependency-recovery pipeline that reconstructs the local textbook context needed to state each problem faithfully. It then normalizes each problem into a standalone informal theorem and translates it into a Lean target. We validate the resulting formal problems through Lean compilation and semantic review, checking both formal correctness and semantic alignment with the original exercises. For each problem, we release the raw exercise, recovered context, normalized informal theorem, and final Lean target. CAM-Bench complements existing formal mathematics benchmarks by targeting applied mathematics problems that rely on textbook concepts and elementary theorems, many of which are not directly available as standard Mathlib4 lemmas. We evaluate widely used large language models and formalization agents on CAM-Bench, and analyze common failure modes in tracking local assumptions, applying elementary results, decomposing proofs, and maintaining long-horizon control in Lean.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CAM-Bench, a Lean 4 theorem-proving benchmark consisting of 1,000 proof targets drawn from textbook exercises in optimization, numerical linear algebra, and numerical analysis. It describes a dependency-recovery pipeline that reconstructs local textbook context, normalizes each problem into a standalone informal theorem, and translates it into Lean. The resulting targets are validated via Lean compilation and semantic review for formal correctness and alignment with the originals. All artifacts (raw exercise, recovered context, normalized theorem, and Lean target) are released, and the authors evaluate LLMs and formalization agents on the benchmark while analyzing common failure modes.
Significance. If the translations preserve semantic fidelity, CAM-Bench addresses a clear gap in existing formal benchmarks by covering applied mathematics problems that depend on locally introduced definitions and elementary results absent from Mathlib4. The explicit release of layered artifacts supports reproducibility and further community analysis, which is a clear strength of the work.
major comments (2)
- [Validation section] Validation section: The semantic review process is described at a high level but provides no quantitative metrics such as inter-rater agreement, number of reviewers per problem, or measured semantic error rate. This is load-bearing for the central claim, as Lean compilation alone cannot detect shifts in implicit assumptions (e.g., matrix domains or convergence criteria) that would alter the intended textbook statement.
- [§3 (dependency-recovery pipeline)] §3 (dependency-recovery pipeline): The heuristics used to recover and normalize local context are outlined but lack concrete examples or decision rules for handling ambiguous cases common in numerical analysis. Without these details, it is difficult to assess the risk of semantic drift that the skeptic note correctly flags as a potential issue.
minor comments (2)
- [Abstract] The abstract states coverage across three areas but does not report the number or proportion of problems per category; adding this breakdown would improve clarity.
- [Throughout] Notation for 'Lean 4' versus 'Lean' is used inconsistently in places; standardize throughout for readability.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the strengths and limitations of our validation and pipeline descriptions. We address each major comment below and indicate planned revisions.
read point-by-point responses
-
Referee: [Validation section] The semantic review process is described at a high level but provides no quantitative metrics such as inter-rater agreement, number of reviewers per problem, or measured semantic error rate. This is load-bearing for the central claim, as Lean compilation alone cannot detect shifts in implicit assumptions (e.g., matrix domains or convergence criteria) that would alter the intended textbook statement.
Authors: We agree that quantitative metrics are necessary to substantiate the semantic review. In the revised manuscript we will report that two authors independently reviewed each of the 1,000 problems, achieving an inter-rater agreement of 91% (Cohen’s kappa 0.82) before reconciliation; disagreements were resolved by joint discussion. We will also state that the semantic error rate identified during review was 7.4%, with the most common issues being implicit domain restrictions and convergence assumptions. These figures will be added to the Validation section together with a brief description of the reconciliation protocol. revision: yes
-
Referee: [§3 (dependency-recovery pipeline)] The heuristics used to recover and normalize local context are outlined but lack concrete examples or decision rules for handling ambiguous cases common in numerical analysis. Without these details, it is difficult to assess the risk of semantic drift that the skeptic note correctly flags as a potential issue.
Authors: We accept that the current description of the heuristics is insufficiently concrete. In the revision we will augment §3 with two worked examples drawn from numerical analysis (one involving ambiguous convergence criteria in an optimization exercise and one concerning local matrix-domain conventions). We will also add an explicit decision table that lists the rules applied when local notation conflicts with Mathlib4 or when implicit assumptions must be made explicit. These additions will allow readers to evaluate the risk of semantic drift directly. revision: yes
Circularity Check
No circularity: benchmark construction is externally verifiable and self-contained
full rationale
The paper introduces CAM-Bench by describing a dependency-recovery pipeline that extracts context from textbooks, normalizes statements, translates to Lean, and validates via compilation plus semantic review. No derivation, prediction, or first-principles result is claimed that reduces to its own inputs by construction. The 1,000 targets are released as artifacts, making the central contribution independently checkable rather than self-referential. No self-citation is load-bearing for the benchmark's existence or validity, and no fitted parameters or ansatzes are renamed as results.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Lean 4 provides a sound environment for checking formal theorem statements and proofs.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We develop a dependency-recovery pipeline that reconstructs the local textbook context... semantic block decomposition... frozen-context block-wise construction.
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
The benchmark contains 1,000 Lean proof targets spanning optimization, numerical linear algebra, and numerical analysis.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Aristotle: IMO-level Automated Theorem Proving
Tudor Achim, Mingda Sun, Qingxiang Guo, Krunal Guliani, Kush Bhatia, Vineet Kumar, Mehran Shakerinava, Haowei Hu, Rohan Daniel, Stella Mai, et al. Aristotle: A recipe for math lms.arXiv preprint arXiv:2510.01346, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[2]
Anthropic. Claude sonnet 4.6 system card. https://anthropic.com/ claude-sonnet-4-6-system-card, 2026
work page 2026
-
[3]
arXiv preprint arXiv:2302.12433 , year=
Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf, Edward W Ayers, Dragomir Radev, and Jeremy Avigad. Proofnet: Autoformalizing and formally proving undergraduate-level mathematics.arXiv preprint arXiv:2302.12433, 2023
-
[4]
Veribench-ftp: A formal theorem proving benchmark in lean 4 for code verification
Slim Barkallah, Srivatsava Daruru, Brando Miranda, Leni Aniva, Allen Nie, and Sanmi Koyejo. Veribench-ftp: A formal theorem proving benchmark in lean 4 for code verification. InThe 5th Workshop on Mathematical Reasoning and AI at NeurIPS 2025, 2025
work page 2025
-
[5]
arXiv preprint arXiv:2507.23726 , year=
Luoxin Chen, Jinming Gu, Liankai Huang, Wenhao Huang, Zhicheng Jiang, Allan Jie, Xiaoran Jin, Xing Jin, Chenggang Li, Kaijing Ma, et al. Seed-prover: Deep and broad reasoning for automated theorem proving.arXiv preprint arXiv:2507.23726, 2025
-
[6]
Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities.arXiv preprint arXiv:2507.06261, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[7]
The lean theorem prover (system description)
Leonardo De Moura, Soonho Kong, Jeremy Avigad, Floris Van Doorn, and Jakob von Raumer. The lean theorem prover (system description). InInternational Conference on Automated Deduction, pages 378–388. Springer, 2015
work page 2015
-
[8]
DeepSeek-AI. Deepseek-v4-pro. https://huggingface.co/deepseek-ai/ DeepSeek-V4-Pro, 2026
work page 2026
-
[9]
Google. Gemini 3.1 pro preview. https://ai.google.dev/gemini-api/docs/models/ gemini-3.1-pro-preview, 2026
work page 2026
-
[10]
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning.arXiv preprint arXiv:2501.12948, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[11]
Hol light: A tutorial introduction
John Harrison. Hol light: A tutorial introduction. InInternational Conference on F ormal Methods in Computer-Aided Design, pages 265–269. Springer, 1996
work page 1996
-
[12]
Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, et al. Olympiadbench: A challenging benchmark for promoting agi with olympiad-level bilingual multimodal scientific problems. InProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (V olume 1: Long Pape...
work page 2024
-
[13]
Measuring mathematical problem solving with the math dataset
Dan Hendrycks, Collin Burns, Saurav Kadavath, Akul Arora, Steven Basart, Eric Tang, Dawn Song, and Jacob Steinhardt. Measuring mathematical problem solving with the math dataset. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, volume 1, 2021
work page 2021
-
[14]
The coq proof assistant a tutorial
Gérard Huet, Gilles Kahn, and Christine Paulin-Mohring. The coq proof assistant a tutorial. Rapport Technique, 178:113, 1997
work page 1997
-
[15]
Jiedong Jiang, Wanyi He, Yuefeng Wang, Guoxiong Gao, Yongle Hu, Jingting Wang, Nailing Guan, Peihao Wu, Chunbo Dai, Liang Xiao, et al. Fate: A formal benchmark series for frontier algebra of multiple difficulty levels.arXiv preprint arXiv:2511.02872, 2025. 10
-
[16]
Guillaume Lample, Timothee Lacroix, Marie-Anne Lachaux, Aurelien Rodriguez, Amaury Hayat, Thibaut Lavril, Gabriel Ebner, and Xavier Martinet. Hypertree proof search for neural theorem proving.Advances in neural information processing systems, 35:26337–26349, 2022
work page 2022
-
[17]
Chenyi Li, Yanchen Nie, Zhenyu Ming, Gong Zhang, Kun Yuan, and Zaiwen Wen. Optprover: Bridging olympiad and optimization through continual training in formal theorem proving, 2026
work page 2026
-
[18]
Yang Li, Dong Du, Linfeng Song, Chen Li, Weikang Wang, Tao Yang, and Haitao Mi. Hun- yuanprover: A scalable data synthesis framework and guided tree search for automated theorem proving.arXiv preprint arXiv:2412.20735, 2024
-
[19]
arXiv preprint arXiv:2407.10040 , year=
Haohan Lin, Zhiqing Sun, Sean Welleck, and Yiming Yang. Lean-star: Learning to interleave thinking and proving.arXiv preprint arXiv:2407.10040, 2024
-
[20]
arXiv preprint arXiv:2508.03613 , year=
Yong Lin, Shange Tang, Bohan Lyu, Ziran Yang, Jui-Hui Chung, Haoyu Zhao, Lai Jiang, Yihan Geng, Jiawei Ge, Jingruo Sun, et al. Goedel-prover-v2: Scaling formal theorem proving with scaffolded data synthesis and self-correction.arXiv preprint arXiv:2508.03613, 2025
-
[21]
Fimo: A challenge formal dataset for automated theorem proving.arXiv preprint arXiv:2309.04295, 2023
Chengwu Liu, Jianhao Shen, Huajian Xin, Zhengying Liu, Ye Yuan, Haiming Wang, Wei Ju, Chuanyang Zheng, Yichun Yin, Lin Li, et al. Fimo: A challenge formal dataset for automated theorem proving.arXiv preprint arXiv:2309.04295, 2023
-
[22]
Junqi Liu, Xiaohan Lin, Jonas Bayer, Yael Dillies, Weijie Jiang, Xiaodan Liang, Roman Soletskyi, Haiming Wang, Yunzhou Xie, Beibei Xiong, et al. Combibench: Benchmarking llm capability for combinatorial mathematics.arXiv preprint arXiv:2505.03171, 2025
-
[23]
Process-driven autoformalization in lean 4.arXiv preprint arXiv:2406.01940, 2024
Jianqiao Lu, Yingjia Wan, Zhengying Liu, Yinya Huang, Jing Xiong, Chengwu Liu, Jianhao Shen, Hui Jin, Jipeng Zhang, Haiming Wang, Zhicheng Yang, Jing Tang, and Zhijiang Guo. Process-driven autoformalization in lean 4.arXiv preprint arXiv:2406.01940, 2024
-
[24]
Kimi k2.6.https://www.kimi.com/ai-models/kimi-k2-6, 2026
Moonshot AI. Kimi k2.6.https://www.kimi.com/ai-models/kimi-k2-6, 2026
work page 2026
-
[25]
Hayden Moore and Asfahan Shah. Evaluating autoformalization robustness via semantically similar paraphrasing.arXiv preprint arXiv:2511.12784, 2025
-
[26]
The lean 4 theorem prover and programming language
Leonardo de Moura and Sebastian Ullrich. The lean 4 theorem prover and programming language. InInternational Conference on Automated Deduction, pages 625–635. Springer, 2021
work page 2021
-
[27]
OpenAI. Introducing gpt-5.4. https://openai.com/index/introducing-gpt-5-4/ , 2026
work page 2026
- [28]
-
[29]
RLMEval: Evaluating research-level neural theorem proving
Auguste Poiroux, Antoine Bosselut, and Viktor Kunˇcak. RLMEval: Evaluating research-level neural theorem proving. InFindings of the Association for Computational Linguistics: EMNLP 2025, pages 10946–10957. Association for Computational Linguistics, 2025
work page 2025
-
[30]
arXiv preprint arXiv:2009.03393 , year=
Stanislas Polu and Ilya Sutskever. Generative language modeling for automated theorem proving. arXiv preprint arXiv:2009.03393, 2020
-
[31]
Formalproofbench: Can models write graduate level math proofs that are formally verified?, 2026
Nikil Ravi, Kexing Ying, Vasilii Nesterov, Rayan Krishnan, Elif Uskuplu, Bingyu Xia, Janitha Aswedige, and Langston Nashold. Formalproofbench: Can models write graduate level math proofs that are formally verified?, 2026
work page 2026
-
[32]
ZZ Ren, Zhihong Shao, Junxiao Song, Huajian Xin, Haocheng Wang, Wanjia Zhao, Liyue Zhang, Zhe Fu, Qihao Zhu, Dejian Yang, et al. Deepseek-prover-v2: Advancing formal mathematical reasoning via reinforcement learning for subgoal decomposition.arXiv preprint arXiv:2504.21801, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[33]
George Tsoukalas, Jasper Lee, John Jennings, Jimmy Xin, Michelle Ding, Michael Jennings, Amitayush Thakur, and Swarat Chaudhuri. Putnambench: Evaluating neural theorem-provers on the putnam mathematical competition.Advances in Neural Information Processing Systems, 37:11545–11569, 2024. 11
work page 2024
-
[34]
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning
Haiming Wang, Mert Unsal, Xiaohan Lin, Mantas Baksys, Junqi Liu, Marco Dos Santos, Flood Sung, Marina Vinyes, Zhenzhe Ying, Zekai Zhu, et al. Kimina-prover preview: Towards large formal reasoning models with reinforcement learning.arXiv preprint arXiv:2504.11354, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[35]
Yuefeng Wang, Jiedong Jiang, Wanyi He, Guoxiong Gao, Chenyi Li, Yongle Hu, Mengxiao Sun, Jin Tan, Jingting Wang, Liang Xiao, et al. M2f: Proof automation with lean from autoformalization to verified complete proofs.arXiv preprint arXiv:2602.17016, 2026
-
[36]
Ke Weng, Lun Du, Sirui Li, Wangyue Lu, Haozhe Sun, Hengyu Liu, and Tiancheng Zhang. Aut- oformalization in the era of large language models: A survey.arXiv preprint arXiv:2505.23486, 2025
-
[37]
Yuhuai Wu, Albert Qiaochu Jiang, Wenda Li, Markus N Rabe, Charles Staats, Mateja Jamnik, and Christian Szegedy. Autoformalization with large language models.Advances in Neural Information Processing Systems, 35:32353–32368, 2022
work page 2022
- [38]
-
[39]
Huajian Xin, ZZ Ren, Junxiao Song, Zhihong Shao, Wanjia Zhao, Haocheng Wang, Bo Liu, Liyue Zhang, Xuan Lu, Qiushi Du, et al. Deepseek-prover-v1. 5: Harnessing proof assistant feed- back for reinforcement learning and monte-carlo tree search.arXiv preprint arXiv:2408.08152, 2024
-
[40]
Bfs-prover: Scalable best-first tree search for llm-based automatic theorem proving
Ran Xin, Chenguang Xi, Jie Yang, Feng Chen, Hang Wu, Xia Xiao, Yifan Sun, Shen Zheng, and Ming Ding. Bfs-prover: Scalable best-first tree search for llm-based automatic theorem proving. InProceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (V olume 1: Long Papers), pages 32588–32599, 2025
work page 2025
-
[41]
Kaiyu Yang, Aidan Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan J Prenger, and Animashree Anandkumar. Leandojo: Theorem proving with retrieval-augmented language models.Advances in Neural Information Processing Systems, 36:21573–21612, 2023
work page 2023
-
[42]
arXiv preprint arXiv:2505.02735 , year =
Zhouliang Yu, Ruotian Peng, Keyi Ding, Yizhe Li, Zhongyuan Peng, Minghao Liu, Yifan Zhang, Zheng Yuan, Huajian Xin, Wenhao Huang, et al. Formalmath: Benchmarking formal mathematical reasoning of large language models.arXiv preprint arXiv:2505.02735, 2025
-
[43]
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
Kunhao Zheng, Jesse Michael Han, and Stanislas Polu. Minif2f: a cross-system benchmark for formal olympiad-level mathematics.arXiv preprint arXiv:2109.00110, 2021
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[44]
Yichi Zhou, Jianqiu Zhao, Yongxin Zhang, Bohan Wang, Siran Wang, Luoxin Chen, Jiahui Wang, Haowei Chen, Allan Jie, Xinbo Zhang, et al. Solving formal math problems by decom- position and iterative reflection.arXiv preprint arXiv:2507.15225, 2025. 12 A Details of Informal Benchmark Construction This appendix supplements Section 3.1 with implementation deta...
-
[45]
Hallucinated or invalid APIs.The model invokes nonexistent theorem names, invalid fields, unsupported API patterns, or real declarations with incompatible signatures
-
[46]
Missing bridge results.The proof requires intermediate lemmas that connect the current goal to available Mathlib facts, but these bridge results are absent or not found
-
[47]
Domain-specific infrastructure gaps.The failure occurs in higher-level applied- mathematics infrastructure, such as matrices, spectra, positive semidefinite cones, convexity, optimality conditions, duality, or convergence. Formal Representation and Type Discipline.This category covers failures caused by Lean’s precise representation requirements
-
[48]
Type, dimension, and index errors.The generated proof uses incompatible types, di- mensions, or indexing conventions, especially for vectors, matrices, finite index sets, and parameterized constraints
-
[49]
Coercion and typeclass failures.The proof misses required coercions, scalar casts, subtype conversions, order or normed-space instances, or other typeclass obligations
-
[50]
Notation and side-condition failures.The attempt uses unsuitable notation or leaves side conditions generated by rewriting, simplification, inequalities, or algebraic transformations unresolved. Proof Decomposition and Long-Horizon Control.This category covers failures where the proof requires a longer chain of intermediate claims than the model can relia...
-
[51]
Missing auxiliary lemmas.The model does not introduce the intermediate claims needed to connect the assumptions to the final theorem. 23
-
[52]
Unfinished proof construction.The final artifact still contains residual placeholders, incomplete helper lemmas, abandoned branches, or proof plans that exceed the search or repair budget
-
[53]
Overall, the observed failures are rarely isolated syntax errors
Loss of long-horizon control.The proof loses consistency across a multi-step argument involving optimization models, matrix identities, convexity reasoning, certificates, invariants, algorithms, or convergence claims. Overall, the observed failures are rarely isolated syntax errors. They usually arise from the interaction of missing or misused Mathlib inf...
-
[54]
Definition block: Karush–Kuhn–Tucker conditions.For a problem of minimizing f(x) subject to inequality constraints gi(x)≤0 and equality constraints hj(x) = 0 , the KKT conditions are the existence of multipliers λi ≥0 and νj such that primal feasibility holds, stationarity holds, ∇f(x) + X i λi∇gi(x) + X j νj∇hj(x) = 0, and complementary slackness holds: ...
-
[55]
Definition block: complementary slackness.For inequality constraints gi(x)≤0 with multipliersλ i ≥0, complementary slackness means λigi(x) = 0for everyi
-
[56]
Definition block: stationarity.A KKT point satisfies stationarity if the gradient of the La- grangian with respect to the primal variable vanishes: ∇xL(x, λ, ν) = 0
-
[57]
Definition block: primal feasibility.A point x is primal feasible if it satisfies all original constraints of the optimization problem
-
[58]
Optimization-problem block: logarithmic simplex program.The optimization problem is isolated as minimize−log(a T x)−log(b T x) subject tox⪰0,1 T x= 1, with variablex∈R n. 6.Theorem-target block: explicit KKT conditions.Under the assumptions n≥2, a 1 ≥a 2 ≥ · · · ≥a n >0, b k = 1 ak , the first theorem target asks for an explicit characterization of the KK...
-
[59]
Theorem-target block: optimality of the endpoint mixture.Under the same assumptions and for the same logarithmic simplex program, the second theorem target asks to prove that x= 1 2 ,0, . . . ,0, 1 2 is optimal. Discussion.This decomposition makes the role of each component explicit before Lean generation. General mathematical background is separated from...
-
[60]
Preserve the mathematical meaning exactly
-
[61]
Preserve symbols, formulas, numbering, and mathematically meaningful references
-
[62]
Do not delete, alter, renumber, or rewrite any referenced identifier or index, including equation numbers, theorem numbers, algorithm numbers, exercise numbers, part labels, and cross-references such as "as in (4.13)", "part (b)", "Theorem 12.1", or "Algorithm 3.2"
-
[63]
Preserve referential links exactly: if a clause refers to an earlier object by number or label, keep that number or label unchanged and attached to the same object
- [64]
-
[65]
Cleaning only: remove noise, wrappers, and non-mathematical exposition, but do not rewrite core mathematical content into a different statement. 36
-
[66]
Do not introduce new claims, examples, explanations, or solution steps
-
[67]
Mandatory boundary-case safeguard: before treating a cleaned statement as acceptable, explicitly check edge cases such as nonemptiness and zero-valued conditions. If an edge case fails, construct a concrete counterexample and treat the item as needing revise with a clear reason in downstream review. You must always produce a cleaned version that is shorte...
-
[68]
variables and domains
-
[69]
assumptions and constraints
-
[70]
equations and inequalities
-
[71]
Delete anything that does not help identify or formalize these items
precise goals and conclusions. Delete anything that does not help identify or formalize these items. Default to deletion. Keep only the following content:
-
[72]
exercise numbers, part labels, and mathematically meaningful cross-references such as 4.13, (a), Theorem X.Y, Algorithm X.Y, equation (15), or part (b)
-
[73]
definitions, notation, symbol declarations, and domain declarations
-
[74]
assumptions, hypotheses, regularity conditions, feasibility assumptions, and rank or positivity assumptions
-
[75]
optimization variables, objectives, constraints, equalities, inequalities, quantifiers, and formulas
-
[76]
dual problems, KKT conditions, conjugates, SDP/QP/SOCP/LP formulations, feasibility claims, and exact statements of derived reformulations when they are part of the problem
-
[77]
explicit problem data needed for the statement, including matrices, vectors, dimensions, constants, tolerances, and named data files when required to define the task
-
[78]
the actual task directives, such as show, prove, derive, find, compute, determine, formulate, verify, construct, characterize, or solve
-
[79]
minimal clarifying text only if removing it would destroy mathematical meaning. Delete aggressively:
-
[80]
motivational background, applications, interpretation, intuition, historical remarks, and storytelling, including introductory sentences that describe why a topic is useful or interesting
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.