Losses that Cook: Topological Optimal Transport for Structured Recipe Generation

Daniele Rege Cambrin; Mattia Ottoborgo; Paolo Garza

arxiv: 2601.02531 · v2 · submitted 2026-01-05 · 💻 cs.CL · cs.AI


Pith reviewed 2026-05-16 17:26 UTC · model grok-4.3


keywords recipe generation · optimal transport · topological loss · structured text generation · ingredient prediction · NLG evaluation


The pith

A topological loss that models ingredients as point clouds in embedding space improves recipe ingredient accuracy and procedural coherence over standard cross-entropy training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper explores training objectives for recipe text generation that go beyond fluency-focused cross-entropy loss. It introduces a topological loss based on optimal transport that represents lists of ingredients as point clouds in embedding space and minimizes the divergence between predicted and reference lists. Experiments on the RECIPE-NLG dataset show gains on ingredient and action metrics, with a Dice loss excelling at time and temperature precision and a mixed loss offering balanced improvements. Human raters prefer outputs from the new loss in 62 percent of direct comparisons.

Core claim

Representing ingredient lists as point clouds in embedding space and minimizing their optimal-transport divergence produces generated recipes with measurably higher ingredient fidelity, action accuracy, and overall human preference than models trained only with cross-entropy.

What carries the argument

Topological optimal-transport loss that encodes predicted and gold ingredient lists as point clouds and minimizes their divergence in embedding space.
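A minimal sketch of such a loss, written with NumPy for readability. It assumes uniform weights over the ingredient embeddings, a squared Euclidean ground cost, and a debiased Sinkhorn divergence computed with log-domain updates. The regularization strength `eps`, the iteration count, the embedding dimension, and all function names are illustrative assumptions, not values taken from the paper; a training implementation would need a differentiable framework instead of NumPy.

```python
import numpy as np


def _logsumexp(z, axis):
    """Numerically stable log(sum(exp(z))) along one axis."""
    m = z.max(axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.exp(z - m).sum(axis=axis, keepdims=True)), axis=axis)


def entropic_ot(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between two uniformly weighted point clouds."""
    n, m = len(x), len(y)
    log_a = np.full(n, -np.log(n))  # uniform weights (assumption)
    log_b = np.full(m, -np.log(m))
    # Squared Euclidean ground cost (assumption; the paper's cost is not given here).
    cost = 0.5 * ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iters):  # log-domain Sinkhorn updates of the dual potentials
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - cost) / eps, axis=1)
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - cost) / eps, axis=0)
    # Dual objective evaluated at the final potentials.
    return float(np.exp(log_a) @ f + np.exp(log_b) @ g)


def sinkhorn_divergence(x, y, eps=0.1, n_iters=200):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (entropic_ot(x, y, eps, n_iters)
            - 0.5 * entropic_ot(x, x, eps, n_iters)
            - 0.5 * entropic_ot(y, y, eps, n_iters))


rng = np.random.default_rng(0)
pred = rng.normal(size=(4, 8))   # 4 hypothetical predicted-ingredient embeddings
gold = pred[::-1].copy()         # the same set listed in a different order
shifted = pred + 1.0             # a genuinely different point cloud
```

Because the divergence compares clouds rather than sequences, `sinkhorn_divergence(pred, gold)` is essentially zero even though the order differs, while `sinkhorn_divergence(pred, shifted)` is clearly positive. That order-invariance is the property that makes a point-cloud view attractive for ingredient lists.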

If this is right

Ingredient- and action-level metrics rise compared with pure cross-entropy baselines.
Dice loss alone yields the best time and temperature precision.
A mixed loss combining the new term with Dice produces synergistic gains on quantity and timing.
Human preference reaches 62 percent for models trained with the topological loss.
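For readers unfamiliar with the Dice and mixed objectives named above, the following is a generic sketch, not the paper's formulation. It shows a standard soft Dice loss over predicted probabilities and a 0/1 target, plus a weighted sum of loss terms. Which terms the paper's mixed objective actually contains and how they are weighted is not stated in this review, so `mixed_loss` and its `weights` argument are hypothetical.

```python
import numpy as np


def soft_dice_loss(probs, target, smooth=1e-6):
    """1 - Dice coefficient between predicted probabilities and a 0/1 target."""
    probs = np.asarray(probs, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = (probs * target).sum()
    # `smooth` avoids division by zero when both inputs are all zeros.
    return 1.0 - (2.0 * intersection + smooth) / (probs.sum() + target.sum() + smooth)


def mixed_loss(ce, dice, topo, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of already-computed loss terms (weights are hypothetical)."""
    w_ce, w_dice, w_topo = weights
    return w_ce * ce + w_dice * dice + w_topo * topo
```

A perfect prediction drives the Dice term to zero and a fully disjoint one drives it toward one, which is why the loss rewards exact overlap on sparse targets such as time and temperature tokens.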

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same point-cloud optimal-transport construction could be applied to other structured generation domains that require set-level fidelity, such as shopping lists or medical procedure steps.
Refining the ingredient embedding space to encode substitution relations or nutritional compatibility would likely amplify the loss's effect.
Pairing the loss with an explicit duration-prediction head could further tighten timing accuracy beyond what the current mixed objective achieves.

Load-bearing premise

That treating ingredient lists as point clouds and minimizing their optimal-transport divergence is sufficient to enforce timing accuracy, procedural coherence, and overall recipe quality without further explicit constraints.

What would settle it

On a held-out recipe corpus, an ablation that removes the optimal-transport term but keeps all other training details shows no drop in ingredient-level F1 or action accuracy.
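One way such an ablation could be scored is with a set-level ingredient F1, sketched below. The paper's exact metric definition is not given in this review, so the normalization (lowercasing and trimming) and exact string matching used here are assumptions, and `ingredient_f1` is a hypothetical helper.

```python
def ingredient_f1(predicted, gold):
    """Set-level F1 between predicted and gold ingredient names."""
    # Normalization by lowercasing and stripping is an assumption.
    pred_set = {p.strip().lower() for p in predicted}
    gold_set = {g.strip().lower() for g in gold}
    if not pred_set and not gold_set:
        return 1.0  # both empty: treat as perfect agreement
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)
```

Averaging this score over the held-out recipes for the model with the optimal-transport term and for the ablated model would give the two numbers the test above asks to compare.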

Original abstract

Cooking recipes are complex procedures that require not only a fluent and factual text, but also accurate timing, temperature, and procedural coherence, as well as the correct composition of ingredients. Standard training procedures are primarily based on cross-entropy and focus solely on fluency. Building on RECIPE-NLG, we investigate the use of several composite objectives and present a new topological loss that represents ingredient lists as point clouds in embedding space, minimizing the divergence between predicted and gold ingredients. Using both standard NLG metrics and recipe-specific metrics, we find that our loss significantly improves ingredient- and action-level metrics. Meanwhile, the Dice loss excels in time/temperature precision, and the mixed loss yields competitive trade-offs with synergistic gains in quantity and time. A human preference analysis supports our finding, showing our model is preferred in 62% of the cases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Ingredient OT loss improves recipe metrics over baselines with human support, but action-level gains lack isolating evidence.


The paper's key move is to represent ingredient lists as point clouds in embedding space and minimize an optimal transport divergence to the gold lists as a training loss. This produces better results on ingredient and action metrics than the usual cross-entropy baseline, with a human study backing it up at 62% preference. They do a solid job exploring several composite objectives on the RECIPE-NLG foundation. The topological loss shines on ingredient and action scores, Dice handles time and temperature better, and the mix gives balanced gains in quantity and timing. Using recipe-specific metrics alongside standard ones adds useful detail.

The soft spot is that the loss only directly constrains ingredients, so the action-level improvements lack direct mechanistic support. No ablation is mentioned that removes the OT term while keeping everything else the same to see if action metrics fall. Without that, it's difficult to credit the new loss specifically for the procedural gains rather than the composite training in general.

This paper is for researchers focused on recipe generation and structured NLG. It has real experiments and human evaluation, so it deserves a serious referee even if the attribution needs tightening. I would recommend sending it to peer review.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces a topological optimal transport loss for recipe generation on the RECIPE-NLG dataset. Ingredient lists are modeled as point clouds in embedding space, and the loss minimizes divergence between predicted and gold ingredient sets. The authors compare this loss against Dice loss and mixed objectives, reporting gains on both ingredient- and action-level metrics, competitive trade-offs on quantity/time/temperature, and a human preference study in which their model is selected in 62% of cases.

Significance. If the reported gains are robustly attributable to the proposed loss, the work provides evidence that topology-aware structured objectives can improve compositional accuracy and downstream procedural coherence in complex generation tasks. This could generalize to other domains requiring set-level fidelity, such as plan or code generation.

major comments (1)

[Abstract / Experiments] Abstract and experimental sections: the claim that the topological OT loss produces significant gains on action-level metrics (procedural steps, timing, coherence) is load-bearing for the central contribution, yet no ablation is described that removes the OT term while holding the remainder of the composite objective fixed. Because the loss operates exclusively on ingredient embeddings, the mechanism linking ingredient alignment to action improvements remains unverified.

minor comments (2)

[Abstract] The human preference result (62%) is presented without details on participant count, comparison protocol, or statistical testing; adding these would allow readers to assess reliability.
[Methods] Notation for the OT loss (point-cloud representation, cost function, and regularization) should be defined explicitly with equation numbers in the methods section to support reproducibility.
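The first minor comment asks for statistical testing of the 62% preference figure. A small sketch of an exact two-sided binomial test against a 50/50 null shows why the participant count matters. The sample sizes below are hypothetical, chosen only for illustration, since the review does not report the actual number of comparisons.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p0=0.5):
    """Exact two-sided binomial p-value under the null preference rate p0.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count.
    """
    pmf = [comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so ties are not lost to rounding.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Hypothetical sample sizes: the same 62% rate at two different scales.
p_small = binomial_two_sided_p(31, 50)
p_large = binomial_two_sided_p(62, 100)
```

The same preference rate gives a noticeably smaller p-value at the larger hypothetical sample size, so the headline percentage cannot be assessed without the count behind it.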

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback. We address the major comment regarding the ablation of the OT term below, and plan to revise the manuscript accordingly to strengthen the evidence for our claims.


Referee: [Abstract / Experiments] Abstract and experimental sections: the claim that the topological OT loss produces significant gains on action-level metrics (procedural steps, timing, coherence) is load-bearing for the central contribution, yet no ablation is described that removes the OT term while holding the remainder of the composite objective fixed. Because the loss operates exclusively on ingredient embeddings, the mechanism linking ingredient alignment to action improvements remains unverified.

Authors: We agree that an explicit ablation isolating the contribution of the topological OT loss within a fixed composite objective would provide stronger evidence for the causal link to action-level improvements. In the current experiments, the cross-entropy baseline represents training without the OT loss, while our proposed model uses the OT loss (potentially in combination with other terms as in the mixed objective). The observed gains in procedural metrics suggest that better ingredient set alignment facilitates more coherent action generation, as the model is trained to produce recipes with accurate ingredients and steps. However, to directly address the concern, we will include an additional ablation study in the revised version where we compare the full composite objective with and without the OT term. This will verify the mechanism by which ingredient embedding alignment translates to improvements in timing, coherence, and procedural steps. We will also update the abstract and experimental sections to reflect these new results and clarify the connection.

revision: yes

Circularity Check

0 steps flagged

No circularity: new OT loss defined independently and evaluated on held-out metrics


The paper defines a topological optimal transport loss that represents ingredient lists as point clouds in embedding space and minimizes divergence to gold ingredients. This construction is stated directly from first principles (optimal transport on embeddings) without reducing to any fitted parameter that is then re-predicted or to a self-citation chain. Reported gains on ingredient- and action-level metrics are measured on held-out data using standard NLG and recipe-specific metrics plus human preference; no equation or claim equates a prediction to its own input by construction. The central result therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents identification of specific free parameters or axioms; the topological loss implicitly assumes an embedding space in which ingredient similarity is meaningfully captured by point-cloud geometry.

pith-pipeline@v0.9.0 · 5442 in / 1047 out tokens · 35887 ms · 2026-05-16T17:26:11.969093+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: topological loss that represents ingredient lists as point clouds in embedding space, minimizing the divergence between predicted and gold ingredients... $L_{\mathrm{Topo}} = S_{\epsilon}(PC_{\mathrm{pred}}, PC_{\mathrm{target}})$

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Sinkhorn divergence... differentiable approximation of optimal transport (Wasserstein) distance
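
For reference, the Sinkhorn divergence named in the quoted passages is usually defined in the optimal-transport literature as a debiased entropic transport cost. The block below gives that standard form, reading $\alpha$ and $\beta$ as the predicted and target point clouds. Whether the paper uses exactly this debiased form, and which ground cost $c$ and regularization $\epsilon$ it chooses, is not stated in this review, so treat the notation as an assumption.

```latex
\mathrm{OT}_{\epsilon}(\alpha, \beta)
  = \min_{\pi \in \Pi(\alpha, \beta)}
    \int c(x, y)\, \mathrm{d}\pi(x, y)
    + \epsilon\, \mathrm{KL}\!\left(\pi \,\middle\|\, \alpha \otimes \beta\right)

S_{\epsilon}(\alpha, \beta)
  = \mathrm{OT}_{\epsilon}(\alpha, \beta)
    - \tfrac{1}{2}\, \mathrm{OT}_{\epsilon}(\alpha, \alpha)
    - \tfrac{1}{2}\, \mathrm{OT}_{\epsilon}(\beta, \beta)
```

Here $\Pi(\alpha, \beta)$ is the set of couplings with marginals $\alpha$ and $\beta$. Subtracting the two self-terms makes the divergence vanish when the clouds coincide, which plain entropic transport does not guarantee.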

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.