A Two-Phase Adaptive Balanced Penalty Method for Controllable Pareto Front Learning under Split Feasibility Conditions

Dung D. Le; Nguyen Viet Hoang; Tran Ngoc Thang

arxiv: 2605.19306 · v1 · pith:4B4FX6OAnew · submitted 2026-05-19 · 💻 cs.LG · math.OC

Pith reviewed 2026-05-20 06:56 UTC · model grok-4.3

classification 💻 cs.LG math.OC

keywords: Controllable Pareto Front Learning · Hypernetworks · Split Feasibility · Adaptive Penalty Method · Multi-Objective Optimization · Convergence Analysis · Bi-Level Optimization

The pith

A new adaptive penalty method trains hypernetworks to learn controllable Pareto fronts under split feasibility constraints, and its underlying solver comes with a full-sequence convergence proof in the convex setting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the open problem of training hypernetworks for controllable Pareto front learning when split feasibility conditions must hold, and it supplies rigorous convergence guarantees for the underlying solver. It reformulates the constrained problem as a Bi-Level Scalarized Split Problem and introduces the Adaptive Balanced Penalty algorithm, whose gradients for optimality, set feasibility, and image feasibility are blended by an adaptive indicator driven by a computable lower bound. A convex surrogate technique then establishes full-sequence convergence under standard convexity and Robbins-Monro step-size assumptions. The same penalty structure is translated into a two-phase, feasibility-first training procedure for Hyper-MLP and HyperTrans networks, and a new Expected Feasible Hypervolume (EFHV) metric jointly scores solution quality and constraint satisfaction. On five multi-objective benchmarks the solver is validated against ground truth, and on three multi-task datasets the method reaches up to 2.3 times higher EFHV than unconstrained baselines by raising feasibility rates from 36-49 percent to 87-100 percent.

Core claim

The Adaptive Balanced Penalty algorithm, when applied to the Bi-Level Scalarized Split Problem reformulation of constrained Pareto optimization, achieves full-sequence convergence for hypernetwork training in Controllable Pareto Front Learning under standard convexity and Robbins-Monro step-size assumptions, which in turn supports a two-phase feasibility-first training strategy that demonstrably raises constraint satisfaction rates to 87-100 percent.

What carries the argument

The Adaptive Balanced Penalty (ABP) algorithm, which blends optimality, set feasibility, and image feasibility gradient components through an adaptive indicator driven by a computable lower bound.
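
The three gradient components named above can be illustrated on a one-dimensional toy problem. The sketch below is a minimal illustration under stated assumptions, not the paper's algorithm: the objective, the sets C and Q, the map F, and the fixed-weight `blended_direction` are all hypothetical, and the fixed weights stand in for the adaptive indicator, whose exact rule is not reproduced on this page.

```python
def project_interval(v, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(v, lo), hi)


def penalty_components(x, c_bounds=(0.0, 2.0), q_bounds=(0.0, 3.0)):
    """Return (optimality, set-feasibility, image-feasibility) gradient terms.

    Toy problem (all choices hypothetical): minimise f(x) = (x - 3)^2
    subject to x in C = [0, 2] and F(x) = 2x in Q = [0, 3].
    """
    grad_f = 2.0 * (x - 3.0)
    # Gradient of 0.5 * dist(x, C)^2 is x - P_C(x).
    g_set = x - project_interval(x, *c_bounds)
    # Gradient of 0.5 * dist(F(x), Q)^2 is F'(x) * (F(x) - P_Q(F(x))), with F'(x) = 2.
    fx = 2.0 * x
    g_img = 2.0 * (fx - project_interval(fx, *q_bounds))
    return grad_f, g_set, g_img


def blended_direction(x, w_opt, beta):
    """Fixed-weight blend; a stand-in for the paper's adaptive indicator."""
    grad_f, g_set, g_img = penalty_components(x)
    return w_opt * grad_f + beta * (g_set + g_img)


components = penalty_components(2.5)
direction = blended_direction(2.5, w_opt=1.0, beta=10.0)
```

At the infeasible point x = 2.5 the optimality term pulls toward larger x while both feasibility terms push back, which is the tension any blending rule has to resolve.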

If this is right

The two-phase ABP-HyperNet training strategy produces hypernetworks whose generated Pareto fronts satisfy the split feasibility conditions at rates of 87-100 percent.
The Expected Feasible Hypervolume metric provides a joint measure of solution quality and constraint satisfaction that can be used to compare constrained CPFL methods.
The ABP solver matches ground-truth solutions on standard multi-objective benchmarks while enforcing the feasibility constraints.
Hyper-MLP and HyperTrans architectures trained with the translated ABP penalty structure outperform unconstrained baselines by up to 2.3 times in EFHV.
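
The idea of scoring quality and feasibility together can be sketched for two objectives. This is only an illustration of the general idea: the paper's exact EFHV definition is not reproduced on this page, so `feasible_hypervolume_2d`, the box constraint `in_box`, the reference point, and the sample front are all assumptions made for the example.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` (minimisation) and bounded by `ref`."""
    hv = 0.0
    prev_f2 = ref[1]
    for f1, f2 in sorted(points):
        if f1 >= ref[0] or f2 >= prev_f2:
            continue  # outside the reference box or dominated by an earlier point
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


def feasible_hypervolume_2d(points, ref, is_feasible):
    """Hypervolume of the feasible points only, plus the feasibility rate."""
    feasible = [p for p in points if is_feasible(p)]
    rate = len(feasible) / len(points) if points else 0.0
    return hypervolume_2d(feasible, ref), rate


def in_box(p, upper=(2.5, 3.5)):
    """Hypothetical image-space constraint: each objective below an upper bound."""
    return p[0] <= upper[0] and p[1] <= upper[1]


front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
hv_all = hypervolume_2d(front, ref=(4.0, 4.0))
hv_feas, rate = feasible_hypervolume_2d(front, (4.0, 4.0), in_box)
```

Here one of the three points violates the box, so the feasible hypervolume drops below the unconstrained value even though the front itself is unchanged.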

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same adaptive blending mechanism could be tested on problems where only approximate convexity holds, to see whether practical performance remains strong even if the formal proof does not apply.
The Expected Feasible Hypervolume could serve as an evaluation tool in other constrained multi-objective settings such as resource allocation or neural architecture search with hard constraints.
Because the method separates feasibility and optimality phases, it may reduce the need for heavy constraint-handling machinery in related hypernetwork applications.

Load-bearing premise

The problems satisfy standard convexity assumptions and Robbins-Monro step-size conditions required for the convergence proof.
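
For reference, the Robbins-Monro step-size conditions are usually stated as follows for a step sequence $\alpha_k$. The symbol $\alpha_k$ is notation chosen here; the paper may use a different one.

```latex
\alpha_k > 0, \qquad
\sum_{k=0}^{\infty} \alpha_k = \infty, \qquad
\sum_{k=0}^{\infty} \alpha_k^{2} < \infty .
```

A standard example is $\alpha_k = 1/(k+1)$: the harmonic series diverges, while $\sum_{k \ge 0} 1/(k+1)^2 = \pi^2/6$ is finite.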

What would settle it

A counter-example in which the ABP algorithm diverges or fails to reach a feasible solution on a convex Bi-Level Scalarized Split Problem instance that obeys the Robbins-Monro step-size schedule would disprove the full-sequence convergence claim.

Figures

Figures reproduced from arXiv: 2605.19306 by Dung D. Le, Nguyen Viet Hoang, Tran Ngoc Thang.

**Figure 1.** Conceptual overview of Controllable Pareto Front Learning. (a) Existing CPFL methods (Navon et al., 2021; Tuan et al., 2024a,b) approximate the entire unconstrained Pareto front; solutions may lie anywhere in the objective space. (b) Our BSSP two-phase training strategy restricts solutions to a decision-maker-specified region 𝑄, systematically driving them into 𝑄 while optimizing the trade-off.

**Figure 2.** Training pipeline for hypernetwork-based CPFL under split feasibility conditions.

**Figure 3.** Multi-LeNet target network for Multi-MNIST, Multi-Fashion, and Fashion+MNIST. All weights 𝜽 are generated by the hypernetwork; no learnable parameters reside in the target network itself. The two task heads produce logits for the left-image task (Task 0) and right-image task (Task 1), respectively.

**Figure 4.** Hyper-MLP architecture. A three-layer shared MLP trunk maps 𝒓 to 𝒉MLP(𝒓) ∈ ℝ^𝑑; separate linear heads project this representation onto each target parameter tensor 𝜃𝑗.

**Figure 5.** HyperTrans architecture: each 𝑟𝑖 is embedded into a 𝑑-dimensional token; a single Transformer block captures pairwise interactions; mean-pooling aggregates the tokens into a shared state projected via linear heads to produce 𝜽. The attention scores capture the pairwise trade-off structure, providing HyperTrans with an inductive bias for modelling the coupling between objectives, precisely what the image-fea…

**Figure 6.** Constrained Pareto front approximation on two convex benchmarks with 50 rays. Left: CVX1 (dim 𝑥 = 1). Right: CVX2 (dim 𝑥 = 2, Binh & Korn). ABP-HyperTrans places its output points tightly along the ground-truth constrained front inside 𝑄+.

**Figure 7.** Constrained Pareto front approximation on two non-convex ZDT benchmarks with 50 rays. Left: ZDT1 (dim 𝑥 = 30, convex front 𝑓2 = 1 − √𝑓1). Right: ZDT2 (dim 𝑥 = 30, concave front 𝑓2 = 1 − 𝑓1²). Layout as in

**Figure 8.** Multi-MNIST: Pareto fronts of all four methods under Box (left) and Sphere (right) constraints for a representative fold. Shaded regions: constraint set 𝑄 / 𝑄+; legend shows ray-level feasibility.

**Figure 9.** Multi-Fashion: Pareto fronts of all four methods. Layout as in

**Figure 10.** Fashion+MNIST: Pareto fronts of all four methods. Layout as in

**Figure 11.** Multi-MNIST, ABP-HyperMLP: Pareto front under None (blue), Box (red), and Sphere (green) constraints. Dashed lines: constraint boundaries; shaded: 𝑄 or 𝑄+.

**Figure 12.** Multi-MNIST, ABP-HyperTrans: layout as in

**Figure 13.** Multi-Fashion, ABP-HyperMLP: layout as in

**Figure 14.** Multi-Fashion, ABP-HyperTrans: layout as in

**Figure 15.** Fashion+MNIST, ABP-HyperMLP: layout as in

**Figure 16.** Fashion+MNIST, ABP-HyperTrans: layout as in

**Figure 17.** ABP-HyperTrans predictions on CVX2 at 10, 20, 50, and 100 preference rays. Grey curve: unconstrained Pareto front; blue circle: boundary of 𝑄; shaded region: 𝑄+; red stars: predicted solutions; cyan cross: 𝑧∗.

**Figure 18.** Hyperparameter sensitivity on Multi-MNIST (Hyper-MLP, Box constraint) across four key parameters. Bars represent the mean over 5 independent seeds for (a) Phase-1 weight 𝜀, (b) penalty growth 𝜌, (c) penalty cap 𝛽max, and (d) initial penalty 𝛽0.
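
The Hyper-MLP layout described in the Figure 4 caption (a shared three-layer trunk followed by one linear head per target parameter tensor) can be sketched in plain Python. This is a rough structural sketch, not the authors' implementation: the ReLU activations, the random initialisation, the hidden width, and the target tensor names `conv1` and `head_task0` are all assumptions, and each head emits a flat parameter vector.

```python
import random


def linear(in_dim, out_dim, rng):
    """Random weight matrix and zero bias for a dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def apply_linear(layer, x):
    """Compute W x + b for a (weights, bias) pair."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


class HyperMLP:
    """Shared three-layer trunk plus one linear head per target tensor."""

    def __init__(self, n_objectives, hidden_dim, target_sizes, seed=0):
        rng = random.Random(seed)
        dims = [n_objectives, hidden_dim, hidden_dim, hidden_dim]
        self.trunk = [linear(dims[i], dims[i + 1], rng) for i in range(3)]
        self.heads = {name: linear(hidden_dim, size, rng) for name, size in target_sizes.items()}

    def __call__(self, ray):
        h = list(ray)
        for layer in self.trunk:
            h = relu(apply_linear(layer, h))  # ReLU is an assumed activation
        # Each head maps the shared representation to one flat parameter vector.
        return {name: apply_linear(head, h) for name, head in self.heads.items()}


hypernet = HyperMLP(n_objectives=2, hidden_dim=8, target_sizes={"conv1": 6, "head_task0": 4})
theta = hypernet([0.3, 0.7])
```

Calling the hypernetwork with a different preference ray yields a different set of target weights from the same trunk, which is what makes the learned front controllable.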

Original abstract

We address the open problem of training hypernetworks for Controllable Pareto Front Learning (CPFL) under split feasibility conditions with rigorous theoretical guarantees. We reformulate the constrained Pareto problem as a Bi-Level Scalarized Split Problem (BSSP) and propose the Adaptive Balanced Penalty (ABP) algorithm, whose three gradient components -- optimality, set feasibility, and image feasibility -- are blended through an adaptive indicator driven by a computable lower bound. Using a novel convex surrogate technique, we prove full-sequence convergence under standard convexity and Robbins-Monro step-size assumptions. The ABP penalty structure is then translated into a two-phase, feasibility-first training strategy for Hyper-MLP and HyperTrans architectures (ABP-HyperNet). To evaluate constrained CPFL, we introduce the Expected Feasible Hypervolume (EFHV), which jointly captures solution quality and constraint satisfaction. Experiments on five multi-objective benchmarks validate the ABP solver against ground truth, while three multi-task learning datasets demonstrate that ABP-HyperNet achieves up to 2.3x higher EFHV than unconstrained baselines by raising feasibility from 36-49% to 87-100%.
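
A two-phase, feasibility-first weighting can be sketched as a simple schedule. The parameter names follow the quantities listed in the Figure 18 caption (Phase-1 weight 𝜀, penalty growth 𝜌, penalty cap 𝛽max, initial penalty 𝛽0), but the switching rule, the geometric growth, the default values, and the function names are assumptions made for illustration; the paper's actual schedule is not reproduced on this page.

```python
def two_phase_weights(epoch, phase1_epochs=5, eps=0.1, beta0=1.0, rho=2.0, beta_max=8.0):
    """Return (objective_weight, penalty_weight) for a given epoch."""
    if epoch < phase1_epochs:
        # Phase 1: feasibility first, objective down-weighted by eps.
        return eps, beta0
    # Phase 2: full objective weight; penalty grows geometrically up to the cap.
    steps = epoch - phase1_epochs + 1
    return 1.0, min(beta_max, beta0 * rho ** steps)


def training_loss(objective, feasibility_residual, epoch):
    """Weighted sum of the scalarized objective and the feasibility penalty."""
    w_obj, beta = two_phase_weights(epoch)
    return w_obj * objective + beta * feasibility_residual


schedule = [two_phase_weights(e) for e in range(9)]
```

With these defaults the first five epochs weight the objective by 0.1, after which the penalty doubles each epoch until it reaches the cap of 8.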

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper introduces a workable adaptive penalty scheme and new EFHV metric for constrained CPFL with clear empirical gains, but the convergence proof stays in the convex regime and does not transfer to hypernetwork training.

The main takeaway is a new Adaptive Balanced Penalty algorithm that adaptively blends optimality, set feasibility, and image feasibility gradients for split-feasibility constrained Pareto front learning. It comes with a claimed full-sequence convergence result via a convex surrogate and a two-phase training recipe for hypernetworks, plus a new Expected Feasible Hypervolume metric that folds in constraint satisfaction. Experiments report feasibility rising from 36-49% to 87-100% and EFHV improvements of up to 2.3 times over unconstrained baselines on the multi-task datasets. That is concrete and useful on its face.

The reformulation as a Bi-Level Scalarized Split Problem and the adaptive indicator driven by a computable lower bound are the clearest novelties. The translation of the penalty into a feasibility-first two-phase schedule for Hyper-MLP and HyperTrans architectures is a reasonable engineering step.

The main limitation is the theory-practice gap. The convergence argument relies on convexity plus Robbins-Monro step sizes, yet the actual target is hypernetwork weight optimization, which is non-convex. Nothing in the write-up shows that the surrogate or the guarantees survive that translation, so the two-phase procedure functions as a heuristic rather than a provably convergent method. The experiments validate against ground truth and baselines, but the one hyperparameter sensitivity study in the figure list (Figure 18) is limited to Multi-MNIST with Hyper-MLP under the Box constraint; broader ablations would strengthen the case.

This is aimed at people working on constrained multi-objective learning in ML. It has enough new pieces and reported gains to merit a serious referee, though the authors will need to tighten the scope of the theoretical claims and add more detail on the non-convex regime. I would send it out for review rather than desk-reject.

Referee Report

2 major / 2 minor

Summary. The paper addresses controllable Pareto front learning (CPFL) under split feasibility by reformulating the problem as a Bi-Level Scalarized Split Problem (BSSP). It introduces the Adaptive Balanced Penalty (ABP) algorithm that blends optimality, set feasibility, and image feasibility gradients via an adaptive indicator. A novel convex surrogate technique is used to prove full-sequence convergence under standard convexity and Robbins-Monro step-size conditions. The ABP structure is translated into a two-phase feasibility-first training procedure for Hyper-MLP and HyperTrans hypernetworks (ABP-HyperNet). A new Expected Feasible Hypervolume (EFHV) metric is proposed to evaluate both quality and feasibility. Experiments on five multi-objective benchmarks and three multi-task datasets report improved feasibility rates (87-100%) and up to 2.3x higher EFHV versus unconstrained baselines.

Significance. If the convergence result holds and the guarantees transfer to the hypernetwork setting, the work would advance constrained multi-objective optimization by providing the first rigorous full-sequence convergence for controllable Pareto front learning with split feasibility. The convex surrogate technique, the two-phase training heuristic, and the EFHV metric are potentially useful contributions for practical hypernetwork-based Pareto approximation in multi-task learning.

major comments (2)

The convergence proof (abstract and theoretical section) establishes full-sequence convergence for the ABP solver on the BSSP under convexity and Robbins-Monro assumptions via the convex surrogate. However, the central application translates ABP into two-phase training of Hyper-MLP and HyperTrans architectures, whose parameter spaces are non-convex. No argument shows that the surrogate technique or the two-phase heuristic inherits the same guarantees; the reported EFHV gains remain purely empirical. This gap is load-bearing for the claim of 'rigorous theoretical guarantees' for ABP-HyperNet.
The weakest assumption listed (standard convexity of the BSSP) is invoked for the proof, yet the manuscript does not verify or relax this assumption when the BSSP is instantiated inside a hypernetwork whose outer optimization is non-convex. A concrete test or counter-example analysis for the non-convex regime would be required to support the transfer.

minor comments (2)

The definition and computability of the 'computable lower bound' driving the adaptive indicator should be stated explicitly with pseudocode or an equation reference.
Clarify whether the five benchmark experiments validate the ABP solver in isolation or already include the hypernetwork training; the distinction affects how the theoretical guarantees are claimed to support the empirical results.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments and for highlighting the potential impact of our contributions to constrained CPFL. We address each major comment below, clarifying the scope of our theoretical results and the practical nature of the hypernetwork extension.

Referee: The convergence proof (abstract and theoretical section) establishes full-sequence convergence for the ABP solver on the BSSP under convexity and Robbins-Monro assumptions via the convex surrogate. However, the central application translates ABP into two-phase training of Hyper-MLP and HyperTrans architectures, whose parameter spaces are non-convex. No argument shows that the surrogate technique or the two-phase heuristic inherits the same guarantees; the reported EFHV gains remain purely empirical. This gap is load-bearing for the claim of 'rigorous theoretical guarantees' for ABP-HyperNet.

Authors: We agree that the full-sequence convergence proof via the convex surrogate applies specifically to the ABP solver on the convex BSSP under the stated assumptions. ABP-HyperNet translates the ABP penalty structure into a two-phase feasibility-first training procedure for the hypernetworks, but this is presented as a practical heuristic rather than a direct application of the convergence result. The manuscript does not claim that the guarantees transfer to the non-convex hypernetwork parameter space, and the EFHV improvements are empirical. We will revise the abstract, Section 1, and the conclusion to explicitly separate the theoretical guarantees (ABP on BSSP) from the empirical results (ABP-HyperNet). A dedicated limitations paragraph will be added to discuss this distinction. revision: yes
Referee: The weakest assumption listed (standard convexity of the BSSP) is invoked for the proof, yet the manuscript does not verify or relax this assumption when the BSSP is instantiated inside a hypernetwork whose outer optimization is non-convex. A concrete test or counter-example analysis for the non-convex regime would be required to support the transfer.

Authors: The convexity assumption is required for the BSSP convergence analysis. In the hypernetwork setting the outer optimization over network parameters is non-convex, and the manuscript does not verify, relax, or provide counter-example analysis for this regime. We will add a discussion subsection noting this limitation and clarifying that the two-phase procedure is motivated by the ABP structure to prioritize feasibility in practice, without inheriting the convexity-based guarantees. A full non-convex analysis or counter-example study lies outside the current scope. revision: partial

standing simulated objections not resolved

A concrete test or counter-example analysis for the non-convex regime in hypernetwork training

Circularity Check

0 steps flagged

No circularity: derivation relies on external standard assumptions and independent definitions

The paper reformulates the constrained Pareto problem as BSSP, introduces the ABP algorithm with three gradient components blended via an adaptive indicator, and proves full-sequence convergence via a novel convex surrogate technique under explicitly stated standard convexity and Robbins-Monro step-size assumptions. The two-phase feasibility-first training for Hyper-MLP and HyperTrans, along with the EFHV metric, are defined directly from the penalty structure without reducing to fitted inputs or prior self-citations. No load-bearing step equates a claimed result to its own inputs by construction; the proof chain is self-contained against external mathematical benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 2 invented entities

The central claims rest on standard optimization assumptions for convergence and the introduction of new algorithmic structures and evaluation metric without independent external validation in the abstract.

free parameters (1)

adaptive indicator parameters
Used to blend optimality, set feasibility, and image feasibility gradients based on computable lower bound

axioms (2)

domain assumption · Standard convexity assumptions
Invoked for proving full-sequence convergence of the ABP algorithm
standard math · Robbins-Monro step-size assumptions
Required for stochastic convergence guarantees in the proof

invented entities (2)

Adaptive Balanced Penalty (ABP) algorithm · no independent evidence
purpose: To solve the Bi-Level Scalarized Split Problem for constrained CPFL
New penalty structure with adaptive indicator proposed in the paper
Expected Feasible Hypervolume (EFHV) · no independent evidence
purpose: To jointly capture solution quality and constraint satisfaction in constrained CPFL
New evaluation metric introduced for the constrained setting

pith-pipeline@v0.9.0 · 5738 in / 1445 out tokens · 68252 ms · 2026-05-20T06:56:09.801242+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "Using a novel convex surrogate technique, we prove full-sequence convergence under standard convexity and Robbins–Monro step-size assumptions."

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear
Paper passage: "The ABP penalty structure is then translated into a two-phase, feasibility-first training strategy for Hyper-MLP and HyperTrans architectures"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages · 1 internal anchor

[1]

Constrained policy optimization, in: International Conference on Machine Learning (ICML), pp

Achiam, J., Held, D., Tamar, A., Abbeel, P., 2017. Constrained policy optimization, in: International Conference on Machine Learning (ICML), pp. 22--31

work page 2017
[2]

A reductions approach to fair classification, in: International Conference on Machine Learning (ICML), pp

Agarwal, A., Beygelzimer, A., Dud\' i k, M., Langford, J., Wallach, H., 2018. A reductions approach to fair classification, in: International Conference on Machine Learning (ICML), pp. 60--69

work page 2018
[3]

Constrained Markov decision processes

Altman, E., 1999. Constrained Markov decision processes. Chapman & Hall/CRC, Boca Raton, FL

work page 1999
[4]

Convex analysis and monotone operator theory in Hilbert spaces

Bauschke, H.H., Combettes, P.L., 2017. Convex analysis and monotone operator theory in Hilbert spaces. 2nd ed., Springer, Cham

work page 2017
[5]

Nonlinear programming

Bertsekas, D.P., 1999. Nonlinear programming. 2nd ed., Athena Scientific, Belmont, MA

work page 1999
[6]

Mobes: A multiobjective evolution strategy for constrained optimization problems, in: The third international conference on genetic algorithms (Mendel 97), p

Binh, T.T., Korn, U., 1997. Mobes: A multiobjective evolution strategy for constrained optimization problems, in: The third international conference on genetic algorithms (Mendel 97), p. 27

work page 1997
[7]

Dynamic string-averaging cq-methods for the split feasibility problem with percentage violation constraints arising in radiation therapy treatment planning

Brooke, M., Censor, Y., Gibali, A., 2021. Dynamic string-averaging cq-methods for the split feasibility problem with percentage violation constraints arising in radiation therapy treatment planning. International Transactions in Operational Research 30, 181--205

work page 2021
[8]

Iterative oblique projection onto convex sets and the split feasibility problem

Byrne, C., 2002. Iterative oblique projection onto convex sets and the split feasibility problem. Inverse problems 18, 441

work page 2002
[9]

A unified treatment of some iterative algorithms in signal processing and image reconstruction

Byrne, C., 2004. A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Problems 20, 103--120

work page 2004
[10]

Multi-objective optimization method for enhancing chemical reaction process

Cao, X., Jia, S., Luo, Y., Yuan, X., Qi, Z., Yu, K.T., 2019. Multi-objective optimization method for enhancing chemical reaction process. Chemical Engineering Science 195, 494--506

work page 2019
[11]

A multiprojection algorithm using bregman projections in a product space

Censor, Y., Elfving, T., 1994. A multiprojection algorithm using bregman projections in a product space. Numerical Algorithms 8, 221--239

work page 1994
[12]

The multiple-sets split feasibility problem and its applications for inverse problems

Censor, Y., Elfving, T., Kopf, N., Bortfeld, T., 2005. The multiple-sets split feasibility problem and its applications for inverse problems. Inverse problems 21, 2071

work page 2005
[13]

Algorithms for the split variational inequality problem

Censor, Y., Gibali, A., Reich, S., 2012. Algorithms for the split variational inequality problem. Numerical Algorithms 59, 301--323

work page 2012
[14]

Multicriteria optimization

Ehrgott, M., 2005. Multicriteria optimization. volume 491. Springer Science & Business Media

work page 2005
[15]

Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels

Emmerich, M.T.M., Giannakoglou, K.C., Naujoks, B., 2006. Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels. IEEE Transactions on Evolutionary Computation 10, 421--439

work page 2006
[16]

Bayesian optimization with unknown constraints, in: Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI), pp

Gelbart, M.A., Snoek, J., Adams, R.P., 2014. Bayesian optimization with unknown constraints, in: Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 250--259

work page 2014
[17]

Fundamentals of convex analysis

Hiriart-Urruty, J.-B., Lemar\' e chal, C., 2004. Fundamentals of convex analysis. Springer, Berlin

work page 2004
[18]

Elitist non-dominated sorting harris hawks optimization: Framework and developments for multi-objective problems

Jangir, P., Heidari, A.A., Chen, H., 2021. Elitist non-dominated sorting harris hawks optimization: Framework and developments for multi-objective problems. Expert Systems with Applications 186, 115747

work page 2021
[19]

Optimization over the efficient set of a bicriteria convex programming problem

Kim, N.T.B., Thang, T.N., 2013. Optimization over the efficient set of a bicriteria convex programming problem. Pac. J. Optim. 9, 103--115

work page 2013
[20]

Iteration-complexity of first-order penalty methods for convex programming

Lan, G., Monteiro, R.D.C., 2013. Iteration-complexity of first-order penalty methods for convex programming. Mathematical Programming 138, 115--139

work page 2013
[21]

Pareto multi-task learning, in: Thirty-third Conference on Neural Information Processing Systems (NeurIPS), pp

Lin, X., Zhen, H.L., Li, Z., Zhang, Q., Kwong, S., 2019. Pareto multi-task learning, in: Thirty-third Conference on Neural Information Processing Systems (NeurIPS), pp. 12037--12047

work page 2019
[22]

Pareto set learning for expensive multi-objective optimization

Lin, X., Yang, Z., Zhang, Q., 2022. Pareto set learning for expensive multi-objective optimization. Advances in Neural Information Processing Systems 35, 16298--16310

work page 2022
[23]

Nonlinear multiobjective optimization

Miettinen, K., 1999. Nonlinear multiobjective optimization. Kluwer Academic Publishers, Boston

work page 1999
[24]

Learning the Pareto front with hypernetworks, in: International Conference on Learning Representations (ICLR)

Navon, A., Shamsian, A., Chechik, G., Fetaya, E., 2021. Learning the Pareto front with hypernetworks, in: International Conference on Learning Representations (ICLR)

work page 2021
[25]

Introductory lectures on convex optimization: a basic course

Nesterov, Y., 2004. Introductory lectures on convex optimization: a basic course. Kluwer Academic Publishers, Boston

work page 2004
[26]

Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

Raissi, M., Perdikaris, P., Karniadakis, G.E., 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686--707

work page 2019
[27]

A stochastic approximation method

Robbins, H., Monro, S., 1951. A stochastic approximation method. Annals of Mathematical Statistics 22, 400--407

work page 1951
[28]

A convergence theorem for non-negative almost supermartingales and some applications, in: Rustagi, J.S

Robbins, H., Siegmund, D., 1971. A convergence theorem for non-negative almost supermartingales and some applications, in: Rustagi, J.S. (Ed.), Optimizing methods in statistics. Academic Press, New York, pp. 233--257

work page 1971
[29]

Variational analysis

Rockafellar, R.T., Wets, R.J.-B., 2009. Variational analysis. Springer, Berlin

work page 2009
[30]

Convex analysis

Rockafellar, R.T., 1970. Convex analysis. Princeton University Press, Princeton, NJ

work page 1970
[31]

Dynamic routing between capsules, in: Advances in Neural Information Processing Systems (NeurIPS), pp

Sabour, S., Frosst, N., Hinton, G.E., 2017. Dynamic routing between capsules, in: Advances in Neural Information Processing Systems (NeurIPS), pp. 3859--3869

work page 2017
[32]

Multi-task learning as multi-objective optimization

Sener, O., Koltun, V., 2018. Multi-task learning as multi-objective optimization. Advances in neural information processing systems 31

work page 2018
[33]

A monotonic optimization approach for solving strictly quasiconvex multiobjective programming problems

Thang, T.N., Solanki, V.K., Dao, T.A., Thi Ngoc Anh, N., Van Hai, P., 2020. A monotonic optimization approach for solving strictly quasiconvex multiobjective programming problems. Journal of Intelligent & Fuzzy Systems 38, 6053--6063

work page 2020
[34]

A framework for controllable Pareto front learning with completed scalarization functions and its applications

Tuan, T.A., Hoang, L.P., Le, D.D., Thang, T.N., 2024. A framework for controllable Pareto front learning with completed scalarization functions and its applications. Neural Networks 169, 257--273

work page 2024
[35]

A HyperTrans model for controllable Pareto front learning with split feasibility constraints

Tuan, T.A., Dung, N.V., Thang, T.N., 2024. A HyperTrans model for controllable Pareto front learning with split feasibility constraints. Neural Networks 179, 106571

work page 2024
[36]

Optimizing over pareto set of semistrictly quasiconcave vector maximization and application to stochastic portfolio selection

Vuong, N.D., Thang, T.N., 2023. Optimizing over pareto set of semistrictly quasiconcave vector maximization and application to stochastic portfolio selection. Journal of Industrial and Management Optimization 19, 1999--2019

[37]

Xiao, H., Rasul, K., Vollgraf, R., 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747

[38]

Yun, C., Bhojanapalli, S., Rawat, A.S., Reddi, S.J., Kumar, S., 2020. Are transformers universal approximators of sequence-to-sequence functions?, in: International Conference on Learning Representations (ICLR)

[39]

Zitzler, E., Thiele, L., 1999. Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE Transactions on Evolutionary Computation 3, 257--271

[40]

Zitzler, E., Deb, K., Thiele, L., 2000. Comparison of multiobjective evolutionary algorithms: Empirical results. Evolutionary Computation 8, 173--195


[1] [1]

Constrained policy optimization, in: International Conference on Machine Learning (ICML), pp

Achiam, J., Held, D., Tamar, A., Abbeel, P., 2017. Constrained policy optimization, in: International Conference on Machine Learning (ICML), pp. 22--31

work page 2017

[2] [2]

A reductions approach to fair classification, in: International Conference on Machine Learning (ICML), pp

Agarwal, A., Beygelzimer, A., Dud\' i k, M., Langford, J., Wallach, H., 2018. A reductions approach to fair classification, in: International Conference on Machine Learning (ICML), pp. 60--69

work page 2018

[3] [3]

Constrained Markov decision processes

Altman, E., 1999. Constrained Markov decision processes. Chapman & Hall/CRC, Boca Raton, FL

work page 1999

[4] [4]

Convex analysis and monotone operator theory in Hilbert spaces

Bauschke, H.H., Combettes, P.L., 2017. Convex analysis and monotone operator theory in Hilbert spaces. 2nd ed., Springer, Cham

work page 2017

[5] [5]

Nonlinear programming

Bertsekas, D.P., 1999. Nonlinear programming. 2nd ed., Athena Scientific, Belmont, MA

work page 1999

[6] [6]

Mobes: A multiobjective evolution strategy for constrained optimization problems, in: The third international conference on genetic algorithms (Mendel 97), p

Binh, T.T., Korn, U., 1997. Mobes: A multiobjective evolution strategy for constrained optimization problems, in: The third international conference on genetic algorithms (Mendel 97), p. 27

work page 1997

[7] [7]

Dynamic string-averaging cq-methods for the split feasibility problem with percentage violation constraints arising in radiation therapy treatment planning

Brooke, M., Censor, Y., Gibali, A., 2021. Dynamic string-averaging cq-methods for the split feasibility problem with percentage violation constraints arising in radiation therapy treatment planning. International Transactions in Operational Research 30, 181--205

work page 2021

[8] [8]

Iterative oblique projection onto convex sets and the split feasibility problem

Byrne, C., 2002. Iterative oblique projection onto convex sets and the split feasibility problem. Inverse problems 18, 441

work page 2002

[9] [9]

A unified treatment of some iterative algorithms in signal processing and image reconstruction

Byrne, C., 2004. A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Problems 20, 103--120

work page 2004

[10] [10]

Multi-objective optimization method for enhancing chemical reaction process

Cao, X., Jia, S., Luo, Y., Yuan, X., Qi, Z., Yu, K.T., 2019. Multi-objective optimization method for enhancing chemical reaction process. Chemical Engineering Science 195, 494--506

work page 2019

[11] [11]

A multiprojection algorithm using bregman projections in a product space

Censor, Y., Elfving, T., 1994. A multiprojection algorithm using bregman projections in a product space. Numerical Algorithms 8, 221--239

work page 1994

[12] [12]

The multiple-sets split feasibility problem and its applications for inverse problems

Censor, Y., Elfving, T., Kopf, N., Bortfeld, T., 2005. The multiple-sets split feasibility problem and its applications for inverse problems. Inverse problems 21, 2071

work page 2005

[13] [13]

Algorithms for the split variational inequality problem

Censor, Y., Gibali, A., Reich, S., 2012. Algorithms for the split variational inequality problem. Numerical Algorithms 59, 301--323

work page 2012

[14] [14]

Multicriteria optimization

Ehrgott, M., 2005. Multicriteria optimization. volume 491. Springer Science & Business Media

work page 2005

[15] [15]

Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels

Emmerich, M.T.M., Giannakoglou, K.C., Naujoks, B., 2006. Single- and multiobjective evolutionary optimization assisted by Gaussian random field metamodels. IEEE Transactions on Evolutionary Computation 10, 421--439

work page 2006

[16] [16]

Bayesian optimization with unknown constraints, in: Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI), pp

Gelbart, M.A., Snoek, J., Adams, R.P., 2014. Bayesian optimization with unknown constraints, in: Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI), pp. 250--259

work page 2014

[17] [17]

Fundamentals of convex analysis

Hiriart-Urruty, J.-B., Lemar\' e chal, C., 2004. Fundamentals of convex analysis. Springer, Berlin

work page 2004

[18] [18]

Elitist non-dominated sorting harris hawks optimization: Framework and developments for multi-objective problems

Jangir, P., Heidari, A.A., Chen, H., 2021. Elitist non-dominated sorting harris hawks optimization: Framework and developments for multi-objective problems. Expert Systems with Applications 186, 115747

work page 2021

[19] [19]

Optimization over the efficient set of a bicriteria convex programming problem

Kim, N.T.B., Thang, T.N., 2013. Optimization over the efficient set of a bicriteria convex programming problem. Pac. J. Optim. 9, 103--115

work page 2013

[20] [20]

Iteration-complexity of first-order penalty methods for convex programming

Lan, G., Monteiro, R.D.C., 2013. Iteration-complexity of first-order penalty methods for convex programming. Mathematical Programming 138, 115--139

work page 2013

[21] [21]

Pareto multi-task learning, in: Thirty-third Conference on Neural Information Processing Systems (NeurIPS), pp

Lin, X., Zhen, H.L., Li, Z., Zhang, Q., Kwong, S., 2019. Pareto multi-task learning, in: Thirty-third Conference on Neural Information Processing Systems (NeurIPS), pp. 12037--12047

work page 2019

[22] [22]

Pareto set learning for expensive multi-objective optimization

Lin, X., Yang, Z., Zhang, Q., 2022. Pareto set learning for expensive multi-objective optimization. Advances in Neural Information Processing Systems 35, 16298--16310

work page 2022

[23] [23]

Nonlinear multiobjective optimization

Miettinen, K., 1999. Nonlinear multiobjective optimization. Kluwer Academic Publishers, Boston

work page 1999

[24] [24]

Learning the Pareto front with hypernetworks, in: International Conference on Learning Representations (ICLR)

Navon, A., Shamsian, A., Chechik, G., Fetaya, E., 2021. Learning the Pareto front with hypernetworks, in: International Conference on Learning Representations (ICLR)

work page 2021

[25] [25]

Introductory lectures on convex optimization: a basic course

Nesterov, Y., 2004. Introductory lectures on convex optimization: a basic course. Kluwer Academic Publishers, Boston

work page 2004

[26] [26]

Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

Raissi, M., Perdikaris, P., Karniadakis, G.E., 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics 378, 686--707

work page 2019

[27] [27]

A stochastic approximation method

Robbins, H., Monro, S., 1951. A stochastic approximation method. Annals of Mathematical Statistics 22, 400--407

work page 1951

[28] [28]

A convergence theorem for non-negative almost supermartingales and some applications, in: Rustagi, J.S

Robbins, H., Siegmund, D., 1971. A convergence theorem for non-negative almost supermartingales and some applications, in: Rustagi, J.S. (Ed.), Optimizing methods in statistics. Academic Press, New York, pp. 233--257

work page 1971

[29] [29]

Variational analysis

Rockafellar, R.T., Wets, R.J.-B., 2009. Variational analysis. Springer, Berlin

work page 2009

[30] [30]

Convex analysis

Rockafellar, R.T., 1970. Convex analysis. Princeton University Press, Princeton, NJ

work page 1970

[31] [31]

Dynamic routing between capsules, in: Advances in Neural Information Processing Systems (NeurIPS), pp

Sabour, S., Frosst, N., Hinton, G.E., 2017. Dynamic routing between capsules, in: Advances in Neural Information Processing Systems (NeurIPS), pp. 3859--3869

work page 2017

[32] [32]

Multi-task learning as multi-objective optimization

Sener, O., Koltun, V., 2018. Multi-task learning as multi-objective optimization. Advances in neural information processing systems 31

work page 2018

[33] [33]

A monotonic optimization approach for solving strictly quasiconvex multiobjective programming problems

Thang, T.N., Solanki, V.K., Dao, T.A., Thi Ngoc Anh, N., Van Hai, P., 2020. A monotonic optimization approach for solving strictly quasiconvex multiobjective programming problems. Journal of Intelligent & Fuzzy Systems 38, 6053--6063

work page 2020

[34] [34]

A framework for controllable Pareto front learning with completed scalarization functions and its applications

Tuan, T.A., Hoang, L.P., Le, D.D., Thang, T.N., 2024. A framework for controllable Pareto front learning with completed scalarization functions and its applications. Neural Networks 169, 257--273

work page 2024

[35] [35]

A HyperTrans model for controllable Pareto front learning with split feasibility constraints

Tuan, T.A., Dung, N.V., Thang, T.N., 2024. A HyperTrans model for controllable Pareto front learning with split feasibility constraints. Neural Networks 179, 106571

work page 2024

[36] [36]

Optimizing over pareto set of semistrictly quasiconcave vector maximization and application to stochastic portfolio selection

Vuong, N.D., Thang, T.N., 2023. Optimizing over pareto set of semistrictly quasiconcave vector maximization and application to stochastic portfolio selection. Journal of Industrial and Management Optimization 19, 1999--2019

work page 2023

[37] [37]

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

Xiao, H., Rasul, K., Vollgraf, R., 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747

work page internal anchor Pith review Pith/arXiv arXiv 2017

[38] [38]

Are transformers universal approximators of sequence-to-sequence functions?, in: International Conference on Learning Representations (ICLR)

Yun, C., Bhojanapalli, S., Rawat, A.S., Reddi, S.J., Kumar, S., 2020. Are transformers universal approximators of sequence-to-sequence functions?, in: International Conference on Learning Representations (ICLR)

work page 2020

[39] [39]

Multiobjective evolutionary algorithms: a comparative case study and the strength pareto approach

Zitzler, E., Thiele, L., 1999. Multiobjective evolutionary algorithms: a comparative case study and the strength pareto approach. IEEE transactions on Evolutionary Computation 3, 257--271

work page 1999

[40] [40]

Comparison of multiobjective evolutionary algorithms: Empirical results

Zitzler, E., Deb, K., Thiele, L., 2000. Comparison of multiobjective evolutionary algorithms: Empirical results. Evolutionary computation 8, 173--195

work page 2000