Pareto Optimization with Robust Evaluation for Noisy Subset Selection

Bin Zhang; Chao Qian; Danxuan Liu; Weiyong Yang; Yiheng Xu

arxiv: 2501.06813 · v2 · submitted 2025-01-12 · 💻 cs.NE

Pareto Optimization with Robust Evaluation for Noisy Subset Selection

Yiheng Xu , Danxuan Liu , Bin Zhang , Weiyong Yang , Chao Qian This is my paper

Pith reviewed 2026-05-23 06:17 UTC · model grok-4.3

classification 💻 cs.NE

keywords noisy subset selectionPareto optimizationrobust evaluationinfluence maximizationsparse regressioncardinality constraintevolutionary algorithmsmulti-objective optimization

0 comments

The pith

PORE maximizes a robust evaluation function while minimizing subset size to find high-quality solutions under noisy evaluations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PORE for the noisy subset selection problem with a cardinality constraint, where objective evaluations contain noise. It uses Pareto optimization to maximize a robust version of the objective and minimize subset size at the same time. This targets the weaknesses of the greedy algorithm, which struggles with noise, and of prior evolutionary methods like POSS and PONSS, which either perform poorly or demand high computation. Experiments on influence maximization and sparse regression datasets show PORE produces better results. A reader would care because noisy evaluations are common in real applications, and an efficient method could improve outcomes without requiring many repeated measurements.

Core claim

The paper claims that PORE, by maximizing a robust evaluation function and minimizing subset size simultaneously via Pareto optimization, efficiently identifies well-structured solutions for noisy subset selection and outperforms the classical greedy algorithm, POSS, and PONSS on real-world datasets for influence maximization and sparse regression, with ablation studies confirming the value of the robust evaluation component.

What carries the argument

Pareto Optimization with Robust Evaluation (PORE), an algorithm that treats a noise-robust objective as one objective and subset size as the second objective in a multi-objective evolutionary search.

If this is right

PORE handles noise without the high resource use seen in PONSS.
It delivers higher-quality subsets than greedy or POSS across the tested noisy influence maximization and sparse regression tasks.
Ablation confirms that the robust evaluation step drives the performance gain.
The approach works for cardinality-constrained subset selection where evaluations are noisy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same robust-plus-Pareto structure could be tried on other noisy combinatorial problems such as noisy set cover or facility location.
In applications with expensive evaluations, the method may reduce the total number of queries needed compared with methods that rely on repeated sampling.
Different noise distributions might require tailored robust functions, which could be tested by varying the noise model in the same experimental setup.

Load-bearing premise

The robust evaluation function, when optimized jointly with subset size, will reliably surface high-quality subsets despite noise without the search becoming biased or too slow.

What would settle it

On a held-out noisy influence maximization instance, PORE returns subsets whose measured influence is no higher than those returned by the greedy algorithm or PONSS after equal computational budget.

Figures

Figures reproduced from arXiv: 2501.06813 by Bin Zhang, Chao Qian, Danxuan Liu, Weiyong Yang, Yiheng Xu.

**Figure 1.** Figure 1: Influence maximization (influence spread: the larger the better). The left subfigure on each dataset: influence spread vs [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗

**Figure 2.** Figure 2: Sparse regression (R2 : the larger the better). The left subfigure on each dataset: R2 vs budget k. The right subfigure on each dataset: R2 vs running time of PORE, PONSS and POSS for k = 14. 5 6 7 8 9 10 k 1500 1700 1900 2100 Influence Spread PORE PORE-F PONSS (a) ego-Facebook 5 6 7 8 9 10 k 200 300 400 500 Influence Spread PORE PORE-F PONSS (b) HepPh 10 12 14 16 18 20 k 0.08 0.11 0.14 0.17 R 2 PORE PORE-… view at source ↗

**Figure 3.** Figure 3: The ablation experiments on the robust evaluation, where PORE-F denotes PORE without robust evaluation. [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Performance of the algorithms under different noise intensities. The left subfigure on [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: Performance of PORE with different θ values on the application of influence maximization, where the dataset is ego-Facebook and the budget k = 7. VI. CONCLUSION In this paper, we propose a new algorithm based on Pareto optimization with robust evaluation for noisy subset selection, named PORE. The robust evaluation method assesses the quality of a solution by calculating the average value of a noisy functi… view at source ↗

read the original abstract

Subset selection is a fundamental problem in combinatorial optimization, which has a wide range of applications such as influence maximization and sparse regression. The goal is to select a subset of limited size from a ground set in order to maximize a given objective function. However, the evaluation of the objective function in real-world scenarios is often noisy. Previous algorithms, including the greedy algorithm and multi-objective evolutionary algorithms POSS and PONSS, either struggle in noisy environments or consume excessive computational resources. In this paper, we focus on the noisy subset selection problem with a cardinality constraint, where the evaluation of a subset is noisy. We propose a novel approach based on Pareto Optimization with Robust Evaluation for noisy subset selection (PORE), which maximizes a robust evaluation function and minimizes the subset size simultaneously. PORE can efficiently identify well-structured solutions and handle computational resources, addressing the limitations observed in PONSS. Our experiments, conducted on real-world datasets for influence maximization and sparse regression, demonstrate that PORE significantly outperforms previous methods, including the classical greedy algorithm, POSS, and PONSS. Further validation through ablation studies confirms the effectiveness of our robust evaluation function.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes PORE, a Pareto-based multi-objective evolutionary algorithm that simultaneously maximizes a robust evaluation function and minimizes subset cardinality for noisy subset selection. It targets applications including influence maximization and sparse regression, claiming to address limitations of the greedy algorithm, POSS, and PONSS in noisy settings while using fewer computational resources. Experiments on real-world datasets are reported to show significant outperformance, with ablation studies confirming the value of the robust evaluation component.

Significance. If the empirical claims are supported by reproducible noise models, statistical tests, and baseline implementations, the work could provide a practical advance for noisy combinatorial optimization by balancing solution quality and efficiency. The focus on real-world datasets and explicit handling of computational constraints in PONSS are positive elements.

major comments (3)

[§4] §4 (Experiments): The noise model is not specified (e.g., additive Gaussian variance, multiplicative noise, or oracle query model), preventing assessment of whether PORE's robustness holds beyond the tested instances or generalizes to other noise regimes.
[Table 2 and Table 3] Table 2 and Table 3: No statistical significance tests (p-values, confidence intervals, or non-parametric tests across runs) are reported for the claimed outperformance over greedy, POSS, and PONSS; mean values alone do not establish that differences are reliable rather than due to run variance.
[§3.2] §3.2 (Robust Evaluation): The exact definition and derivation of the robust evaluation function are presented at a high level without the closed-form expression or analysis showing why it is less sensitive to noise than the original objective; this is load-bearing for the central algorithmic claim.

minor comments (2)

[Abstract] The abstract states 'significantly outperforms' without any quantitative deltas or reference to the specific tables/figures that support this.
[§4.1] Baseline implementations (especially PONSS) should include explicit parameter settings and runtime measurements to allow direct comparison of computational resource usage.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the paper to incorporate the requested clarifications and additions.

read point-by-point responses

Referee: [§4] §4 (Experiments): The noise model is not specified (e.g., additive Gaussian variance, multiplicative noise, or oracle query model), preventing assessment of whether PORE's robustness holds beyond the tested instances or generalizes to other noise regimes.

Authors: We agree that an explicit description of the noise model is necessary for reproducibility and to assess generalizability. In the revised manuscript, Section 4 will be updated to specify the noise model (including type and parameters) used in the experiments on the real-world datasets. revision: yes
Referee: [Table 2 and Table 3] Table 2 and Table 3: No statistical significance tests (p-values, confidence intervals, or non-parametric tests across runs) are reported for the claimed outperformance over greedy, POSS, and PONSS; mean values alone do not establish that differences are reliable rather than due to run variance.

Authors: We concur that statistical tests are required to support the performance claims. The revised Tables 2 and 3 will report standard deviations across independent runs together with results from a non-parametric test (Wilcoxon signed-rank test) and associated p-values comparing PORE against the baselines. revision: yes
Referee: [§3.2] §3.2 (Robust Evaluation): The exact definition and derivation of the robust evaluation function are presented at a high level without the closed-form expression or analysis showing why it is less sensitive to noise than the original objective; this is load-bearing for the central algorithmic claim.

Authors: We will expand Section 3.2 to include the precise closed-form expression of the robust evaluation function and add a short derivation/analysis explaining its reduced sensitivity to noise relative to the original objective. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper proposes the PORE algorithm for noisy subset selection under cardinality constraints and validates it empirically on influence maximization and sparse regression tasks. No derivation chain, first-principles result, or prediction is claimed that reduces by construction to fitted inputs, self-definitions, or self-citation load-bearing premises. The central claim rests on experimental outperformance and ablation studies, which are independent of any internal reduction to the method's own parameters or prior self-citations. Self-citations to POSS/PONSS (if present) support baseline comparisons but do not underpin any uniqueness theorem or ansatz used in the new method.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract alone supplies no concrete equations or implementation details, preventing identification of specific free parameters, axioms, or invented entities; the robust evaluation function is referenced but not defined.

pith-pipeline@v0.9.0 · 5733 in / 1034 out tokens · 59087 ms · 2026-05-23T06:17:46.246867+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages · 1 internal anchor

[1]

Maximizing the spread of influence through a social network,

D. Kempe, J. Kleinberg, and ´E. Tardos, “Maximizing the spread of influence through a social network,” in Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY , 2003, pp. 137–146

work page 2003
[2]

Miller, Subset selection in regression

A. Miller, Subset selection in regression. chapman and hall/CRC, 2002

work page 2002
[3]

Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection

A. Das and D. Kempe, “Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection,” arXiv preprint arXiv:1102.3975 , 2011

work page internal anchor Pith review Pith/arXiv arXiv 2011
[4]

Practical feature subset selection for machine learning,

M. A. Hall and L. A. Smith, “Practical feature subset selection for machine learning,” 1998

work page 1998
[5]

Unsupervised feature selection by pareto optimization,

C. Feng, C. Qian, and K. Tang, “Unsupervised feature selection by pareto optimization,” in Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, HI, 2019, pp. 3534–3541

work page 2019
[6]

A threshold of ln n for approximating set cover,

U. Feige, “A threshold of ln n for approximating set cover,” Journal of the ACM, vol. 45, no. 4, pp. 634–652, 1998

work page 1998
[7]

Analysis of the greedy approach in problems of maximum k-coverage,

D. S. Hochbaum and A. Pathria, “Analysis of the greedy approach in problems of maximum k-coverage,” Naval Research Logistics , vol. 45, no. 6, pp. 615–627, 1998

work page 1998
[8]

Best algorithms for approximating the maximum of a submodular set function,

G. L. Nemhauser and L. A. Wolsey, “Best algorithms for approximating the maximum of a submodular set function,” Mathematics of Operations Research, vol. 3, no. 3, pp. 177–188, 1978

work page 1978
[9]

Subset selection by Pareto opti- mization,

C. Qian, Y . Yu, and Z.-H. Zhou, “Subset selection by Pareto opti- mization,” in Advances in Neural Information Processing Systems 28 , Montreal, Canada, 2015, pp. 1774–1782

work page 2015
[10]

Scalable influence maximization for prevalent viral marketing in large-scale social networks,

W. Chen, C. Wang, and Y . Wang, “Scalable influence maximization for prevalent viral marketing in large-scale social networks,” in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , Washington, DC, 2010, pp. 1029–1038

work page 2010
[11]

Noisy submodular maxi- mization via adaptive sampling with applications to crowdsourced image collection summarization,

A. Singla, S. Tschiatschek, and A. Krause, “Noisy submodular maxi- mization via adaptive sampling with applications to crowdsourced image collection summarization,” in Proceedings of the 30th AAAI Conference on Artificial Intelligence , Phoenix, AZ, 2016, pp. 2037–2043

work page 2016
[12]

Submodular optimization under noise,

A. Hassidim and Y . Singer, “Submodular optimization under noise,” in Proceedings of the 30th Conference on Learning Theory , Amsterdam, The Netherlands, 2017, pp. 1069–1122

work page 2017
[13]

Optimization for approximate submodularity,

——, “Optimization for approximate submodularity,” in Advances in Neural Information Processing Systems 31 , Montr ´eal, Canada, 2018, pp. 394–405

work page 2018
[14]

Subset selection under noise,

C. Qian, J.-C. Shi, Y . Yu, K. Tang, and Z.-H. Zhou, “Subset selection under noise,” in Advances in Neural Information Processing Systems 30, Long Beach, CA, 2017, pp. 3560–3570

work page 2017
[15]

Distributed pareto optimization for large-scale noisy subset selection,

C. Qian, “Distributed pareto optimization for large-scale noisy subset selection,” IEEE Transactions on Evolutionary Computation , vol. 24, no. 4, pp. 694–707, 2020

work page 2020
[16]

Talk of the network: A complex systems look at the underlying process of word-of-mouth,

J. Goldenberg, B. Libai, and E. Muller, “Talk of the network: A complex systems look at the underlying process of word-of-mouth,” Marketing Letters, vol. 12, pp. 211–223, 2001

work page 2001
[17]

Diekhoff, Statistics for the social and behavioral sciences: Univari- ate, bivariate, and multivariate

G. Diekhoff, Statistics for the social and behavioral sciences: Univari- ate, bivariate, and multivariate . Wm. C. Brown Publishers, 1992

work page 1992
[18]

Applied multivariate statistical analysis,

R. A. Johnson, D. W. Wichern et al. , “Applied multivariate statistical analysis,” 2002

work page 2002
[19]

Maximizing submodular functions under matroid constraints by evolutionary algorithms,

T. Friedrich and F. Neumann, “Maximizing submodular functions under matroid constraints by evolutionary algorithms,” Evolutionary Compu- tation, vol. 23, no. 4, pp. 543–558, 2015

work page 2015
[20]

Efficient submodular optimization under noise: Local search is robust,

L. Huang, Y . Wang, C. Yang, and H. Zhou, “Efficient submodular optimization under noise: Local search is robust,” in Advances in Neural Information Processing Systems 35, New Orleans, LA, 2022, pp. 26 122– 26 134

work page 2022
[21]

Efficient influence maximization in so- cial networks,

W. Chen, Y . Wang, and S. Yang, “Efficient influence maximization in so- cial networks,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , Paris, France, 2009, pp. 199–208

work page 2009
[22]

Simpath: An efficient algorithm for influence maximization under the linear threshold model,

A. Goyal, W. Lu, and L. V . Lakshmanan, “Simpath: An efficient algorithm for influence maximization under the linear threshold model,” in Proceedings of the 11th International Conference on Data Mining , Vancouver, Canada, 2011, pp. 211–220

work page 2011

[1] [1]

Maximizing the spread of influence through a social network,

D. Kempe, J. Kleinberg, and ´E. Tardos, “Maximizing the spread of influence through a social network,” in Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY , 2003, pp. 137–146

work page 2003

[2] [2]

Miller, Subset selection in regression

A. Miller, Subset selection in regression. chapman and hall/CRC, 2002

work page 2002

[3] [3]

Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection

A. Das and D. Kempe, “Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection,” arXiv preprint arXiv:1102.3975 , 2011

work page internal anchor Pith review Pith/arXiv arXiv 2011

[4] [4]

Practical feature subset selection for machine learning,

M. A. Hall and L. A. Smith, “Practical feature subset selection for machine learning,” 1998

work page 1998

[5] [5]

Unsupervised feature selection by pareto optimization,

C. Feng, C. Qian, and K. Tang, “Unsupervised feature selection by pareto optimization,” in Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, HI, 2019, pp. 3534–3541

work page 2019

[6] [6]

A threshold of ln n for approximating set cover,

U. Feige, “A threshold of ln n for approximating set cover,” Journal of the ACM, vol. 45, no. 4, pp. 634–652, 1998

work page 1998

[7] [7]

Analysis of the greedy approach in problems of maximum k-coverage,

D. S. Hochbaum and A. Pathria, “Analysis of the greedy approach in problems of maximum k-coverage,” Naval Research Logistics , vol. 45, no. 6, pp. 615–627, 1998

work page 1998

[8] [8]

Best algorithms for approximating the maximum of a submodular set function,

G. L. Nemhauser and L. A. Wolsey, “Best algorithms for approximating the maximum of a submodular set function,” Mathematics of Operations Research, vol. 3, no. 3, pp. 177–188, 1978

work page 1978

[9] [9]

Subset selection by Pareto opti- mization,

C. Qian, Y . Yu, and Z.-H. Zhou, “Subset selection by Pareto opti- mization,” in Advances in Neural Information Processing Systems 28 , Montreal, Canada, 2015, pp. 1774–1782

work page 2015

[10] [10]

Scalable influence maximization for prevalent viral marketing in large-scale social networks,

W. Chen, C. Wang, and Y . Wang, “Scalable influence maximization for prevalent viral marketing in large-scale social networks,” in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , Washington, DC, 2010, pp. 1029–1038

work page 2010

[11] [11]

Noisy submodular maxi- mization via adaptive sampling with applications to crowdsourced image collection summarization,

A. Singla, S. Tschiatschek, and A. Krause, “Noisy submodular maxi- mization via adaptive sampling with applications to crowdsourced image collection summarization,” in Proceedings of the 30th AAAI Conference on Artificial Intelligence , Phoenix, AZ, 2016, pp. 2037–2043

work page 2016

[12] [12]

Submodular optimization under noise,

A. Hassidim and Y . Singer, “Submodular optimization under noise,” in Proceedings of the 30th Conference on Learning Theory , Amsterdam, The Netherlands, 2017, pp. 1069–1122

work page 2017

[13] [13]

Optimization for approximate submodularity,

——, “Optimization for approximate submodularity,” in Advances in Neural Information Processing Systems 31 , Montr ´eal, Canada, 2018, pp. 394–405

work page 2018

[14] [14]

Subset selection under noise,

C. Qian, J.-C. Shi, Y . Yu, K. Tang, and Z.-H. Zhou, “Subset selection under noise,” in Advances in Neural Information Processing Systems 30, Long Beach, CA, 2017, pp. 3560–3570

work page 2017

[15] [15]

Distributed pareto optimization for large-scale noisy subset selection,

C. Qian, “Distributed pareto optimization for large-scale noisy subset selection,” IEEE Transactions on Evolutionary Computation , vol. 24, no. 4, pp. 694–707, 2020

work page 2020

[16] [16]

Talk of the network: A complex systems look at the underlying process of word-of-mouth,

J. Goldenberg, B. Libai, and E. Muller, “Talk of the network: A complex systems look at the underlying process of word-of-mouth,” Marketing Letters, vol. 12, pp. 211–223, 2001

work page 2001

[17] [17]

Diekhoff, Statistics for the social and behavioral sciences: Univari- ate, bivariate, and multivariate

G. Diekhoff, Statistics for the social and behavioral sciences: Univari- ate, bivariate, and multivariate . Wm. C. Brown Publishers, 1992

work page 1992

[18] [18]

Applied multivariate statistical analysis,

R. A. Johnson, D. W. Wichern et al. , “Applied multivariate statistical analysis,” 2002

work page 2002

[19] [19]

Maximizing submodular functions under matroid constraints by evolutionary algorithms,

T. Friedrich and F. Neumann, “Maximizing submodular functions under matroid constraints by evolutionary algorithms,” Evolutionary Compu- tation, vol. 23, no. 4, pp. 543–558, 2015

work page 2015

[20] [20]

Efficient submodular optimization under noise: Local search is robust,

L. Huang, Y . Wang, C. Yang, and H. Zhou, “Efficient submodular optimization under noise: Local search is robust,” in Advances in Neural Information Processing Systems 35, New Orleans, LA, 2022, pp. 26 122– 26 134

work page 2022

[21] [21]

Efficient influence maximization in so- cial networks,

W. Chen, Y . Wang, and S. Yang, “Efficient influence maximization in so- cial networks,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , Paris, France, 2009, pp. 199–208

work page 2009

[22] [22]

Simpath: An efficient algorithm for influence maximization under the linear threshold model,

A. Goyal, W. Lu, and L. V . Lakshmanan, “Simpath: An efficient algorithm for influence maximization under the linear threshold model,” in Proceedings of the 11th International Conference on Data Mining , Vancouver, Canada, 2011, pp. 211–220

work page 2011