pith. sign in

arxiv: 2604.18820 · v2 · submitted 2026-04-20 · 📊 stat.ML · cs.LG· eess.SP· math.OC· stat.AP

Sparse Network Inference under Imperfect Detection and its Application to Ecological Networks

Pith reviewed 2026-05-10 03:11 UTC · model grok-4.3

classification 📊 stat.ML cs.LGeess.SPmath.OCstat.AP
keywords sparse network inferencelow-rank factorizationecological networksimperfect detectionnonnegative matrix factorizationbipartite networkssparsity regularizationADMM algorithm
0
0 comments X

The pith

A nonnegative low-rank factorization model with detection estimation and nonconvex sparsity penalties recovers latent similarity and connectivity structures from sparse ecological count data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to recover both within-group similarity patterns and cross-group interactions in bipartite ecological networks from count data that is sparse and observed under imperfect detection. It introduces a structured sparse nonnegative low-rank factorization that jointly estimates a constant detection probability while applying nonconvex ℓ_{1/2} regularization to the latent factors to enforce sparsity and maintain relative scale between similarity and connectivity components. This addresses limitations in prior models that either overlook detection issues or produce oversparse or unbalanced estimates. The resulting nonconvex nonsmooth problem is solved with an ADMM algorithm that includes adaptive penalization and scale-aware initialization, with proofs of asymptotic feasibility and KKT stationarity for cluster points. Experiments on synthetic data and real ecological networks show gains in recovering the latent factors and the induced structures over existing baselines.

Core claim

The central claim is that imposing nonconvex ℓ_{1/2} regularization on the latent similarity and connectivity factors within a nonnegative low-rank model, while estimating detection probability, yields sparse yet properly scaled estimates of network structure from imperfect count observations; the associated ADMM solver reaches stationary points under mild regularity conditions and outperforms baselines on both synthetic and real ecological datasets.

What carries the argument

Structured sparse nonnegative low-rank factorization with joint detection probability estimation and ℓ_{1/2} regularization on the latent similarity and connectivity matrices, solved by ADMM with adaptive penalization and scale-aware initialization.

If this is right

  • The method produces estimates of within-group similarity and cross-group connectivity that respect both sparsity and relative scale, avoiding oversparse or unbalanced outputs.
  • The ADMM algorithm with adaptive penalization guarantees that cluster points satisfy KKT stationarity under the stated regularity conditions.
  • Improved recovery of latent factors translates to more accurate induced similarity graphs and interaction networks in ecological applications.
  • The framework applies directly to other sparse count-based bipartite networks where detection is imperfect but assumed constant.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The constant-detection assumption could be relaxed to a per-observation or per-row model if additional covariates on detection are available, potentially broadening applicability to heterogeneous sampling.
  • The scale-aware initialization and ℓ_{1/2} penalties may transfer to other nonconvex factorization tasks in recommender systems or single-cell genomics where count data is sparse and incompletely observed.
  • If the true network deviates strongly from low-rank, the method's performance would degrade similarly to other matrix factorization approaches, suggesting a natural diagnostic by comparing reconstruction error to random baselines.

Load-bearing premise

The observed counts arise from a low-rank model with a single constant detection probability that applies uniformly across all observations.

What would settle it

A synthetic dataset generated from a true low-rank structure but with varying detection probabilities across samples, where the method's recovered factors show no improvement or worse structural recovery than baselines that ignore detection.

Figures

Figures reproduced from arXiv: 2604.18820 by Aoran Zhang, C\'esar A. Uribe, Maria J. Guerrero, Tianyao Wei.

Figure 1
Figure 1. Figure 1: shows that the proposed method achieves lower recovery error than both Poisson NMF and the N-mixture baselines over the full optimization trajectory. In particular, MSE(U), MSE(V ), and MSE(α) for the proposed method drop sharply in the first few iterations and stabilize at much smaller values, while the two baselines plateau at higher error levels. This indicates that combining imperfect-detection mod￾eli… view at source ↗
Figure 2
Figure 2. Figure 2: Comparison of Recovery Performance: True vs. Proposed vs. Poisson NMF vs. N-Mixture. [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Recovery performance across different M’s, p0’s, and ρ 0 ’s. scale of the surrogate matrix used for spectral initialization. With an initial guess of the average detection probability, it sets the starting point in a stable, reasonably scaled local region of the nonconvex objective, with the initial U and V at comparable scales. Since the problem is bilinear in U and V , a balanced initialization is more s… view at source ↗
Figure 4
Figure 4. Figure 4: Recovery performance across different λ’s. the algorithm operates in a stable regime with consistently small recovery error across variables with ρ 0 = 10−3 [PITH_FULL_IMAGE:figures/full_fig_p013_4.png] view at source ↗
read the original abstract

Recovering latent structure from count data has received considerable attention in network inference, particularly when one seeks both cross-group interactions and within-group similarity patterns in bipartite networks, which is widely used in ecology research. Such networks are often sparse and inherently imperfect in their detection. Existing models mainly focus on interaction recovery, while the induced similarity graphs are much less studied. Moreover, sparsity is often not controlled, and scale is unbalanced, leading to oversparse or poorly rescaled estimates with degrading structural recovery. To address these issues, we propose a framework for structured sparse nonnegative low-rank factorization with detection probability estimation. We impose nonconvex $\ell_{1/2}$ regularization on the latent similarity and connectivity structures to promote sparsity within-group similarity and cross-group connectivity with better relative scale. The resulting optimization problem is nonconvex and nonsmooth. To solve it, we develop an ADMM-based algorithm with adaptive penalization and scale-aware initialization and establish its asymptotic feasibility and KKT stationarity of cluster points under mild regularity conditions. Experiments on synthetic and real-world ecological datasets demonstrate improved recovery of latent factors and similarity/connectivity structure relative to existing baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a framework for structured sparse nonnegative low-rank factorization with simultaneous estimation of a constant detection probability, for recovering latent similarity and connectivity structures in bipartite networks from imperfect count data. It applies nonconvex ℓ_{1/2} regularization to promote sparsity with better relative scale, develops an ADMM algorithm with adaptive penalization and scale-aware initialization, establishes asymptotic feasibility and KKT stationarity of cluster points under mild regularity conditions, and reports improved empirical recovery of latent factors and structures on synthetic and real-world ecological datasets relative to baselines.

Significance. If the constant-p assumption is appropriate and the nonconvex solver reliably reaches useful stationary points, the approach could improve inference of sparse ecological networks by addressing scale imbalance and imperfect detection in a unified optimization framework, with potential value for applications where both within-group similarity and cross-group interactions matter.

major comments (3)
  1. [Model formulation] Model section: the assumption of a single scalar constant detection probability p across all entries is load-bearing for the recovery guarantees and claims. Real ecological count data typically exhibit heterogeneous detectability arising from species traits, sampling effort, or site effects; the paper should include a robustness analysis or discussion of degradation under mismatch.
  2. [Experiments] Synthetic experiments: data generation follows the exact constant-p low-rank model, so the experiments cannot reveal performance degradation under realistic heterogeneous detection. The reported improvements also lack quantitative details on baseline methods, specific metrics (e.g., relative error, F1 on recovered edges), hyperparameter sensitivity (including the ℓ_{1/2} strengths), and data exclusion rules.
  3. [Algorithm and theory] ADMM algorithm and convergence analysis: while cluster points are shown to satisfy KKT stationarity under mild conditions, nonconvex ℓ_{1/2} problems routinely admit spurious stationary points whose recovered factors can be arbitrarily poor. No landscape analysis or multiple-random-start statistics are supplied to bound the probability of reaching useful solutions.
minor comments (2)
  1. [Preliminaries] Notation for the induced similarity and connectivity graphs from the low-rank factors could be clarified with an explicit equation or diagram.
  2. [Introduction] Add references to prior work on heterogeneous detection models in ecology and on practical behavior of nonconvex ADMM for sparse matrix factorization.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating the revisions we will make where appropriate.

read point-by-point responses
  1. Referee: [Model formulation] Model section: the assumption of a single scalar constant detection probability p across all entries is load-bearing for the recovery guarantees and claims. Real ecological count data typically exhibit heterogeneous detectability arising from species traits, sampling effort, or site effects; the paper should include a robustness analysis or discussion of degradation under mismatch.

    Authors: We agree that the constant detection probability is a simplifying assumption central to the model and its analysis. In the revised manuscript, we will add an explicit discussion of this limitation in the model formulation section, including potential impacts on real ecological data. We will also include a new robustness experiment in the synthetic section that introduces heterogeneous detection (e.g., species- or entry-dependent probabilities) and reports the resulting degradation in recovery metrics. revision: yes

  2. Referee: [Experiments] Synthetic experiments: data generation follows the exact constant-p low-rank model, so the experiments cannot reveal performance degradation under realistic heterogeneous detection. The reported improvements also lack quantitative details on baseline methods, specific metrics (e.g., relative error, F1 on recovered edges), hyperparameter sensitivity (including the ℓ_{1/2} strengths), and data exclusion rules.

    Authors: The synthetic data generation matches the model to isolate algorithmic behavior, which is standard practice. We will add mismatched heterogeneous detection experiments to the revision. We will also expand the experimental reporting with detailed tables of relative errors and F1 scores for structure recovery, sensitivity analysis plots for the ℓ_{1/2} parameter, and a clear description of baseline implementations, hyperparameter selection, and real-data preprocessing/exclusion rules. revision: yes

  3. Referee: [Algorithm and theory] ADMM algorithm and convergence analysis: while cluster points are shown to satisfy KKT stationarity under mild conditions, nonconvex ℓ_{1/2} problems routinely admit spurious stationary points whose recovered factors can be arbitrarily poor. No landscape analysis or multiple-random-start statistics are supplied to bound the probability of reaching useful solutions.

    Authors: The provided analysis establishes KKT stationarity for cluster points under the stated conditions. We recognize that nonconvexity admits the possibility of suboptimal stationary points. The scale-aware initialization is intended to improve practical performance. In the revision we will add multiple-random-start experiments with statistics on metric variability across initializations to empirically support reliability. A complete landscape analysis lies outside the current scope. revision: partial

Circularity Check

0 steps flagged

No circularity: derivation and claims are independent of target outputs

full rationale

The paper introduces an explicit modeling assumption (constant detection probability p) and a new nonconvex optimization problem with ℓ_{1/2} regularization solved via ADMM; it then proves KKT stationarity of cluster points under stated regularity conditions. Recovery performance is assessed on both synthetic data generated from the assumed model and real ecological datasets, with comparisons to external baselines. No equation reduces a claimed prediction or recovered structure to a fitted parameter by construction, no load-bearing uniqueness result is imported from the authors' prior work, and no ansatz is smuggled via self-citation. The framework is defined independently of the specific latent factors it recovers, satisfying the criteria for a self-contained derivation.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The framework rests on standard assumptions for nonconvex ADMM convergence and on the statistical model that observed counts arise from latent low-rank factors modulated by a detection probability; no new physical entities are introduced.

free parameters (2)
  • l1/2 regularization strengths
    Two separate penalty parameters control sparsity in similarity and connectivity; their values must be chosen or tuned.
  • detection probability
    Estimated jointly but treated as an unknown parameter in the model.
axioms (2)
  • domain assumption Mild regularity conditions suffice for KKT stationarity of cluster points
    Invoked to guarantee properties of the ADMM iterates.
  • domain assumption Count data generated by low-rank factors plus constant detection probability
    Core modeling assumption for the factorization.

pith-pipeline@v0.9.0 · 5519 in / 1386 out tokens · 62209 ms · 2026-05-10T03:11:24.614189+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages

  1. [1]

    Learning Laplacian matrix in smooth graph signal representations,

    X. Dong, D. Thanou, P. Frossard, and P. Vandergheynst, “Learning Laplacian matrix in smooth graph signal representations,” in2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016, pp. 6160–6164

  2. [2]

    Connecting the dots: Identifying network structure via graph signal processing,

    G. Mateos, S. Segarra, A. G. Marques, and A. Ribeiro, “Connecting the dots: Identifying network structure via graph signal processing,”IEEE Signal Processing Magazine, vol. 36, no. 3, pp. 16–43, 2019

  3. [3]

    Imputing missing data in hourly traffic counts,

    M. A. Shafique, “Imputing missing data in hourly traffic counts,” Sensors, vol. 22, no. 24, p. 9876, 2022

  4. [4]

    Double-observer approach with camera traps can correct imperfect detection and improve the accuracy of density estimation of unmarked animal populations,

    Y . Nakashima, S. Hongo, K. Mizuno, G. Yajima, and Z. C. Dzefck, “Double-observer approach with camera traps can correct imperfect detection and improve the accuracy of density estimation of unmarked animal populations,”Scientific Reports, vol. 12, no. 1, p. 2011, 2022

  5. [5]

    Ecoacoustics for context-rich direct and indirect trophic in- teraction data and ecological network construction,

    W. Dawson, D. M. Evans, C. Abrahams, J. J. N. Kitson, L. Collins, and J. P. Cuff, “Ecoacoustics for context-rich direct and indirect trophic in- teraction data and ecological network construction,”Methods in Ecology and Evolution, vol. 00, pp. 1–17, 2026

  6. [6]

    Bugs and bytes: Entomological biomonitoring through the integration of deep learning and molecular analysis for merged community and network analysis,

    M. D. Suresh, T. Xin, S. M. Cook, and D. M. Evans, “Bugs and bytes: Entomological biomonitoring through the integration of deep learning and molecular analysis for merged community and network analysis,” Agricultural and Forest Entomology, vol. 27, no. 1, pp. 35–49, 2025

  7. [7]

    Graphical representation of landscape heterogeneity identifi- cation through unsupervised acoustic analysis,

    M. J. Guerrero, C. S ´anchez-Giraldo, C. Uribe, V . Martinez Arias, and C. Isaza, “Graphical representation of landscape heterogeneity identifi- cation through unsupervised acoustic analysis,”Methods in Ecology and Evolution, vol. 16, pp. 1255–1272, 05 2025

  8. [8]

    Structure and dynamics of mixed-species choruses in a tropical anuran assemblage: insights from network analysis,

    M. H. L. Duarte, D. Llusia, S. S. Rodrigues, and L. B. Nascimento, “Structure and dynamics of mixed-species choruses in a tropical anuran assemblage: insights from network analysis,”Ethology, vol. 127, pp. 643–650, 2021

  9. [9]

    Sampling networks of ecological interactions,

    P. Jordano, “Sampling networks of ecological interactions,”Functional ecology, vol. 30, no. 12, pp. 1883–1893, 2016

  10. [10]

    Link prediction under imperfect detection: Collaborative filtering for ecological networks,

    X. Fu, E. Seo, J. Clarke, and R. A. Hutchinson, “Link prediction under imperfect detection: Collaborative filtering for ecological networks,” IEEE Transactions on Knowledge and Data Engineering, vol. 33, no. 8, pp. 3117–3128, 2019

  11. [11]

    Invariant properties in coevolutionary networks of plant–animal interactions,

    P. Jordano, J. Bascompte, and J. M. Olesen, “Invariant properties in coevolutionary networks of plant–animal interactions,”Ecology letters, vol. 6, no. 1, pp. 69–81, 2003

  12. [12]

    Networks in ecology,

    J. Bascompte, “Networks in ecology,”Basic and applied ecology, vol. 8, no. 6, pp. 485–490, 2007

  13. [13]

    Predicting cryptic links in host-parasite networks,

    T. Dallas, A. W. Park, and J. M. Drake, “Predicting cryptic links in host-parasite networks,”PLoS computational biology, vol. 13, no. 5, p. e1005557, 2017

  14. [14]

    N-mixture models for estimating population size from spatially replicated counts,

    J. A. Royle, “N-mixture models for estimating population size from spatially replicated counts,”Biometrics, vol. 60, no. 1, pp. 108–115, 2004

  15. [15]

    Modeling abundance using n-mixture models: the importance of considering ecological mechanisms,

    L. N. Joseph, C. Elkin, T. G. Martin, and H. P. Possingham, “Modeling abundance using n-mixture models: the importance of considering ecological mechanisms,”Ecological Applications, vol. 19, no. 3, pp. 631–642, 2009

  16. [16]

    Estimating site occupancy rates when detection probabilities are less than one,

    D. I. MacKenzie, J. D. Nichols, G. B. Lachman, S. Droege, J. An- drew Royle, and C. A. Langtimm, “Estimating site occupancy rates when detection probabilities are less than one,”Ecology, vol. 83, no. 8, pp. 2248–2255, 2002

  17. [17]

    Accounting for imperfect detection in ecology: a quantitative review,

    K. F. Kellner and R. K. Swihart, “Accounting for imperfect detection in ecology: a quantitative review,”PloS one, vol. 9, no. 10, p. e111436, 2014

  18. [18]

    The link prediction problem for social networks,

    D. Liben-Nowell and J. Kleinberg, “The link prediction problem for social networks,” inProceedings of the twelfth international conference on Information and knowledge management, 2003, pp. 556–559

  19. [19]

    Matrix factorization techniques for recommender systems,

    Y . Koren, R. Bell, and C. V olinsky, “Matrix factorization techniques for recommender systems,”Computer, vol. 42, no. 8, pp. 30–37, 2009

  20. [20]

    Exact matrix completion via convex opti- mization,

    E. Candes and B. Recht, “Exact matrix completion via convex opti- mization,”Communications of the ACM, vol. 55, no. 6, pp. 111–119, 2012

  21. [21]

    Finding missing links in interaction networks,

    J. C. D. Terry and O. T. Lewis, “Finding missing links in interaction networks,”Ecology, vol. 101, no. 7, p. e03047, 2020

  22. [22]

    On the reliability of n-mixture models for count data,

    R. J. Barker, M. R. Schofield, W. A. Link, and J. R. Sauer, “On the reliability of n-mixture models for count data,”Biometrics, vol. 74, no. 1, pp. 369–377, 2018

  23. [23]

    Algorithms for non-negative matrix factor- ization,

    D. Lee and H. S. Seung, “Algorithms for non-negative matrix factor- ization,”Advances in neural information processing systems, vol. 13, 2000

  24. [24]

    Structured low-rank matrix fac- torization: Optimality, algorithm, and applications to image processing,

    B. Haeffele, E. Young, and R. Vidal, “Structured low-rank matrix fac- torization: Optimality, algorithm, and applications to image processing,” inInternational conference on machine learning. PMLR, 2014, pp. 2007–2015

  25. [25]

    Low-rank matrix completion using alternating minimization,

    P. Jain, P. Netrapalli, and S. Sanghavi, “Low-rank matrix completion using alternating minimization,” inProceedings of the forty-fifth annual ACM symposium on Theory of computing, 2013, pp. 665–674

  26. [26]

    A flexible and efficient algorithmic framework for constrained matrix and tensor factorization,

    K. Huang, N. D. Sidiropoulos, and A. P. Liavas, “A flexible and efficient algorithmic framework for constrained matrix and tensor factorization,” IEEE Transactions on Signal Processing, vol. 64, no. 19, pp. 5052–5065, 2016

  27. [27]

    Non-square matrix sensing without spurious local minima via the burer-monteiro approach,

    D. Park, A. Kyrillidis, C. Carmanis, and S. Sanghavi, “Non-square matrix sensing without spurious local minima via the burer-monteiro approach,” inArtificial Intelligence and Statistics. PMLR, 2017, pp. 65–74

  28. [28]

    Finding low-rank solutions via nonconvex matrix factorization, efficiently and provably,

    D. Park, A. Kyrillidis, C. Caramanis, and S. Sanghavi, “Finding low-rank solutions via nonconvex matrix factorization, efficiently and provably,” SIAM Journal on Imaging Sciences, vol. 11, no. 4, pp. 2165–2204, 2018

  29. [29]

    Robust principal component analysis?

    E. J. Cand `es, X. Li, Y . Ma, and J. Wright, “Robust principal component analysis?”Journal of the ACM (JACM), vol. 58, no. 3, pp. 1–37, 2011

  30. [30]

    Sparse plus low rank matrix decomposition: A discrete optimization approach,

    D. Bertsimas, R. Cory-Wright, and N. A. Johnson, “Sparse plus low rank matrix decomposition: A discrete optimization approach,”Journal of Machine Learning Research, vol. 24, no. 267, pp. 1–51, 2023

  31. [31]

    A unified framework for nonconvex low-rank plus sparse matrix recovery,

    X. Zhang, L. Wang, and Q. Gu, “A unified framework for nonconvex low-rank plus sparse matrix recovery,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2018, pp. 1097–1107

  32. [32]

    Low-rank and sparse structure pursuit via alternating minimization,

    Q. Gu, Z. W. Wang, and H. Liu, “Low-rank and sparse structure pursuit via alternating minimization,” inArtificial Intelligence and Statistics. PMLR, 2016, pp. 600–609

  33. [33]

    Recovering low-rank and sparse matrices via robust bilateral factorization,

    F. Shang, Y . Liu, J. Cheng, and H. Cheng, “Recovering low-rank and sparse matrices via robust bilateral factorization,” in2014 IEEE International Conference on Data Mining. IEEE, 2014, pp. 965–970

  34. [34]

    Fast algorithms for sparse reduced-rank regression,

    B. Dubois, J.-F. Delmas, and G. Obozinski, “Fast algorithms for sparse reduced-rank regression,” inThe 22nd international conference on artificial intelligence and statistics. PMLR, 2019, pp. 2415–2424

  35. [35]

    Structured low-rank matrix factorization: Global optimality, algorithms, and applications,

    B. D. Haeffele and R. Vidal, “Structured low-rank matrix factorization: Global optimality, algorithms, and applications,”IEEE transactions on pattern analysis and machine intelligence, vol. 42, no. 6, pp. 1468–1482, 2019

  36. [36]

    Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems,

    M. Hong, Z.-Q. Luo, and M. Razaviyayn, “Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems,”SIAM Journal on Optimization, vol. 26, no. 1, pp. 337–364, 2016

  37. [37]

    Global convergence of admm in nonconvex nonsmooth optimization,

    Y . Wang, W. Yin, and J. Zeng, “Global convergence of admm in nonconvex nonsmooth optimization,”Journal of Scientific Computing, vol. 78, no. 1, pp. 29–63, 2019

  38. [38]

    Penalty dual decomposition method for non- smooth nonconvex optimization—part i: Algorithms and convergence analysis,

    Q. Shi and M. Hong, “Penalty dual decomposition method for non- smooth nonconvex optimization—part i: Algorithms and convergence analysis,”IEEE Transactions on Signal Processing, vol. 68, pp. 4108– 4122, 2020

  39. [39]

    Convergence rates for an inexact linearized admm for nonsmooth nonconvex optimization with nonlinear equality constraints,

    L. El Bourkhissi and I. Necoara, “Convergence rates for an inexact linearized admm for nonsmooth nonconvex optimization with nonlinear equality constraints,”Computational Optimization and Applications, pp. 1–39, 2025

  40. [40]

    Admm for nonconvex optimization under minimal continuity assumption,

    G. Yuan, “Admm for nonconvex optimization under minimal continuity assumption,” inThe Thirteenth International Conference on Learning Representations, 2025

  41. [41]

    Distributed optimization and statistical learning via the alternating direction method of multipliers,

    S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Ecksteinet al., “Distributed optimization and statistical learning via the alternating direction method of multipliers,”Foundations and Trends® in Machine learning, vol. 3, no. 1, pp. 1–122, 2011

  42. [42]

    L1/2 regularization: A thresholding representation theory and a fast solver,

    Z. Xu, X. Chang, F. Xu, and H. Zhang, “L1/2 regularization: A thresholding representation theory and a fast solver,”IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 7, pp. 1013– 1027, 2012

  43. [43]

    Wright and Y

    J. Wright and Y . Ma,High-dimensional data analysis with low- dimensional models: Principles, computation, and applications. Cam- bridge University Press, 2022. 12

  44. [44]

    An adaptive lagrangian-based scheme for nonconvex composite optimization,

    N. Hallak and M. Teboulle, “An adaptive lagrangian-based scheme for nonconvex composite optimization,”Mathematics of Operations Research, vol. 48, no. 4, pp. 2337–2352, 2023

  45. [45]

    Local convergence of exact and inexact augmented lagrangian methods under the second-order sufficient optimality condition,

    D. Fern ´andez and M. V . Solodov, “Local convergence of exact and inexact augmented lagrangian methods under the second-order sufficient optimality condition,”SIAM Journal on Optimization, vol. 22, no. 2, pp. 384–407, 2012

  46. [46]

    Inexact block coordinate descent algorithms for nonsmooth nonconvex optimization,

    Y . Yang, M. Pesavento, Z.-Q. Luo, and B. Ottersten, “Inexact block coordinate descent algorithms for nonsmooth nonconvex optimization,” IEEE Transactions on Signal Processing, vol. 68, pp. 947–961, 2019

  47. [47]

    Soundscape connectomes: Unsupervised graph-based approach for soundscape mapping,

    M. J. Guerrero, A. Einizade, J. H. Giraldo, V . M. Mart ´ınez-Arias, C. Isaza, and C. A. Uribe, “Soundscape connectomes: Unsupervised graph-based approach for soundscape mapping,” inNeurIPS 2025 Work- shop on AI for Non-Human Animal Communication, 2025, openReview

  48. [48]

    Plant pollinator data at hj andrews experimental forest, 2011 to present,

    V . W. Pfeiffer, “Plant pollinator data at hj andrews experimental forest, 2011 to present,” 2017, Environmental Data Initiative

  49. [49]

    Acoustic animal identification using unsupervised learning

    M. J. Guerrero, C. L. Bedoya, J. D. L ´opez, J. M. Daza, and C. Isaza, “Acoustic animal identification using unsupervised learning.”Methods in Ecology and Evolution, vol. 14, p. 1500–1514, 2023

  50. [50]

    Initialization for non-negative matrix factorization: a comprehensive review,

    S. Fathi Hafshejani and Z. Moaberfard, “Initialization for non-negative matrix factorization: a comprehensive review,”International Journal of Data Science and Analytics, vol. 16, no. 1, pp. 119–134, 2023

  51. [51]

    Dropping convexity for faster semi-definite optimization,

    S. Bhojanapalli, A. Kyrillidis, and S. Sanghavi, “Dropping convexity for faster semi-definite optimization,” inConference on Learning Theory. PMLR, 2016, pp. 530–582. APPENDIX A. Implementation Details In all experiments, the proposed algorithm was run for at most 80 or 100 outer iterations. Convergence of the outer loop was declared when both the change ...