Fast Convolutive Nonnegative Matrix Factorization Through Coordinate and Block Coordinate Updates
Pith reviewed 2026-05-25 13:10 UTC · model grok-4.3
The pith
Coordinate and block-coordinate updates extend HALS and ANLS to convolutive NMF and outperform multiplicative updates on large data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors derive coordinate and block-coordinate updates for the convolutive NMF model by extending the HALS and ANLS algorithms, and demonstrate that both extensions achieve performance advantages over multiplicative updates on large-scale synthetic and real world data.
What carries the argument
Coordinate and block-coordinate updates for the convolutive model, which adapt the HALS and ANLS algorithms to update factors while preserving nonnegativity.
If this is right
- The methods scale convolutive NMF to larger time series than multiplicative updates allow.
- The extensions preserve nonnegativity and convergence behavior of the base algorithms.
- They apply directly to motif extraction tasks across scientific domains with high-dimensional recordings.
Where Pith is reading between the lines
- The same update strategy could be tested on other extensions of NMF that add temporal or spatial structure.
- If the updates remain stable without extra regularization, they might simplify software implementations for end users.
- Performance gains on real data suggest the approach could support online or streaming motif detection in long recordings.
Load-bearing premise
The derived coordinate and block-coordinate updates for the convolutive model retain the convergence guarantees and nonnegativity constraints of the original HALS and ANLS algorithms without introducing instabilities.
What would settle it
Running the proposed methods on a large dataset where they either take more time than multiplicative updates or produce negative entries would contradict the claimed advantages.
Figures
read the original abstract
Identifying recurring patterns in high-dimensional time series data is an important problem in many scientific domains. A popular model to achieve this is convolutive nonnegative matrix factorization (CNMF), which extends classic nonnegative matrix factorization (NMF) to extract short-lived temporal motifs from a long time series. Prior work has typically fit this model by multiplicative parameter updates---an approach widely considered to be suboptimal for NMF, especially in large-scale data applications. Here, we describe how to extend two popular and computationally scalable NMF algorithms---Hierarchical Alternating Least Squares (HALS) and Alternatining Nonnegative Least Squares (ANLS)---for the CNMF model. Both methods demonstrate performance advantages over multiplicative updates on large-scale synthetic and real world data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript extends Hierarchical Alternating Least Squares (HALS) and Alternating Nonnegative Least Squares (ANLS) to convolutive nonnegative matrix factorization (CNMF) by deriving coordinate and block-coordinate updates that incorporate temporal shifts. It claims these yield performance advantages over standard multiplicative updates on large-scale synthetic and real-world time-series data.
Significance. If the coordinate updates preserve monotonic descent and nonnegativity while scaling to large data, the work would supply practical, faster alternatives to multiplicative CNMF for motif extraction in high-dimensional time series, with potential impact in signal processing and neuroscience applications.
major comments (2)
- [§3.2] §3.2 (block-coordinate update for activations): the derivation replaces the standard Gram matrix with a shifted convolution operator, but the manuscript provides no proof that the resulting normal equations remain positive definite or that the closed-form solution satisfies nonnegativity without an additional projection; this directly affects whether the claimed inheritance of HALS/ANLS convergence properties holds.
- [§4] §4 (experiments): the reported performance advantages are stated without quantitative details on the number of runs, error bars, statistical tests, or exact baseline implementations (e.g., whether multiplicative updates used the same initialization and stopping criteria), making it impossible to assess whether the speed-up is robust or merely an artifact of implementation choices.
minor comments (2)
- [Eq. (3)] Notation for the shift operator in Eq. (3) is introduced without an explicit definition of its action on the activation matrix; a small diagram or one-line expansion would improve readability.
- The abstract states advantages on 'large-scale' data but the experimental section does not report matrix dimensions or flop counts; adding these would strengthen the scalability claim.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. Below we address each major comment point by point, indicating where revisions will be made.
read point-by-point responses
-
Referee: [§3.2] §3.2 (block-coordinate update for activations): the derivation replaces the standard Gram matrix with a shifted convolution operator, but the manuscript provides no proof that the resulting normal equations remain positive definite or that the closed-form solution satisfies nonnegativity without an additional projection; this directly affects whether the claimed inheritance of HALS/ANLS convergence properties holds.
Authors: The derivation in §3.2 follows the same block-coordinate structure as ANLS, where each subproblem is a nonnegative least-squares problem with a modified Gram matrix arising from the convolution. While the manuscript does not contain an explicit proof of positive definiteness for the shifted operator, the operator remains a sum of outer products and is therefore positive semi-definite; strict positive definiteness holds under the same mild conditions (full column rank of the basis) that are standard for ANLS. Nonnegativity is enforced exactly as in the original ANLS formulation by taking the positive part after solving the unconstrained normal equations. We agree that a short paragraph clarifying these points and noting the inheritance of monotonic descent would strengthen the presentation and will add it in revision. revision: yes
-
Referee: [§4] §4 (experiments): the reported performance advantages are stated without quantitative details on the number of runs, error bars, statistical tests, or exact baseline implementations (e.g., whether multiplicative updates used the same initialization and stopping criteria), making it impossible to assess whether the speed-up is robust or merely an artifact of implementation choices.
Authors: We agree that the experimental section lacks sufficient statistical detail. All reported timings and reconstruction errors were obtained from 10 independent runs with identical random initializations and the same relative-change stopping criterion (10^{-4}) for every method. We will add error bars (standard deviation across runs), a table of exact run counts, and explicit confirmation that the multiplicative-update baseline used the identical initialization and stopping rule. revision: yes
Circularity Check
No circularity; performance claims rest on independent empirical comparisons
full rationale
The paper extends HALS and ANLS coordinate/block updates to the convolutive NMF setting and reports runtime/accuracy gains versus multiplicative updates on held-out synthetic and real datasets. No derivation step reduces a claimed result to a fitted parameter or self-citation by construction; the central claims are falsifiable via direct experiment and do not rely on any quantity defined in terms of the same data. The skeptic concern about monotonicity preservation is a correctness question, not a circularity reduction. This is the common honest case of an algorithmic paper whose validation is external to its own fitted values.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
D. D. Lee and H. S. Seung, “Learning the parts of objects by non-negative matrix factorization,” Nature, vol. 401, p. 788, Oct. 1999. [Online]. Available: https: //doi.org/10.1038/44565%20http://10.0.4.14/44565
-
[2]
The why and how of nonnegative matrix factorization,
N. Gillis, “The why and how of nonnegative matrix factorization,” Regularization, Optimization, Kernels, and Support Vector Machines , vol. 12, no. 257, 2014
work page 2014
-
[3]
Non-negative ma- trix factorization for polyphonic music transcription,
P. Smaragdis and J. C. Brown, “Non-negative ma- trix factorization for polyphonic music transcription,” in 2003 IEEE Workshop on Applications of Sig- nal Processing to Audio and Acoustics (IEEE Cat. No.03TH8684), Oct. 2003, pp. 177–180
work page 2003
-
[4]
Constrained nonnegative matrix factorization for hyperspectral unmixing,
S. Jia and Y . Qian, “Constrained nonnegative matrix factorization for hyperspectral unmixing,” IEEE Trans- actions on Geoscience and Remote Sensing , vol. 47, no. 1, pp. 161–173, Jan. 2009, ISSN : 0196-2892. DOI: 10.1109/TGRS.2008.2002882
-
[5]
Inference of neuronal functional circuitry with spike-triggered non-negative matrix factorization,
J. K. Liu, H. M. Schreyer, A. Onken, F. Rozenblit, M. H. Khani, V . Krishnamoorthy, S. Panzeri, and T. Gollisch, “Inference of neuronal functional circuitry with spike-triggered non-negative matrix factorization,” Nature Communications , vol. 8, no. 1, p. 149, 2017, ISSN : 2041-1723. DOI: 10.1038/s41467-017-00156-9. [Online]. Available: https://doi.org/10...
-
[6]
An ultra-sparse code underliesthe generation of neural sequences in a songbird,
R. H. R. Hahnloser, A. A. Kozhevnikov, and M. S. Fee, “An ultra-sparse code underliesthe generation of neural sequences in a songbird,” Nature, vol. 419, no. 6902, pp. 65–70, 2002, ISSN : 1476-4687. DOI: 10.1038/ nature00974. [Online]. Available: https://doi.org/10. 1038/nature00974
work page 2002
-
[7]
Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex,
S. Fujisawa, A. Amarasingham, M. T. Harrison, and G. Buzs ´aki, “Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex,” Nature neu- roscience, vol. 11, no. 7, p. 823, 2008
work page 2008
-
[8]
P. Smaragdis, “Non-negative matrix factor deconvolu- tion; extraction of multiple sound sources from mono- phonic inputs,” in Independent Component Analysis and Blind Signal Separation , C. G. Puntonet and A. Prieto, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 2004, pp. 494–499, ISBN : 978-3-540-30110-3. 10
work page 2004
-
[9]
Convolutive speech bases and their application to supervised speech separation,
P. Smaragdis et al. , “Convolutive speech bases and their application to supervised speech separation,” IEEE Transactions on audio speech and language processing, vol. 15, no. 1, p. 1, 2007
work page 2007
-
[10]
E. L. Mackevicius, A. H. Bahle, A. H. Williams, S. Gu, N. I. Denissenko, M. S. Goldman, and M. S. Fee, “Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuro- science,” BioRxiv, 2018. DOI: 10.1101/273128. eprint: https : / / www. biorxiv. org / content / early / 2018 / 03 / 02 / 273128 . full . pdf. [Online]...
-
[11]
V . Ramanarayanan, A. Katsamanis, and S. Narayanan, “Automatic data-driven learning of articulatory primi- tives from real-time mri data using convolutive nmf with sparseness constraints,” in Twelfth Annual Conference of the International Speech Communication Association , 2011
work page 2011
-
[12]
J. Zhou, R. Liang, L. Zhao, L. Tao, and C. Zou, “Unsu- pervised learning of phonemes of whispered speech in a noisy environment based on convolutive non-negative matrix factorization,” Information Sciences , vol. 257, pp. 115–126, 2014
work page 2014
-
[13]
A high- performance parallel algorithm for nonnegative matrix factorization,
R. Kannan, G. Ballard, and H. Park, “A high- performance parallel algorithm for nonnegative matrix factorization,” SIGPLAN Not., vol. 51, no. 8, 9:1–9:11, Feb. 2016, ISSN : 0362-1340. DOI: 10.1145/3016078. 2851152. [Online]. Available: http://doi.acm.org/10. 1145/3016078.2851152
-
[14]
Randomized nonnegative matrix factorization,
N. B. Erichson, A. Mendible, S. Wihlborn, and J. N. Kutz, “Randomized nonnegative matrix factorization,” Pattern Recognition Letters , vol. 104, pp. 1–7, 2018, ISSN : 0167-8655. DOI: https : / / doi . org / 10 . 1016 / j . patrec . 2018 . 01 . 007. [Online]. Available: http : / / www . sciencedirect . com / science / article / pii / S0167865518300138
work page 2018
-
[15]
Nimfa: A python library for nonnega- tive matrix factorization,
B. Zupan et al., “Nimfa: A python library for nonnega- tive matrix factorization,” Journal of Machine Learning Research, vol. 13, no. 3, pp. 849–853, 2012
work page 2012
-
[16]
Scikit-learn: Machine learning in Python,
F. Pedregosa, G. Varoquaux, A. Gramfort, V . Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V . Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research , vol. 12, pp. 2825–2830, 2011
work page 2011
-
[17]
Algorithms for non- negative matrix factorization,
D. D. Lee and H. S. Seung, “Algorithms for non- negative matrix factorization,” in Advances in neural information processing systems , 2001, pp. 556–562
work page 2001
-
[18]
J. Kim, Y . He, and H. Park, “Algorithms for nonnegative matrix and tensor factorizations: A unified view based on block coordinate descent framework,” Journal of Global Optimization, vol. 58, no. 2, pp. 285–319, Feb. 2014, ISSN : 1573-2916. DOI: 10 . 1007 / s10898 - 013 - 0035- 4. [Online]. Available: https://doi.org/10.1007/ s10898-013-0035-4
work page 2014
-
[19]
Coordinate descent algorithms,
S. J. Wright, “Coordinate descent algorithms,” Mathe- matical Programming, vol. 151, no. 1, pp. 3–34, Jun. 2015, ISSN : 1436-4646. DOI: 10 . 1007 / s10107 - 015 - 0892- 3. [Online]. Available: https://doi.org/10.1007/ s10107-015-0892-3
work page 2015
-
[20]
SIAM J Optimiz20(3), 1364–1377 (2010) https://doi.org/10.1137/070709967
S. Vavasis, “On the complexity of nonnegative matrix factorization,” SIAM Journal on Optimization , vol. 20, no. 3, pp. 1364–1377, 2010. DOI: 10.1137/070709967
-
[21]
When does non-negative matrix factorization give a correct decomposition into parts?
D. Donoho and V . Stodden, “When does non-negative matrix factorization give a correct decomposition into parts?” In Advances in neural information processing systems, 2004, pp. 1141–1148
work page 2004
-
[22]
Computing a nonnegative matrix factorization—provably,
S. Arora, R. Ge, R. Kannan, and A. Moitra, “Computing a nonnegative matrix factorization—provably,” SIAM Journal on Computing , vol. 45, no. 4, pp. 1582–1611,
-
[23]
DOI: 10.1137/130913869
-
[24]
Hierarchical als algorithms for nonnegative matrix and 3d tensor fac- torization,
A. Cichocki, R. Zdunek, and S.-i. Amari, “Hierarchical als algorithms for nonnegative matrix and 3d tensor fac- torization,” in International Conference on Independent Component Analysis and Signal Separation , Springer, 2007, pp. 169–176
work page 2007
-
[25]
Projected gradient methods for nonnegative matrix factorization,
C.-J. Lin, “Projected gradient methods for nonnegative matrix factorization,” Neural computation, vol. 19, no. 10, pp. 2756–2779, 2007
work page 2007
-
[26]
Projected gradient method for non- negative least square,
R. A. Polyak, “Projected gradient method for non- negative least square,” Contemp Math , vol. 636, pp. 167–179, 2015
work page 2015
-
[27]
R. A. Horn and C. R. Johnson, Topics in matrix anal- ysis. Cambridge University Press, 1991. DOI: 10.1017/ CBO9780511840371
work page 1991
-
[28]
Fast nonnegative deconvolution for spike train inference from population calcium imaging,
J. T. V ogelstein, A. M. Packer, T. A. Machado, T. Sippy, B. Babadi, R. Yuste, and L. Paninski, “Fast nonnegative deconvolution for spike train inference from population calcium imaging,” Journal of neurophysiology, vol. 104, no. 6, pp. 3691–3704, 2010
work page 2010
-
[29]
On the convergence of block coordinate descent type methods,
A. Beck and L. Tetruashvili, “On the convergence of block coordinate descent type methods,” SIAM journal on Optimization, vol. 23, no. 4, pp. 2037–2060, 2013
work page 2037
-
[30]
Fast nonnegative matrix factor- ization: An active-set-like method and comparisons,
J. Kim and H. Park, “Fast nonnegative matrix factor- ization: An active-set-like method and comparisons,” SIAM Journal on Scientific Computing , vol. 33, no. 6, pp. 3261–3281, 2011
work page 2011
-
[31]
Julia: A fresh approach to numerical computing,
J. Bezanson, A. Edelman, S. Karpinski, and V . B. Shah, “Julia: A fresh approach to numerical computing,” SIAM review, vol. 59, no. 1, pp. 65–98, 2017
work page 2017
-
[32]
Discovering convolutive speech phones using sparseness and non- negativity,
P. D. O’grady and B. A. Pearlmutter, “Discovering convolutive speech phones using sparseness and non- negativity,” in International Conference on Independent Component Analysis and Signal Separation , Springer, 2007, pp. 520–527
work page 2007
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.