Convolutional Dictionary Learning in Hierarchical Networks

Bahareh Tolooshams; Demba Ba; Javier Zazo

arxiv: 1907.09881 · v1 · pith:5ODR6CTEnew · submitted 2019-07-23 · 💻 cs.LG · stat.ML

Javier Zazo, Bahareh Tolooshams, Demba Ba

Pith reviewed 2026-05-24 17:30 UTC · model grok-4.3

keywords convolutional dictionary learning · hierarchical generative model · piecewise smooth signals · sparse coding · deep networks · wavelet domain · alternating minimization · MNIST classification

The pith

A recursive generative model builds piecewise smooth signals by generating low-pass scale coefficients from the next layer plus sparse high-pass innovations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a hierarchical model for signals like natural images in which low-pass coefficients at each layer arise from filtering the coefficients of the subsequent layer and adding a high-pass detail term produced by filtering a sparse vector. This recursion defines a linear dynamic system that acts as a non-Gaussian Markov process across scales. The model extends multilayer convolutional sparse coding by permitting deeper networks and by mixing sparse detail with non-sparse scale representations. An alternating-minimization procedure learns the filters from data observed at layer zero (for example, natural images); the coefficient-estimation half of the procedure unfolds into a deep neural network. The resulting coefficients are shown to support classification on MNIST.
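The alternation described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' algorithm: a single layer, 1-D signals, circular convolution, a proximal-gradient coefficient step, a plain (non-stochastic) gradient step on the filters, and hypothetical weights `lam` and `mu`. No filter normalization is applied.

```python
import numpy as np

def cconv(h, x):
    """Circular convolution of a (zero-padded) filter h with signal x."""
    return np.real(np.fft.ifft(np.fft.fft(h, len(x)) * np.fft.fft(x)))

def ccorr(h, r):
    """Adjoint of cconv(h, .): circular correlation of h with r."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(h, len(r))) * np.fft.fft(r)))

def soft(z, t):
    """Soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def objective(x0, x1, u1, A, B, lam, mu):
    """Data fit + l1 penalty on the detail code + l2 penalty on the scale code."""
    r = x0 - cconv(A, x1) - cconv(B, u1)
    return 0.5 * r @ r + lam * np.abs(u1).sum() + 0.5 * mu * x1 @ x1

def coefficient_step(x0, x1, u1, A, B, lam, mu, n_iter=20):
    """Proximal-gradient updates of x1 and u1 with the filters held fixed."""
    n = len(x0)
    HA, HB = np.fft.fft(A, n), np.fft.fft(B, n)
    # Step size from an upper bound on the Lipschitz constant of the smooth part.
    step = 1.0 / (np.max(np.abs(HA) ** 2 + np.abs(HB) ** 2) + mu)
    for _ in range(n_iter):
        r = cconv(A, x1) + cconv(B, u1) - x0
        x1_new = x1 - step * (ccorr(A, r) + mu * x1)          # smooth (scale) coding
        u1_new = soft(u1 - step * ccorr(B, r), step * lam)    # sparse (detail) coding
        x1, u1 = x1_new, u1_new
    return x1, u1

def filter_step(x0, x1, u1, A, B):
    """One gradient step on both filters with the coefficients held fixed."""
    X, U = np.fft.fft(x1), np.fft.fft(u1)
    step = 1.0 / (np.max(np.abs(X) ** 2 + np.abs(U) ** 2) + 1e-12)
    r = cconv(A, x1) + cconv(B, u1) - x0
    A = A - step * ccorr(x1, r)[: len(A)]
    B = B - step * ccorr(u1, r)[: len(B)]
    return A, B

def alternate(x0, A, B, lam=0.1, mu=0.1, n_outer=10):
    """Alternate coefficient estimation and filter updates; track the objective."""
    x1, u1 = np.zeros_like(x0), np.zeros_like(x0)
    history = []
    for _ in range(n_outer):
        x1, u1 = coefficient_step(x0, x1, u1, A, B, lam, mu)
        A, B = filter_step(x0, x1, u1, A, B)
        history.append(objective(x0, x1, u1, A, B, lam, mu))
    return A, B, x1, u1, history
```

With these step sizes each half-step cannot increase the objective, so `history` is non-increasing. Extending the sketch to several layers, 2-D images, or stochastic filter updates is left open here because the page does not show those details.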

Core claim

We propose a hierarchical deep generative model of piecewise smooth signals that is a recursion across scales: the low pass scale coefficients at one layer are obtained by filtering the scale coefficients at the next layer, and adding a high pass detail innovation obtained by filtering a sparse vector. This recursion describes a linear dynamic system that is a non-Gaussian Markov process across scales and is closely related to multilayer-convolutional sparse coding (ML-CSC) generative model for deep networks, except that our model allows for deeper architectures, and combines sparse and non-sparse signal representations. We propose an alternating minimization algorithm for learning the filters in this hierarchical model given observations at layer zero, e.g., natural images.

What carries the argument

The scale recursion that obtains each layer's low-pass coefficients by filtering the next layer's coefficients and adding a filtered sparse high-pass innovation.
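A minimal sketch of this recursion, following the form x_{ℓ-1} = A_ℓ * x_ℓ + B_ℓ * u_ℓ + ε_ℓ given in the model-description excerpt in the reference graph. The 1-D signals, "same"-size convolution without downsampling, the specific filter taps, and the function name `synthesize` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def synthesize(x_L, details, A, B, noise_std=0.0, rng=None):
    """Run x_{l-1} = A_l * x_l + B_l * u_l + eps_l from l = L down to l = 1.

    details[l-1], A[l-1], B[l-1] hold u_l, A_l, B_l for l = 1..L.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x_L, dtype=float)
    # Walk from the deepest layer (l = L) back to the observation (l = 0).
    for A_l, B_l, u_l in zip(reversed(A), reversed(B), reversed(details)):
        x = np.convolve(x, A_l, mode="same") + np.convolve(u_l, B_l, mode="same")
        if noise_std > 0:
            x = x + noise_std * rng.standard_normal(x.shape)
    return x

# Hypothetical setup: three layers with filters tied across layers.
rng = np.random.default_rng(0)
L, n = 3, 64
A = [np.array([0.25, 0.5, 0.25])] * L             # assumed low-pass taps
B = [np.array([1.0, -1.0])] * L                   # assumed high-pass taps
x_L = rng.normal(0.0, 1.0, n)                     # Gaussian scale signal at layer L
details = [rng.laplace(0.0, 0.1, n) for _ in range(L)]  # Laplace detail signals
x0 = synthesize(x_L, details, A, B, noise_std=0.01, rng=rng)
```

Drawing the details from a Laplace distribution mirrors the stated prior; a practical image model would use 2-D filters and possibly several filters per layer.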

If this is right

The model supports deeper architectures than standard ML-CSC formulations.
The alternating minimization alternates between a coefficient-estimation step, which performs sparse (detail) and smooth (scale) coding, and a filter-update step.
Unfolding the coefficient-estimation step produces a deep neural network.
Coefficients extracted by the model serve as features for classification on MNIST.
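One way to see why unfolding a sparse-coding step yields a network: the proximal operator of the ℓ1 norm (soft-thresholding) is exactly a difference of two ReLUs, so each iteration is a linear map followed by a ReLU-style nonlinearity. The sketch below shows that identity and a generic unfolded ISTA for a dense dictionary `D`. It is a standard textbook construction used here as an assumption-laden illustration, not the paper's hierarchical update.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def soft_threshold(z, lam):
    """Proximal operator of lam * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def soft_threshold_via_relu(z, lam):
    """Same operator written as a difference of two shifted ReLUs."""
    return relu(z - lam) - relu(-z - lam)

def unfolded_ista(y, D, lam, n_layers):
    """n_layers ISTA iterations for min_u 0.5*||y - D u||^2 + lam*||u||_1.

    Each loop pass plays the role of one network layer:
    an affine map of the previous code followed by the ReLU-based shrinkage.
    """
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / (largest singular value)^2
    u = np.zeros(D.shape[1])
    for _ in range(n_layers):
        u = soft_threshold_via_relu(u + step * D.T @ (y - D @ u), step * lam)
    return u
```

In a learned (unfolded) network the matrices and thresholds of each layer would become trainable parameters; here they are fixed by `D` and `lam`.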

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The recursion supplies an explicit link between classical wavelet filter banks and the layered structure of convolutional networks.
If the statistical match holds, the same recursion could be used to initialize or regularize networks trained on other piecewise-smooth data such as audio or medical images.
The separation of sparse detail and non-sparse scale representations suggests a natural way to add sparsity constraints only to selected layers of an existing network.

Load-bearing premise

The recursion is assumed to faithfully reproduce the empirically observed scale and detail coefficient statistics of natural images in the wavelet domain.

What would settle it

Generate signals from the fitted model and compare the empirical distributions of their wavelet scale and detail coefficients against those of real natural images; a mismatch would falsify the claimed generative connection.
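A minimal sketch of how such a comparison might start, assuming a one-level orthonormal Haar transform on 1-D signals and excess kurtosis as the summary statistic. Both choices are illustrative assumptions; the paper does not specify which wavelet or which statistic should be compared.

```python
import numpy as np

def haar_level(x):
    """One level of the orthonormal Haar transform of an even-length 1-D signal."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    scale = (even + odd) / np.sqrt(2.0)    # low-pass (scale) coefficients
    detail = (even - odd) / np.sqrt(2.0)   # high-pass (detail) coefficients
    return scale, detail

def excess_kurtosis(c):
    """Sample excess kurtosis: about 0 for Gaussian data, about 3 for Laplace data."""
    c = np.asarray(c, dtype=float)
    c = c - c.mean()
    return np.mean(c ** 4) / np.mean(c ** 2) ** 2 - 3.0

def compare_detail_statistics(model_signals, real_signals):
    """Return the excess kurtosis of pooled Haar detail coefficients for each set."""
    model_d = np.concatenate([haar_level(s)[1] for s in model_signals])
    real_d = np.concatenate([haar_level(s)[1] for s in real_signals])
    return excess_kurtosis(model_d), excess_kurtosis(real_d)
```

A large gap between the two returned values would point to a mismatch in tail behavior; a fuller test would compare whole histograms across several scales and orientations.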

Original abstract

Filter banks are a popular tool for the analysis of piecewise smooth signals such as natural images. Motivated by the empirically observed properties of scale and detail coefficients of images in the wavelet domain, we propose a hierarchical deep generative model of piecewise smooth signals that is a recursion across scales: the low pass scale coefficients at one layer are obtained by filtering the scale coefficients at the next layer, and adding a high pass detail innovation obtained by filtering a sparse vector. This recursion describes a linear dynamic system that is a non-Gaussian Markov process across scales and is closely related to multilayer-convolutional sparse coding (ML-CSC) generative model for deep networks, except that our model allows for deeper architectures, and combines sparse and non-sparse signal representations. We propose an alternating minimization algorithm for learning the filters in this hierarchical model given observations at layer zero, e.g., natural images. The algorithm alternates between a coefficient-estimation step and a filter update step. The coefficient update step performs sparse (detail) and smooth (scale) coding and, when unfolded, leads to a deep neural network. We use MNIST to demonstrate the representation capabilities of the model, and its derived features (coefficients) for classification.
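For reference, the coding problem implied by the abstract can be written out. The following is a sketch derived under assumptions, not a transcription of the paper's equations: it takes the recursion and priors from the model-description excerpt in the reference graph (Gaussian noise with variance σ_ℓ², a Laplace prior on u_ℓ, a Gaussian prior on x_ℓ), treats λ_ℓ as the Laplace scale parameter, and assumes independent entries. Up to an additive constant, the negative log-posterior given x_0 is then:

```latex
\min_{\{x_\ell,\, u_\ell\}_{\ell=1}^{L}} \;
\sum_{\ell=1}^{L} \left[
  \frac{1}{2\sigma_\ell^{2}}
  \left\| x_{\ell-1} - A_\ell * x_\ell - B_\ell * u_\ell \right\|_2^{2}
  + \frac{1}{\lambda_\ell} \left\| u_\ell \right\|_1
  + \frac{1}{2\sigma_{x_\ell}^{2}} \left\| x_\ell \right\|_2^{2}
\right]
```

The ℓ1 term is what makes the detail coding sparse, and the quadratic penalty on x_ℓ is what makes the scale coding a smooth (ridge-type) problem. The paper's own objective may be weighted or parameterized differently.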

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper gives a recursive generative model extending ML-CSC with mixed sparse/non-sparse representations across scales and an unfolding algorithm, but tests only on MNIST.

The main point is a hierarchical model that recurses across scales: low-pass coefficients at one layer come from filtering the next layer plus a high-pass detail innovation from a sparse vector. This extends ML-CSC to deeper stacks and mixed representations, with an alternating-minimization procedure whose coefficient step unfolds into a network. The motivation comes from wavelet coefficient statistics on piecewise smooth signals like images.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes a hierarchical deep generative model for piecewise smooth signals, structured as a recursion across scales in which low-pass scale coefficients at one layer are formed by filtering the scale coefficients from the next layer and adding a high-pass detail innovation obtained by filtering a sparse vector. The model is presented as a non-Gaussian Markov process related to but extending multilayer convolutional sparse coding (ML-CSC) by permitting deeper architectures and mixing sparse and non-sparse representations. An alternating-minimization procedure is derived for learning the filters from observations at layer zero; the coefficient-update step unfolds into a deep network. MNIST experiments are used to illustrate representation capabilities and the utility of the learned coefficients for classification.

Significance. If the recursion and learning procedure are correctly derived and stable, the work supplies a principled generative link between classical wavelet/filter-bank analysis and modern deep convolutional networks, with the unfolding argument providing an explicit construction of the network from the model. The explicit allowance for deeper recursion than ML-CSC and the combination of sparse detail with smooth scale coefficients are concrete technical contributions. The MNIST results supply initial evidence that the derived features are useful for downstream tasks, though the paper does not claim quantitative superiority over existing methods.

minor comments (3)

[Abstract / model motivation] The abstract states that the recursion is 'motivated by the empirically observed properties' of wavelet coefficients but supplies no quantitative comparison or citation to the specific statistics being reproduced; a short paragraph or reference in the model section would clarify the precise empirical regularities being targeted.
[Experiments] The MNIST experiments demonstrate representation and classification utility, yet the model is motivated by natural-image wavelet statistics; a brief discussion of why MNIST suffices for a proof-of-concept (or an additional small experiment on a natural-image patch dataset) would strengthen the experimental section.
[Learning algorithm] The alternating-minimization algorithm is described at a high level; adding pseudocode or explicit update equations for the filter-update step (including any regularization or normalization) would improve reproducibility.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary and recommendation of minor revision. No major comments were provided in the report, so we have no specific points to address or revise.

Circularity Check

0 steps flagged

No significant circularity; model is a proposal with independent learning procedure.

The paper proposes a recursive generative model motivated by (but not derived from) empirically observed wavelet coefficient statistics, together with an alternating-minimization algorithm whose unfolded form yields a network. No step claims a first-principles derivation that reduces by construction to fitted inputs, self-citations, or renamed known results. The recursion is an explicit modeling assumption whose validity is separate from the learning procedure; MNIST results demonstrate representation utility without circular claims of prediction or uniqueness. This is the common case of a self-contained constructive proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities beyond the high-level recursion itself.

pith-pipeline@v0.9.0 · 5740 in / 1061 out tokens · 31172 ms · 2026-05-24T17:30:50.111940+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: the low pass scale coefficients at one layer are obtained by filtering the scale coefficients at the next layer, and adding a high pass detail innovation obtained by filtering a sparse vector

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Paper passage: tied filters A_ℓ=A_{ℓ+1} and B_ℓ=B_{ℓ+1} across layers ... resemble wavelet analysis

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 3 internal anchors

[1]

INTRODUCTION. With the advent of neural networks and current state-of-the-art performance on many machine learning applications [1], deep learning has become an ubiquitous framework with which to address problems in a wide range of domains. In particular, convolutional neural networks (CNNs) have been very successful for image classification [2], as they a...
[2]

MODEL DESCRIPTION. Given a scale signal $x_L$ and detail signals $u = [u_1, \dots, u_L]$, we propose the following recursive generative model

$$x_{\ell-1} = A_\ell * x_\ell + B_\ell * u_\ell + \varepsilon_\ell, \quad \ell \in \{1, \dots, L\}, \tag{1}$$

and assume the following latent prior distributions:

$$\varepsilon_\ell \sim \mathcal{N}(0, \sigma_\ell^2), \quad \forall \ell \in \{1, \dots, L\}, \tag{2a}$$
$$u_\ell \sim \mathrm{Laplace}(0, \lambda_\ell), \tag{2b}$$
$$x_\ell \sim \mathcal{N}(0, \sigma_{x_\ell}^2). \tag{2c}$$

Here, $\ell$ indicates layer index, where $\ell = 0$ refers to the input signal, and $\ell > 0$ refers to a deeper encoding...
[3]

HIERARCHICAL CSC. We can synthesize images from the scale representation $x_L$ and detail signals across layers $u \triangleq [u_1, \dots, u_L]$ using Eq. (1). However, the analysis step requires solving the inverse problem to find appropriate encodings for an image $x_0$ across layers, i.e., $x \triangleq [x_1, \dots, x_L]$ and $u$. We refer to such problem as hierarchical convolutional sparse co...
[4]

(5) for every layer $\ell \in \{1, \dots, L\}$. Here, $\hat{x}_0 = x_0$ is given as input image, and subsequent estimates $\hat{x}_\ell$ are obtained after solving Eq. (4). Relationship with CNNs and ReLU activations: Eq. (4) incorporates two important regularizers into the model. The $\ell_1$-norm enforces sparsity on the $u_\ell$ signal, and accomplishes this result with high resemblance to standar...
[5]

CONVOLUTIONAL DICTIONARY LEARNING. The negative of the log-posterior that results from the generative model presented by Eqs. (1) and (2) is non-convex (bilinear) on both filters $A_\ell$, $B_\ell$ and variables $x_\ell$, $u_\ell$, across layers. However, it is natural to propose an alternating optimization scheme that solves the problem on specific variables while fixing the re...
[6]

EXPERIMENTAL RESULTS. To illustrate our results, we trained a set of hierarchical models with $L = 3$ on MNIST database, comprising 60,000 training and 10,000 test grayscale digit images of 28 × 28 pixels. The training step uses Algorithm 1 to solve the inverse problem, and then updates the filters with stochastic gradient descent following Section 4. ...
[7]

CONCLUSION. We proposed a generative convolutional model to analyze signals based on smooth representation (scale), and sparse contributions (detail). This model used a recursive procedure where the scale signals were further decomposed into subsequent scale and detail components, providing higher order representations. Such decomposition used a hi...
[8]

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436, 2015
[9]

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105
[10]

Jonathan Frankle and Michael Carbin, “The lottery ticket hypothesis: Finding sparse, trainable neural networks,” arXiv preprint arXiv:1803.03635, 2018
[11]

Emma Strubell, Ananya Ganesh, and Andrew McCallum, “Energy and policy considerations for deep learning in NLP,” arXiv preprint arXiv:1906.02243, 2019
[12]

Stéphane Mallat, “A theory for multiresolution signal decomposition: The wavelet representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, pp. 674–693, 1989
[13]

Stéphane Mallat, A wavelet tour of signal processing, Elsevier, 1999
[14]

Joan Bruna and Stéphane Mallat, “Invariant scattering convolution networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, pp. 1872–1886, 2013
[15]

Hilton Bristow, Anders Eriksson, and Simon Lucey, “Fast convolutional sparse coding,” in Proc. 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 391–398
[16]

Cristina Garcia-Cardona and Brendt Wohlberg, “Convolutional dictionary learning: A comparative review and new algorithms,” IEEE Transactions on Computational Imaging, vol. 4, no. 3, pp. 366–381, Sept. 2018. There are errors in Equations (18) and (19) in the published version of the paper. These have been corrected in the most recent arXiv version
[17]

Jeremias Sulam, Vardan Papyan, Yaniv Romano, and Michael Elad, “Multilayer convolutional sparse modeling: Pursuit and dictionary learning,” Signal Processing, IEEE Transactions on, vol. 66, no. 15, pp. 4090–4104, 2018
[18]

Emmanuel J. Candès, Justin K. Romberg, and Terence Tao, “Stable signal recovery from incomplete and inaccurate measurements,” Communications on Pure and Applied Mathematics, vol. 59, pp. 1–15, 2006
[19]

Alekh Agarwal, Animashree Anandkumar, and Praneeth Netrapalli, “Exact recovery of sparsely used overcomplete dictionaries,” stat, vol. 1050, pp. 8–39, 2013
[20]

M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, “Deconvolutional networks,” in Proc. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), June 2010, pp. 2528–2535
[21]

Vardan Papyan, Jeremias Sulam, and Michael Elad, “Working locally thinking globally: Theoretical guarantees for convolutional sparse coding,” IEEE Transactions on Signal Processing, vol. 65, no. 21, pp. 5687–5701, 2017
[22]

Balas Kausik Natarajan, “Sparse approximate solutions to linear systems,” SIAM Journal on Computing, vol. 24, no. 2, pp. 227–234, 1995
[23]

Claus Dethlefsen, “Markov random field extensions using state space models,” in Markov random field extensions using state space models. Oxford University Press, 2003, pp. 493–501
[24]

V Papyan, Y Romano, and M Elad, “Convolutional neural networks analyzed via convolutional sparse coding,” Journal of Machine Learning Research, vol. 18, pp. 1–52, 2017
[25]

Demba Ba, “Deeply-sparse signal representations (DS2P),” 2018
[26]

Amir Beck and Marc Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009
[27]

Bahareh Tolooshams, Sourav Dey, and Demba Ba, “Scalable convolutional dictionary learning with constrained recurrent sparse auto-encoders,” in Proc. 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), Sept. 2018, pp. 1–6
