Deep Unfolding: Model-Based Inspiration of Novel Deep Architectures
Model-based methods and deep neural networks have both been tremendously successful paradigms in machine learning. In model-based methods, problem domain knowledge can be built into the constraints of the model, typically at the expense of difficulties during inference. In contrast, deterministic deep neural networks are constructed in such a way that inference is straightforward, but their architectures are generic and it is unclear how to incorporate knowledge. This work aims to obtain the advantages of both approaches. To do so, we start with a model-based approach and an associated inference algorithm, and unfold the inference iterations as layers in a deep network. Rather than optimizing the original model, we untie the model parameters across layers, in order to create a more powerful network. The resulting architecture can be trained discriminatively to perform accurate inference within a fixed network size. We show how this framework allows us to interpret conventional networks as mean-field inference in Markov random fields, and to obtain new architectures by instead using belief propagation as the inference algorithm. We then show its application to a non-negative matrix factorization model that incorporates the problem-domain knowledge that sound sources are additive. Deep unfolding of this model yields a new kind of non-negative deep neural network that can be trained using a multiplicative backpropagation-style update algorithm. We present speech enhancement experiments showing that our approach is competitive with conventional neural networks despite using far fewer parameters.
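To make the unfolding idea in the abstract concrete, here is a minimal sketch, not the paper's implementation. It assumes a plain NMF model with a KL-divergence objective and the standard multiplicative update for the activations, with no sparsity term and no discriminative training. The names `unfolded_nmf`, `kl_divergence`, and `layer_dicts` are hypothetical, chosen only for illustration. Each entry of `layer_dicts` plays the role of one layer's dictionary: passing the same matrix for every layer corresponds to the original (tied) iterative inference, while passing separate copies gives the untied parameters that training could then adjust per layer.

```python
import numpy as np


def unfolded_nmf(x, layer_dicts, eps=1e-12):
    """Run one KL-NMF multiplicative update per layer.

    x           : non-negative input vector, shape (F,)
    layer_dicts : list of non-negative dictionaries, each of shape (F, K);
                  one entry per unfolded layer (assumed layout, for illustration)
    Returns the non-negative activation vector h, shape (K,).
    """
    h = np.ones(layer_dicts[0].shape[1])  # non-negative initialization
    for W in layer_dicts:
        ratio = x / (W @ h + eps)  # elementwise ratio of target to reconstruction
        # Multiplicative update: h stays non-negative because every factor is.
        h = h * (W.T @ ratio) / (W.sum(axis=0) + eps)
    return h


def kl_divergence(x, x_hat, eps=1e-12):
    """Generalized KL divergence between non-negative vectors."""
    return float(np.sum(x * np.log((x + eps) / (x_hat + eps)) - x + x_hat))


# Toy data: x is an exact non-negative combination of the dictionary columns.
rng = np.random.default_rng(0)
W = rng.random((6, 3)) + 0.1
x = W @ np.array([1.0, 0.5, 2.0])

# Tied parameters: the same W in every layer (ordinary iterative inference).
shallow = unfolded_nmf(x, [W] * 1)
deep = unfolded_nmf(x, [W] * 20)

# Untied parameters: each layer owns a copy that training could change
# independently (training itself is not shown in this sketch).
untied_dicts = [W.copy() for _ in range(20)]
untied = unfolded_nmf(x, untied_dicts)
```

With tied dictionaries, adding layers is the same as running more inference iterations, so the reconstruction divergence does not increase with depth. Before any training, the untied network produces the same output as the tied one; the two only diverge once the per-layer copies are updated separately.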
Forward citations
Cited by 5 Pith papers
- Mechanistic Interpretability with Sparse Autoencoder Neural Operators
  SAE-NOs extend sparse autoencoders to function spaces via Fourier neural operators with concept and domain sparsity, learning localized patterns more efficiently and generalizing across discretizations on vision data.
- CPCANet: Deep Unfolding Common Principal Component Analysis for Domain Generalization
  CPCANet deep-unfolds Common PCA to learn domain-invariant subspaces, achieving state-of-the-art zero-shot domain generalization on standard benchmarks.
- HyperLISTA-ABT: An Ultra-light Unfolded Network for Accurate Multi-component Differential Tomographic SAR Inversion
  HyperLISTA-ABT is an ultra-light unfolded network with analytically determined weights and adaptive blockwise thresholding for accurate multi-component differential TomoSAR inversion.
- Computationally Efficient Sparse Signal Recovery via Linear Sketching and Deep Unfolding
  DU-PSISTA combines linear sketching with periodic ISTA and deep unfolding to achieve linear convergence to a neighborhood of the true sparse signal at lower computational cost when the period and sketch size are chose...
- Deep Learning for CSI Feedback Based on Superimposed Coding
  A multi-task neural network recovers superimposed downlink CSI and uplink data sequences in FDD massive MIMO, improving CSI estimation over standalone SC while maintaining similar UL-US detection across varying SNR and PPC.