Towards Interpretable Deep Extreme Multi-label Learning

Bowen Kuo; I-Ling Cheng; Pei-Ju Lee; Wenjui Mao; Yihuang Kang

arxiv: 1907.01723 · v1 · pith:H4UGL5TJnew · submitted 2019-07-03 · stat.ML · cs.LG · stat.AP

Yihuang Kang, I-Ling Cheng, Wenjui Mao, Bowen Kuo, Pei-Ju Lee

Pith reviewed 2026-05-25 10:25 UTC · model grok-4.3

classification stat.ML · cs.LG · stat.AP

keywords extreme multi-label learning · interpretable machine learning · deep autoencoders · label hierarchies · multi-label classification · non-negative representations · image tagging

The pith

A two-step XML method pairs a deep non-negative autoencoder with downstream classifiers to produce both accurate many-label predictions and explicit label hierarchies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a two-step process for extreme multi-label learning in which a deep non-negative autoencoder first compresses the label space into interpretable structures. These structures are then passed to standard multi-label classifiers. The resulting model is shown to manage data sets that contain thousands of labels while also surfacing hierarchies and dependencies among those labels. A reader who accepts the claim would conclude that black-box concerns in XML can be reduced without sacrificing the ability to handle very large output spaces, at least for tasks such as image tagging.

Core claim

The authors claim that feeding the output of a deep non-negative autoencoder into conventional multi-label classifiers yields both competitive accuracy on many-label problems and human-readable label hierarchies and dependencies that explain how the model recognizes the presence of multiple objects in an image.

What carries the argument

The deep non-negative autoencoder, which learns non-negative latent representations that expose label hierarchies and dependencies for use by the downstream classifier.
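The label-compression idea can be illustrated with a much simpler stand-in. The sketch below is not the authors' deep autoencoder: it is a single-layer non-negative matrix factorization (NMF), which the paper describes its autoencoder as generalizing, fitted with the standard multiplicative updates for squared error. The toy label matrix, the ingredient names, and the choice of two latent units are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(y, k, iters=300, seed=0, eps=1e-9):
    """Factor y (n x L, non-negative) as w @ h with w (n x k) and h (k x L).

    Both factors start positive and are only ever rescaled by non-negative
    ratios, so they stay non-negative throughout.
    """
    rng = random.Random(seed)
    n, n_labels = len(y), len(y[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(n_labels)] for _ in range(k)]
    for _ in range(iters):
        # h <- h * (w^T y) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, y), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_labels)]
             for i in range(k)]
        # w <- w * (y h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(y, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def squared_error(y, w, h):
    """Sum of squared differences between y and its reconstruction w @ h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(y, approx) for a, b in zip(ra, rb))


# Hypothetical toy data: rows are recipes, columns are ingredient labels.
labels = ["flour", "butter", "sugar", "chicken", "garlic", "onion"]
y = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
]

w0, h0 = nmf(y, k=2, iters=0)   # random non-negative starting point
w, h = nmf(y, k=2, iters=300)   # fitted factors
initial_error = squared_error(y, w0, h0)
final_error = squared_error(y, w, h)
```

In this stand-in, each row of `h` plays the role of one compressed "conceptual label set", and each row of `w` gives the non-negative codes that a downstream classifier would be trained to predict from the input features.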

If this is right

The two-step pipeline scales to data sets containing many thousands of labels.
The learned hierarchies make the model's label decisions traceable to explicit dependencies.
Interpretability extends to image data where the model must decide which of many objects are present.
The same autoencoder step can be paired with different downstream multi-label classifiers without retraining the representation layer.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same non-negative representation might be reused across multiple downstream tasks that share the same label vocabulary.
If the hierarchies prove stable, they could serve as a form of weak supervision for new data sets that lack full annotations.
The approach suggests a route to auditing XML models for systematic biases in how certain label combinations are recognized.

Load-bearing premise

The non-negative autoencoder will produce label hierarchies and dependencies that remain faithful to the original data and genuinely aid human interpretation of the final classifier.

What would settle it

An experiment in which the hierarchies extracted by the autoencoder are shown to contradict known label co-occurrence statistics in the training data or to provide no measurable gain in human ability to predict the model's decisions on held-out images.
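The co-occurrence half of such an experiment could be scored along the following lines. This is a minimal sketch under stated assumptions: the metric (mean within-group minus mean between-group Jaccard co-occurrence), the function names, and the toy data are hypothetical and do not come from the paper.

```python
from itertools import combinations


def jaccard(y, a, b):
    """Jaccard co-occurrence of label columns a and b in a binary matrix y."""
    both = sum(1 for row in y if row[a] and row[b])
    either = sum(1 for row in y if row[a] or row[b])
    return both / either if either else 0.0


def _mean(values):
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def alignment_gap(y, groups):
    """Mean within-group minus mean between-group Jaccard co-occurrence.

    A clearly positive gap means labels grouped together also tend to
    co-occur in the data; a gap near or below zero means the grouping
    contradicts the co-occurrence statistics.
    """
    group_of = {label: g for g, members in enumerate(groups) for label in members}
    within, between = [], []
    for a, b in combinations(sorted(group_of), 2):
        score = jaccard(y, a, b)
        (within if group_of[a] == group_of[b] else between).append(score)
    return _mean(within) - _mean(between)


# Hypothetical toy data: four samples, four labels.
y = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
faithful_gap = alignment_gap(y, [{0, 1}, {2, 3}])    # matches co-occurrence
unfaithful_gap = alignment_gap(y, [{0, 2}, {1, 3}])  # contradicts it
```

A score like this only addresses agreement with the training data; the human-usefulness half of the test would still need a user study.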

Original abstract

Many machine learning algorithms, such as deep neural networks, have long been criticized as "black boxes": models that cannot explain how they arrive at a decision without further interpretive effort. This problem has raised concerns about trust, safety, nondiscrimination, and other ethical issues in model applications. In this paper, we discuss machine learning interpretability in a real-world application, eXtreme Multi-label Learning (XML), which involves learning models from annotated data with many pre-defined labels. We propose a two-step XML approach that combines a deep non-negative autoencoder with other multi-label classifiers to tackle different data applications with a large number of labels. Our experimental results show that the proposed approach is able to cope with many-label problems and to provide interpretable label hierarchies and dependencies that help us understand how the model recognizes the existence of objects in an image.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes a two-step approach for extreme multi-label learning (XML) that integrates a deep non-negative autoencoder to extract label hierarchies and dependencies, which are then combined with standard multi-label classifiers. It claims this handles large label spaces (e.g., image object recognition) while providing interpretability into model decisions, supported by asserted experimental results.

Significance. If the hierarchies prove faithful to data and useful for interpretation, the work could advance trustworthy ML in high-cardinality label settings. However, the manuscript supplies no mechanism details, faithfulness metrics, or evaluation, so significance cannot be assessed from the given text.

major comments (1)

[Abstract] The central claim that the non-negative autoencoder step yields interpretable label hierarchies and dependencies rests on unshown experimental results. No hierarchy extraction procedure, quantitative faithfulness metric (e.g., co-occurrence alignment or taxonomy match), or human-subject usefulness evaluation is described, leaving the interpretability assertion unsupported.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and the opportunity to clarify our work. We address the single major comment below, providing references to the manuscript's existing content while acknowledging areas where additional support can be added.

Referee: [Abstract] The central claim that the non-negative autoencoder step yields interpretable label hierarchies and dependencies rests on unshown experimental results. No hierarchy extraction procedure, quantitative faithfulness metric (e.g., co-occurrence alignment or taxonomy match), or human-subject usefulness evaluation is described, leaving the interpretability assertion unsupported.

Authors: The manuscript describes the hierarchy extraction procedure in Section 3: the deep non-negative autoencoder is trained with a non-negativity constraint on the decoder weights, allowing the learned weight matrix to directly encode label dependencies and hierarchical structure (see the reconstruction objective and the interpretation paragraph following Equation (4)). Section 4 then presents experimental support via both improved multi-label classification metrics on large-scale datasets and qualitative visualizations of the extracted hierarchies (e.g., parent-child label groupings on the Delicious and EUR-Lex benchmarks). We agree, however, that no quantitative faithfulness metrics (such as co-occurrence alignment scores or taxonomy matching) or human-subject studies are reported; these would strengthen the interpretability claims and will be added in revision.

revision: partial
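To show what "the learned weight matrix directly encodes label dependencies" could mean in practice, here is a minimal sketch of reading label groups off a non-negative decoder weight matrix. The weights, label names, and the relative-threshold rule are assumptions made for illustration; the simulated rebuttal does not specify the extraction rule, and this is not presented as the authors' procedure.

```python
def label_groups(decoder, labels, rel_threshold=0.5):
    """Return, for each latent unit, the labels it loads on most strongly.

    decoder is a list of rows, one row per latent unit and one column per
    label. A label joins a unit's group when its weight is at least
    rel_threshold times the largest weight in that row.
    """
    groups = []
    for row in decoder:
        if any(weight < 0 for weight in row):
            raise ValueError("decoder weights must be non-negative")
        peak = max(row)
        members = [
            name
            for name, weight in zip(labels, row)
            if peak > 0 and weight >= rel_threshold * peak
        ]
        groups.append(members)
    return groups


# Hypothetical decoder weights for two latent units over five labels.
labels = ["flour", "butter", "sugar", "chicken", "garlic"]
decoder = [
    [0.9, 0.8, 0.6, 0.0, 0.1],
    [0.0, 0.2, 0.0, 1.0, 0.7],
]
groups = label_groups(decoder, labels)
```

Because every weight is non-negative, a unit can only add labels to a reconstruction and never cancel another unit's contribution, which is what makes a part-based reading of each row plausible. In a deeper model, applying the same reading layer by layer would give groups of groups, which is one way a hierarchy could be displayed.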

Circularity Check

0 steps flagged

Empirical method proposal with no derivation chain or self-referential reductions

The paper describes a two-step empirical approach combining a deep non-negative autoencoder with multi-label classifiers for extreme multi-label learning, asserting that it yields interpretable label hierarchies based on experimental results. No equations, parameter-fitting procedures, uniqueness theorems, or derivation steps are presented in the abstract or context that would allow any claim to reduce to its own inputs by construction. No self-citations are invoked as load-bearing premises. The central claims rest on reported experiments rather than mathematical self-definition, making the work self-contained against external benchmarks with no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only; no free parameters, axioms, or invented entities are stated.

pith-pipeline@v0.9.0 · 5692 in / 1007 out tokens · 26058 ms · 2026-05-25T10:25:40.464491+00:00 · methodology

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

[1] Introduction: In recent decades, the advance of information technology and ubiquitous computing devices, have fueled the explosive growth of data, the Big Data [1], which is coined by researchers and practitioners to describe this unprecedented phenomenon. Such dramatic increase of data with multimedia contents (e.g. images, audios, videos, and texts) has a...

[2] black-box
for a given data and thus often outperform other learning algorithms in terms of accuracy of prediction when dealing with massive datasets. DNNs have been very successful in many real-world applications, such as object detection, machine translation, and image captioning [5]–[7]. However, DNNs and many other ensemble machine learning algorithms are often ...

[3] black-boxes
Background and Related Work: Machine learning algorithms have been reshaping nearly every corner of our world. From complicated flight planning to everyday grocery shopping, people rely on these algorithms to help make decisions. In recent decades, cheap computation, explosive growth of data, and evolution of deep model architectures [4] have even expanded...

[4] Interpretable Extreme Multi-label Learning: We here consider the proposed approach, a two-step interpretable extreme multi-label learning with label compression based on deep non-negative autoencoder. As discussed previously, our proposed non-negative autoencoder is a kind of generalization of the NMF and its non-negative conceptual label sets are relative...

[5] Fried Chicken
Experimental Result: To demonstrate the proposed approach, we collected recipe-ingredient text and dish image data from BBC Food Recipe website [28] (BBC). The recipes without dish images were removed, as we here are only interested in explaining images with label (ingredient) sets at different levels of abstractions. There are total 3,379 recipes with ima...

[6] Conclusion: We proposed a novel two-step extreme multi-label classification approach that applies deep non-negative autoencoder to the label compression and pseudo label generation of the multi-label learning. The experiment on real-world annotated image data shows that the approach is able to not only build multi-label classification models that cope with...

[7] M. Chen, S. Mao, and Y. Liu, “Big data: A survey,” Mob. Netw. Appl., vol. 19, no. 2, pp. 171–209, 2014

[8] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A review and new perspectives,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 8, pp. 1798–1828, 2013

[9] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, p. 436, 2015

[10] Y. Bengio, “Learning deep architectures for AI,” Found. Trends Mach. Learn., vol. 2, no. 1, pp. 1–127, 2009

[11] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014

[12] S. Jean, K. Cho, R. Memisevic, and Y. Bengio, “On using very large target vocabulary for neural machine translation,” arXiv preprint arXiv:1412.2007, 2014

[13] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A neural image caption generator,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3156–3164

[14] G. Shmueli, “To explain or to predict?,” Stat. Sci., vol. 25, no. 3, pp. 289–310, 2010

[15] F. Doshi-Velez and B. Kim, “Towards a rigorous science of interpretable machine learning,” arXiv preprint arXiv:1702.08608, 2017

[16] M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should i trust you?: Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 2016, pp. 1135–1144

[17] Y. Prabhu and M. Varma, “Fastxml: A fast, accurate and stable tree-classifier for extreme multi-label learning,” in Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 263–272

[18] W. Zhang, J. Yan, X. Wang, and H. Zha, “Deep Extreme Multi-label Learning,” in Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018, pp. 100–107

[19] K. Bhatia, H. Jain, P. Kar, M. Varma, and P. Jain, “Sparse local embeddings for extreme multi-label classification,” in Advances in Neural Information Processing Systems, 2015, pp. 730–738

[20] J. Liu, W.-C. Chang, Y. Wu, and Y. Yang, “Deep learning for extreme multi-label text classification,” in Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017, pp. 115–124

[21] A. Hannun et al., “Deep speech: Scaling up end-to-end speech recognition,” arXiv preprint arXiv:1412.5567, 2014

[22] D. Gunning, “Explainable artificial intelligence (XAI),” Def. Adv. Res. Proj. Agency DARPA Nd Web, 2017

[23] B. Goodman and S. Flaxman, “European Union regulations on algorithmic decision-making and a ‘right to explanation,’” Jun. 2016

[24] H. Jain, Y. Prabhu, and M. Varma, “Extreme multi-label loss functions for recommendation, tagging, ranking & other missing label applications,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 935–944

[25] M. S. Sorower, “A literature survey on algorithms for multi-label learning,” Or. State Univ. Corvallis, vol. 18, 2010

[26] R. Agrawal, A. Gupta, Y. Prabhu, and M. Varma, “Multi-label learning with millions of labels: Recommending advertiser bid phrases for web pages,” in Proceedings of the 22nd international conference on World Wide Web, 2013, pp. 13–24

[27] Z. Ahmadi and S. Kramer, “Online Multi-Label Classification: A Label Compression Method,” arXiv preprint arXiv:1804.01491, 2018

[28] F. Tai and H.-T. Lin, “Multilabel classification with principal label space transformation,” Neural Comput., vol. 24, no. 9, pp. 2508–2542, 2012

[29] C. Xu, D. Tao, and C. Xu, “Robust extreme multi-label learning,” in Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 2016, pp. 1275–1284

[30] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, Jul. 2006

[31] “Learning the parts of objects by non-negative matrix factorization Nature.” [Online]. Available: https://www.nature.com/articles/44565

[32] P. O. Hoyer, “Non-negative matrix factorization with sparseness constraints,” J. Mach. Learn. Res., vol. 5, no. Nov, pp. 1457–1469, 2004

[33] Y. Bengio and O. Delalleau, “On the expressive power of deep architectures,” in International Conference on Algorithmic Learning Theory, 2011, pp. 18–36

[34] “Recipes - BBC Food.” [Online]. Available: https://www.bbc.com/food/recipes. [Accessed: 11-Dec-2018]

[35] L. Herranz, W. Min, and S. Jiang, “Food recognition and recipe analysis: integrating visual content, context and external knowledge,” arXiv preprint arXiv:1801.07239, 2018

[36] Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, and A.-L. Barabási, “Flavor network and the principles of food pairing,” Sci. Rep., vol. 1, p. 196, 2011

[37] R Core Team, “R: A Language and Environment for Statistical Computing,” R Foundation for Statistical Computing, Vienna, Austria, 2018. [Online]. Available: http://www.r-project.org/

[38] F. Chollet and J. J. Allaire, R interface to Keras. GitHub, 2017
