Short-term Electric Load Forecasting Using TensorFlow and Deep Auto-Encoders

Xin Shi

REVIEW 1 major objections 1 minor 35 references

Reviewed by Pith at T0; open to challenge.

T0 means a machine referee read the full paper against a public rubric. The mark states how deep the mechanical check went, never who wrote it. the ladder, T0–T4 →

Challenge this review Re-run · record.json Download PDF Read on arXiv ↗

T0 review · grok-4.3

A TensorFlow-based deep auto-encoder model forecasts short-term electric loads more accurately than traditional neural networks by using multidimensional data.

2026-05-24 18:33 UTC pith:RIYNIE7K

load-bearing objection This applies standard deep auto-encoders in TensorFlow to load forecasting but shows no numbers, baselines, or details to support its performance claims. the 1 major comments →

arxiv 1907.08941 v1 pith:RIYNIE7K submitted 2019-07-21 eess.SP cs.SYeess.SY

Short-term Electric Load Forecasting Using TensorFlow and Deep Auto-Encoders

Xin Shi This is my paper

classification eess.SP cs.SYeess.SY

keywords electric load forecastingdeep auto-encoderTensorFlowshort-term forecastbig dataneural networkprediction modelmultidimensional data

verification ladder T0 review T1 audit T2 compute T3 formal T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a forecasting model for short-term electric load that incorporates historical load values, temperature, and day types using deep auto-encoder networks in TensorFlow. It addresses challenges in big data environments where traditional neural networks struggle with overfitting, slow convergence, and local optima. The approach is tested in case studies showing gains in accuracy, stability, and the ability to scale. This matters because accurate load forecasts are essential for efficient power grid management and reducing costs in energy systems.

Core claim

The paper establishes that a new distributed short-term load forecast method based on TensorFlow and Deep Auto-Encoder Networks (DAENs), which takes into account multidimensional load-related data sets including historical load value, temperature, day type, etc., overcomes the shortcomings of traditional neural network methods such as over-fitting, slow convergence and local optimum, etc., and demonstrates obvious advantages in prediction accuracy, stability, and expansibility.

What carries the argument

Deep Auto-Encoder Networks (DAENs) implemented in TensorFlow that process multidimensional inputs to produce load forecasts while avoiding common neural network pitfalls.

Load-bearing premise

The multidimensional load-related data sets are both available at sufficient quality and that the deep auto-encoder architecture inherently overcomes overfitting, slow convergence, and local-optimum issues without additional regularization or hyper-parameter tuning.

What would settle it

Running the proposed DAEN method and a traditional neural network on the same new dataset and finding that the traditional method achieves equal or higher accuracy with comparable stability.

Watch this falsifier — get emailed when new claim-graph text bears on it.

If this is right

The model can handle larger volumes of data without the performance issues seen in standard networks.
It supports distributed computing for real-time applications in power systems.
Forecasts become more reliable for planning and operation decisions in electricity markets.
Expansibility allows easy addition of new data types or features.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach might generalize to forecasting other variables like renewable energy output if similar multidimensional data is available.
Integration with existing power system software could be straightforward given the TensorFlow implementation.
Future work could test the method on datasets from different regions to confirm robustness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

This applies standard deep auto-encoders in TensorFlow to load forecasting but shows no numbers, baselines, or details to support its performance claims.

read the letter

The main thing to know is that this paper applies an established deep auto-encoder architecture via TensorFlow to short-term electric load forecasting with multidimensional inputs like historical load, temperature, and day type. It claims better accuracy, stability, and expansibility than traditional neural nets while handling big data, but the abstract supplies zero quantitative results, error metrics, dataset descriptions, or baseline definitions to back that up. The full text may contain more, yet nothing in the provided material allows verification of the central claim. What is actually new is limited to framing an existing method for this domain and sketching a distributed TensorFlow workflow with an algorithm flowchart. The paper does reasonably well at identifying practical needs in power-system forecasting under variable demand and at listing common neural-net drawbacks such as overfitting and local optima. Those points are standard but clearly stated. The soft spots are large and load-bearing. Without any shown equations, training protocol, regularization steps, or comparative tables, the assertion that the architecture inherently solves convergence and overfitting issues cannot be assessed. The weakest assumption—that the data are high-quality and the model needs no extra tuning—remains untested. This work is aimed at power engineers or operators who might want a ready TensorFlow template for applied forecasting. Readers seeking methodological advance, reproducible experiments, or falsifiable predictions will find little of value. I would not bring it to a reading group. I would not cite it. It does not deserve peer review because there is no substantive content or evidence for a referee to evaluate.

Referee Report

1 major / 1 minor

Summary. The paper proposes a short-term electric load forecasting model based on Deep Auto-Encoder Networks (DAENs) implemented via TensorFlow. It incorporates multidimensional load-related data (historical load values, temperature, day type) and claims that the approach overcomes overfitting, slow convergence, and local-optima problems of traditional neural networks. Case-study results are asserted to demonstrate clear advantages in prediction accuracy, stability, and expansibility relative to conventional neural-network baselines.

Significance. If the empirical superiority claims were substantiated with rigorous, reproducible comparisons, the work could offer a practical distributed forecasting framework suitable for big-data power-system applications, potentially improving operational planning and grid stability.

major comments (1)

[Abstract] Abstract: the central claim that 'case study results show that the proposed method has obvious advantages in prediction accuracy, stability, and expansibility' is presented without any quantitative metrics (MAPE, RMSE, etc.), error bars, baseline definitions, dataset descriptions, or statistical tests. This absence leaves the primary contribution unsupported.

minor comments (1)

[Abstract] The abstract refers to a 'new distributed short-term load forecast method' and an 'algorithm flowchart' but provides no description of the distribution mechanism, flowchart, or implementation details.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive comment on the abstract. We address it point by point below.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that 'case study results show that the proposed method has obvious advantages in prediction accuracy, stability, and expansibility' is presented without any quantitative metrics (MAPE, RMSE, etc.), error bars, baseline definitions, dataset descriptions, or statistical tests. This absence leaves the primary contribution unsupported.

Authors: We agree that the abstract as written does not include quantitative support for the stated advantages. The body of the manuscript contains the case-study results with MAPE, RMSE, and baseline comparisons, but these are not summarized numerically in the abstract. In the revised version we will expand the abstract to report the key quantitative metrics (MAPE and RMSE values for the proposed DAEN method versus the traditional neural-network baselines), the dataset size and features, and a brief statement of the evaluation protocol. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The abstract and available description contain no equations, derivations, fitted parameters presented as predictions, or self-citations. The central claim is an empirical case-study comparison of prediction accuracy, stability, and expansibility against traditional neural networks, with no internal mathematical chain that reduces to its own inputs by construction. No load-bearing steps match any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies insufficient technical content to identify free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5661 in / 1053 out tokens · 20679 ms · 2026-05-24T18:33:13.878885+00:00 · methodology

0 comments

read the original abstract

This paper conducts research on the short-term electric load forecast method under the background of big data. It builds a new electric load forecast model based on Deep Auto-Encoder Networks (DAENs), which takes into account multidimensional load-related data sets including historical load value, temperature, day type, etc. A new distributed short-term load forecast method based on TensorFlow and DAENs is therefore proposed, with an algorithm flowchart designed. This method overcomes the shortcomings of traditional neural network methods, such as over-fitting, slow convergence and local optimum, etc. Case study results show that the proposed method has obvious advantages in prediction accuracy, stability, and expansibility compared with those based on traditional neural networks. Thus, this model can better meet the demands of short-term electric load forecasting under big data scenario.

Figures

Figures reproduced from arXiv: 1907.08941 by Xin Shi.

**Figure 1.** Figure 1: AE Network Structure. One basic AE can be viewed as a three-layer traditional neural network, including one input layer, one hidden layer and one output layer. The input layer and output layer are of the same size. Transition from the input layer to the hidden layer is called the encoding process. The transition from the hidden layer to the output layer is called the decoding process. Assume f and g respec… view at source ↗

**Figure 2.** Figure 2: Load Forecasting Model Based on DAENs. A parallel algorithm flow chart is designed, shown in [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Parallel Algorithm Flow Chart Based on DAENs and TensorFlow. [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Network Data Flow Graph. Taking GPU 1 for example, the tracing results are separately shown in [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 7.** Figure 7: Error Variation in Fine-tuning. The horizontal coordi [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗

**Figure 6.** Figure 6: Error Variation in Pre-training. The horizontal coordi [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗

**Figure 8.** Figure 8: Comparison of the Forecast Value and the Actual One. [PITH_FULL_IMAGE:figures/full_fig_p007_8.png] view at source ↗

**Figure 9.** Figure 9: We can see that the CDF of the DAENs-based forecast [PITH_FULL_IMAGE:figures/full_fig_p007_9.png] view at source ↗

discussion (0)

Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages · 3 internal anchors

[1]

Short- term electric load forecasting using echo state networks and pca decomposition,

F. M. Bianchi, E. D. Santis, A. Rizzi, and A. Sadeghian, “Short- term electric load forecasting using echo state networks and pca decomposition,” IEEE Access, vol. 3, pp. 1931–1943, 2015. 8

work page 1931
[2]

Short-term load forecast of microgrids by a new bilevel prediction strategy,

N. Amjady, F. Keynia, and H. Zareipour, “Short-term load forecast of microgrids by a new bilevel prediction strategy,” IEEE Transactions on Smart Grid , vol. 1, no. 3, pp. 286–294, Dec 2010

work page 2010
[3]

P. J. Brockwell and R. A. Davis, Time series: theory and methods. Springer Science & Business Media, 2013

work page 2013
[4]

The time series approach to short term load forecasting,

M. T. Hagan and S. M. Behr, “The time series approach to short term load forecasting,” IEEE Transactions on Power Systems , vol. 2, no. 3, pp. 785–791, 1987

work page 1987
[5]

A regression- based approach to short-term system load forecasting,

A. D. Papalexopoulos and T. C. Hesterberg, “A regression- based approach to short-term system load forecasting,” IEEE Transactions on Power Systems , vol. 5, no. 4, pp. 1535–1547, 1990

work page 1990
[6]

Introduction to grey system theory,

D. Julong, “Introduction to grey system theory,” The Journal of grey system, vol. 1, no. 1, pp. 1–24, 1989

work page 1989
[7]

Application of gray sys- tem theory in load forecasting [j],

J.-f. ZHANG, Y .-a. WU, and J.-j. WU, “Application of gray sys- tem theory in load forecasting [j],” Electric Power Automation Equipment, vol. 5, p. 005, 2004

work page 2004
[8]

Neural networks for short-term load forecasting: A review and evaluation,

H. S. Hippert, C. E. Pedreira, and R. C. Souza, “Neural networks for short-term load forecasting: A review and evaluation,” IEEE Transactions on power systems, vol. 16, no. 1, pp. 44–55, 2001

work page 2001
[9]

Comparative study of short term load forecasting using multilayer feed forward neural network with back propagation learning and radial basis functional neural network,

A. Tiwari, A. D. Dubey, and D. Patel, “Comparative study of short term load forecasting using multilayer feed forward neural network with back propagation learning and radial basis functional neural network,” SAMRIDDHI: A Journal of Physical Sciences, Engineering and Technology , vol. 7, no. 1, 2015

work page 2015
[10]

A new hybrid modiﬁed ﬁreﬂy algorithm and support vector regression model for accurate short term load forecasting,

A. Kavousi-Fard, H. Samet, and F. Marzbani, “A new hybrid modiﬁed ﬁreﬂy algorithm and support vector regression model for accurate short term load forecasting,” Expert systems with applications, vol. 41, no. 13, pp. 6047–6056, 2014

work page 2014
[11]

A strategy for short-term load forecasting by support vector regression machines,

E. Ceperic, V . Ceperic, and A. Baric, “A strategy for short-term load forecasting by support vector regression machines,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4356–4364, 2013

work page 2013
[12]

Short-term load forecasting of australian national electricity market by an ensemble model of extreme learning machine,

R. Zhang, Z. Y . Dong, Y . Xu, K. Meng, and K. P. Wong, “Short-term load forecasting of australian national electricity market by an ensemble model of extreme learning machine,” IET Generation, Transmission & Distribution, vol. 7, no. 4, pp. 391–397, 2013

work page 2013
[13]

Electricity price forecasting with extreme learning machine and bootstrapping,

X. Chen, Z. Y . Dong, K. Meng, Y . Xu, K. P. Wong, and H. Ngan, “Electricity price forecasting with extreme learning machine and bootstrapping,” IEEE Transactions on Power Systems , vol. 27, no. 4, pp. 2055–2062, 2012

work page 2055
[14]

Reducing the dimen- sionality of data with neural networks,

G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimen- sionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006

work page 2006
[15]

A fast learning algorithm for deep belief nets,

G. E. Hinton, S. Osindero, and Y .-W. Teh, “A fast learning algorithm for deep belief nets,” Neural computation , vol. 18, no. 7, pp. 1527–1554, 2006

work page 2006
[16]

Contractive auto-encoders: Explicit invariance during feature extraction,

S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y . Bengio, “Contractive auto-encoders: Explicit invariance during feature extraction,” in Proceedings of the 28th international conference on machine learning (ICML-11) , 2011, pp. 833–840

work page 2011
[17]

Stacked convolutional auto-encoders for hierarchical feature extraction,

J. Masci, U. Meier, D. Cires ¸an, and J. Schmidhuber, “Stacked convolutional auto-encoders for hierarchical feature extraction,” in International Conference on Artiﬁcial Neural Networks . Springer, 2011, pp. 52–59

work page 2011
[18]

Latent feature representation with stacked auto-encoder for ad/mci diagnosis,

H.-I. Suk, S.-W. Lee, D. Shen, A. D. N. Initiative et al., “Latent feature representation with stacked auto-encoder for ad/mci diagnosis,” Brain Structure and Function , vol. 220, no. 2, pp. 841–859, 2015

work page 2015
[19]

Autoen- coder networks for hiv classiﬁcation,

B. L. Betechuoh, T. Marwala, and T. Tettey, “Autoen- coder networks for hiv classiﬁcation,” CURRENT SCIENCE- BANGALORE-, vol. 91, no. 11, p. 1467, 2006

work page 2006
[20]

Extracting and composing robust features with denoising au- toencoders,

P. Vincent, H. Larochelle, Y . Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising au- toencoders,” in Proceedings of the 25th international conference on Machine learning . ACM, 2008, pp. 1096–1103

work page 2008
[21]

Deep autoencoder neural networks for gene ontology annotation predictions,

D. Chicco, P. Sadowski, and P. Baldi, “Deep autoencoder neural networks for gene ontology annotation predictions,” in Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics . ACM, 2014, pp. 533–540

work page 2014
[22]

An autoencoder neural-network based low-dimensionality approach to excitation modeling for hmm-based text-to-speech,

S. Vishnubhotla, R. Fernandez, and B. Ramabhadran, “An autoencoder neural-network based low-dimensionality approach to excitation modeling for hmm-based text-to-speech,” in 2010 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2010, pp. 4614–4617

work page 2010
[23]

Autoencoder networks for wa- ter demand predictive modelling,

S. I. Msiza and T. Marwala, “Autoencoder networks for wa- ter demand predictive modelling,” in International Conference on Simulation and Modeling Methodologies, Technologies and Applications, 2016, pp. 231–238

work page 2016
[24]

Lecture 6a overview of mini-batch gradient descent,

H. Geoffrey, S. Nitish, and S. Kevin, “Lecture 6a overview of mini-batch gradient descent,” http://www. cs.toronto.edu/ ti- jmen/csc321/slides/lecture slides lec6.pdf, 2016

work page 2016
[25]

Adam: A method for stochastic opti- mization,

D. Kingma and J. Ba, “Adam: A method for stochastic opti- mization,” Computer Science, 2015

work page 2015
[26]

Lecture 6e rmsprop: Divide the gradient by a running average of its recent magnitude,

H. Geoffrey, S. Nitish, and S. Kevin, “Lecture 6e rmsprop: Divide the gradient by a running average of its recent magnitude,” http://www.cs.toronto.edu/ tij- men/csc321/slides/lecture slides lec6.pdf, 2016

work page 2016
[27]

Kl-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition,

D. Yu, K. Yao, H. Su, G. Li, and F. Seide, “Kl-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition,” in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013, pp. 7893–7897

work page 2013
[28]

Tensorﬂow: Large-scale machine learning on heterogeneous systems, 2015,

M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin et al., “Tensorﬂow: Large-scale machine learning on heterogeneous systems, 2015,” Software available from tensorﬂow. org , vol. 1, 2015

work page 2015
[29]

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

——, “Tensorﬂow: Large-scale machine learning on heteroge- neous distributed systems,” arXiv preprint arXiv:1603.04467 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[30]

Asrła real-time speech recogni- tion on portable devices,

A. S. Sharma and R. Bhalley, “Asrła real-time speech recogni- tion on portable devices,” in Advances in Computing, Commu- nication, & Automation (ICACCA)(Fall), International Confer- ence on. IEEE, 2016, pp. 1–4

work page 2016
[31]

Adversarial examples in the physical world

A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial exam- ples in the physical world,” arXiv preprint arXiv:1607.02533 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[32]

Tensorﬂow: Biologys gate- way to deep learning?

L. Rampasek and A. Goldenberg, “Tensorﬂow: Biologys gate- way to deep learning?” Cell systems, vol. 2, no. 1, pp. 12–14, 2016

work page 2016
[33]

Deep or shallow, nlp is breaking out,

G. Goth, “Deep or shallow, nlp is breaking out,” Communica- tions of the ACM , vol. 59, no. 3, pp. 13–16, 2016

work page 2016
[34]

WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

D. Hewlett, A. Lacoste, L. Jones, I. Polosukhin, A. Fandrianto, J. Han, M. Kelcey, and D. Berthelot, “Wikireading: A novel large-scale language understanding task over wikipedia,” arXiv preprint arXiv:1608.03542, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[35]

K. P. Murphy, Machine learning: a probabilistic perspective . MIT press, 2012

work page 2012

[1] [1]

Short- term electric load forecasting using echo state networks and pca decomposition,

F. M. Bianchi, E. D. Santis, A. Rizzi, and A. Sadeghian, “Short- term electric load forecasting using echo state networks and pca decomposition,” IEEE Access, vol. 3, pp. 1931–1943, 2015. 8

work page 1931

[2] [2]

Short-term load forecast of microgrids by a new bilevel prediction strategy,

N. Amjady, F. Keynia, and H. Zareipour, “Short-term load forecast of microgrids by a new bilevel prediction strategy,” IEEE Transactions on Smart Grid , vol. 1, no. 3, pp. 286–294, Dec 2010

work page 2010

[3] [3]

P. J. Brockwell and R. A. Davis, Time series: theory and methods. Springer Science & Business Media, 2013

work page 2013

[4] [4]

The time series approach to short term load forecasting,

M. T. Hagan and S. M. Behr, “The time series approach to short term load forecasting,” IEEE Transactions on Power Systems , vol. 2, no. 3, pp. 785–791, 1987

work page 1987

[5] [5]

A regression- based approach to short-term system load forecasting,

A. D. Papalexopoulos and T. C. Hesterberg, “A regression- based approach to short-term system load forecasting,” IEEE Transactions on Power Systems , vol. 5, no. 4, pp. 1535–1547, 1990

work page 1990

[6] [6]

Introduction to grey system theory,

D. Julong, “Introduction to grey system theory,” The Journal of grey system, vol. 1, no. 1, pp. 1–24, 1989

work page 1989

[7] [7]

Application of gray sys- tem theory in load forecasting [j],

J.-f. ZHANG, Y .-a. WU, and J.-j. WU, “Application of gray sys- tem theory in load forecasting [j],” Electric Power Automation Equipment, vol. 5, p. 005, 2004

work page 2004

[8] [8]

Neural networks for short-term load forecasting: A review and evaluation,

H. S. Hippert, C. E. Pedreira, and R. C. Souza, “Neural networks for short-term load forecasting: A review and evaluation,” IEEE Transactions on power systems, vol. 16, no. 1, pp. 44–55, 2001

work page 2001

[9] [9]

Comparative study of short term load forecasting using multilayer feed forward neural network with back propagation learning and radial basis functional neural network,

A. Tiwari, A. D. Dubey, and D. Patel, “Comparative study of short term load forecasting using multilayer feed forward neural network with back propagation learning and radial basis functional neural network,” SAMRIDDHI: A Journal of Physical Sciences, Engineering and Technology , vol. 7, no. 1, 2015

work page 2015

[10] [10]

A new hybrid modiﬁed ﬁreﬂy algorithm and support vector regression model for accurate short term load forecasting,

A. Kavousi-Fard, H. Samet, and F. Marzbani, “A new hybrid modiﬁed ﬁreﬂy algorithm and support vector regression model for accurate short term load forecasting,” Expert systems with applications, vol. 41, no. 13, pp. 6047–6056, 2014

work page 2014

[11] [11]

A strategy for short-term load forecasting by support vector regression machines,

E. Ceperic, V . Ceperic, and A. Baric, “A strategy for short-term load forecasting by support vector regression machines,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4356–4364, 2013

work page 2013

[12] [12]

Short-term load forecasting of australian national electricity market by an ensemble model of extreme learning machine,

R. Zhang, Z. Y . Dong, Y . Xu, K. Meng, and K. P. Wong, “Short-term load forecasting of australian national electricity market by an ensemble model of extreme learning machine,” IET Generation, Transmission & Distribution, vol. 7, no. 4, pp. 391–397, 2013

work page 2013

[13] [13]

Electricity price forecasting with extreme learning machine and bootstrapping,

X. Chen, Z. Y . Dong, K. Meng, Y . Xu, K. P. Wong, and H. Ngan, “Electricity price forecasting with extreme learning machine and bootstrapping,” IEEE Transactions on Power Systems , vol. 27, no. 4, pp. 2055–2062, 2012

work page 2055

[14] [14]

Reducing the dimen- sionality of data with neural networks,

G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimen- sionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006

work page 2006

[15] [15]

A fast learning algorithm for deep belief nets,

G. E. Hinton, S. Osindero, and Y .-W. Teh, “A fast learning algorithm for deep belief nets,” Neural computation , vol. 18, no. 7, pp. 1527–1554, 2006

work page 2006

[16] [16]

Contractive auto-encoders: Explicit invariance during feature extraction,

S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y . Bengio, “Contractive auto-encoders: Explicit invariance during feature extraction,” in Proceedings of the 28th international conference on machine learning (ICML-11) , 2011, pp. 833–840

work page 2011

[17] [17]

Stacked convolutional auto-encoders for hierarchical feature extraction,

J. Masci, U. Meier, D. Cires ¸an, and J. Schmidhuber, “Stacked convolutional auto-encoders for hierarchical feature extraction,” in International Conference on Artiﬁcial Neural Networks . Springer, 2011, pp. 52–59

work page 2011

[18] [18]

Latent feature representation with stacked auto-encoder for ad/mci diagnosis,

H.-I. Suk, S.-W. Lee, D. Shen, A. D. N. Initiative et al., “Latent feature representation with stacked auto-encoder for ad/mci diagnosis,” Brain Structure and Function , vol. 220, no. 2, pp. 841–859, 2015

work page 2015

[19] [19]

Autoen- coder networks for hiv classiﬁcation,

B. L. Betechuoh, T. Marwala, and T. Tettey, “Autoen- coder networks for hiv classiﬁcation,” CURRENT SCIENCE- BANGALORE-, vol. 91, no. 11, p. 1467, 2006

work page 2006

[20] [20]

Extracting and composing robust features with denoising au- toencoders,

P. Vincent, H. Larochelle, Y . Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising au- toencoders,” in Proceedings of the 25th international conference on Machine learning . ACM, 2008, pp. 1096–1103

work page 2008

[21] [21]

Deep autoencoder neural networks for gene ontology annotation predictions,

D. Chicco, P. Sadowski, and P. Baldi, “Deep autoencoder neural networks for gene ontology annotation predictions,” in Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics . ACM, 2014, pp. 533–540

work page 2014

[22] [22]

An autoencoder neural-network based low-dimensionality approach to excitation modeling for hmm-based text-to-speech,

S. Vishnubhotla, R. Fernandez, and B. Ramabhadran, “An autoencoder neural-network based low-dimensionality approach to excitation modeling for hmm-based text-to-speech,” in 2010 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2010, pp. 4614–4617

work page 2010

[23] [23]

Autoencoder networks for wa- ter demand predictive modelling,

S. I. Msiza and T. Marwala, “Autoencoder networks for wa- ter demand predictive modelling,” in International Conference on Simulation and Modeling Methodologies, Technologies and Applications, 2016, pp. 231–238

work page 2016

[24] [24]

Lecture 6a overview of mini-batch gradient descent,

H. Geoffrey, S. Nitish, and S. Kevin, “Lecture 6a overview of mini-batch gradient descent,” http://www. cs.toronto.edu/ ti- jmen/csc321/slides/lecture slides lec6.pdf, 2016

work page 2016

[25] [25]

Adam: A method for stochastic opti- mization,

D. Kingma and J. Ba, “Adam: A method for stochastic opti- mization,” Computer Science, 2015

work page 2015

[26] [26]

Lecture 6e rmsprop: Divide the gradient by a running average of its recent magnitude,

H. Geoffrey, S. Nitish, and S. Kevin, “Lecture 6e rmsprop: Divide the gradient by a running average of its recent magnitude,” http://www.cs.toronto.edu/ tij- men/csc321/slides/lecture slides lec6.pdf, 2016

work page 2016

[27] [27]

Kl-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition,

D. Yu, K. Yao, H. Su, G. Li, and F. Seide, “Kl-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition,” in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013, pp. 7893–7897

work page 2013

[28] [28]

Tensorﬂow: Large-scale machine learning on heterogeneous systems, 2015,

M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin et al., “Tensorﬂow: Large-scale machine learning on heterogeneous systems, 2015,” Software available from tensorﬂow. org , vol. 1, 2015

work page 2015

[29] [29]

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

——, “Tensorﬂow: Large-scale machine learning on heteroge- neous distributed systems,” arXiv preprint arXiv:1603.04467 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[30] [30]

Asrła real-time speech recogni- tion on portable devices,

A. S. Sharma and R. Bhalley, “Asrła real-time speech recogni- tion on portable devices,” in Advances in Computing, Commu- nication, & Automation (ICACCA)(Fall), International Confer- ence on. IEEE, 2016, pp. 1–4

work page 2016

[31] [31]

Adversarial examples in the physical world

A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial exam- ples in the physical world,” arXiv preprint arXiv:1607.02533 , 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[32] [32]

Tensorﬂow: Biologys gate- way to deep learning?

L. Rampasek and A. Goldenberg, “Tensorﬂow: Biologys gate- way to deep learning?” Cell systems, vol. 2, no. 1, pp. 12–14, 2016

work page 2016

[33] [33]

Deep or shallow, nlp is breaking out,

G. Goth, “Deep or shallow, nlp is breaking out,” Communica- tions of the ACM , vol. 59, no. 3, pp. 13–16, 2016

work page 2016

[34] [34]

WikiReading: A Novel Large-scale Language Understanding Task over Wikipedia

D. Hewlett, A. Lacoste, L. Jones, I. Polosukhin, A. Fandrianto, J. Han, M. Kelcey, and D. Berthelot, “Wikireading: A novel large-scale language understanding task over wikipedia,” arXiv preprint arXiv:1608.03542, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[35] [35]

K. P. Murphy, Machine learning: a probabilistic perspective . MIT press, 2012

work page 2012