Deep neural network-based classification model for Sentiment Analysis

Deming Sheng; Donghang Pan; Jingling Yuan; Lin Li

arXiv: 1907.02046 · v1 · pith:SOOGSPZRnew · submitted 2019-07-03 · cs.CL · cs.IR · cs.LG

Pith reviewed 2026-05-25 09:57 UTC · model grok-4.3

keywords: sentiment analysis, implicit sentiment, deep neural network, Bi-LSTM, attention mechanism, text classification, LSTM, CNN

The pith

Bi-LSTM with word-level attention achieves the highest recall for positive implicit sentiment classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs deep neural network models to classify implicit sentiment, where opinions in text are not stated directly. It tests DNN, LSTM, Bi-LSTM, and CNN architectures on a public dataset, then augments the Bi-LSTM with an attention mechanism that focuses on key words. Results show the attention-enhanced Bi-LSTM reaches the best R value specifically on the positive category, while LSTM-series and CNN models already beat the plain DNN. A reader would care because social networks contain abundant implicit opinions that explicit-text methods miss.

Core claim

Classification models based on DNN, LSTM, Bi-LSTM and CNN were established to judge the tendency of the user's implicit sentiment text. Based on the Bi-LSTM model, the classification model of word-level attention mechanism is studied. The experimental results on the public dataset show that the established LSTM series classification model and CNN classification model can achieve good sentiment classification effect, and the classification effect is significantly better than the DNN model. The Bi-LSTM based attention mechanism classification model obtained the optimal R value in the positive category identification.

What carries the argument

Bi-LSTM model with added word-level attention mechanism that assigns weights to important words for implicit sentiment judgment.
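As a rough illustration of what word-level attention does, the sketch below pools per-word hidden states into one sentence vector. It assumes the common formulation from hierarchical attention networks (Yang et al. 2016, which appears in the paper's reference list); the abstract does not confirm that the paper uses exactly this form. The names `attention_pool`, `W`, `b`, and `u` are hypothetical, and the hidden states are toy numbers rather than real Bi-LSTM outputs.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, W, b, u):
    """Return (weights, sentence_vector) for a list of per-word vectors.

    hidden_states: T vectors, one per word (stand-ins for Bi-LSTM outputs).
    W, b, u: hypothetical learned projection, bias, and context vector.
    """
    scores = []
    for h in hidden_states:
        # Project each hidden state: u_t = tanh(W h_t + b)
        proj = [math.tanh(sum(w * x for w, x in zip(row, h)) + b_i)
                for row, b_i in zip(W, b)]
        # Score the projection against the context vector: u . u_t
        scores.append(sum(p * c for p, c in zip(proj, u)))
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Sentence vector is the attention-weighted sum of hidden states.
    sentence = [sum(a * h[d] for a, h in zip(weights, hidden_states))
                for d in range(dim)]
    return weights, sentence

# Toy input: three words, two-dimensional hidden states (assumed values).
H = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
u = [1.0, 1.0]
weights, sentence = attention_pool(H, W, b, u)
```

In this toy case the second word has the largest activations, so it receives the largest weight and dominates the sentence vector. In a trained model the sentence vector would then feed the fully connected and softmax layers.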

If this is right

LSTM-series and CNN models deliver significantly better classification than the DNN baseline on implicit sentiment.
The word-level attention addition to Bi-LSTM produces the single best R score for positive-category detection.
All tested deep models reach usable sentiment classification performance on the public dataset.
Implicit sentiment classification can be addressed by standard sequence and convolutional architectures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same architecture stack could be tested on implicit sentiment in languages other than the one used in the public dataset.
Real-time social-media dashboards might incorporate the attention-Bi-LSTM to surface understated user opinions.
Pairing the model with user-history features could further lift accuracy on ambiguous cases.

Load-bearing premise

The chosen public dataset contains representative examples of implicit sentiment and the R value is an appropriate unbiased measure of classification quality.
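To see why the choice of metric matters here, the sketch below computes per-class precision, recall, and F1 on toy labels. It assumes that R denotes recall, as the pith headline reads it; the abstract itself does not define R. The function name and the label strings are hypothetical.

```python
def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy data: a degenerate classifier that labels everything positive.
y_true = ["pos", "neg", "neu", "pos", "neg", "neu"]
always_pos = ["pos"] * len(y_true)
precision, recall, f1 = per_class_scores(y_true, always_pos, "pos")
```

The degenerate classifier reaches a positive-class recall of 1.0 while its precision is only one third. A best-recall result on a single class is therefore hard to interpret unless precision or F1 is reported alongside it.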

What would settle it

Re-evaluating the same four model families plus the attention variant on a fresh implicit-sentiment dataset and finding that the Bi-LSTM attention version no longer records the highest R on the positive class.
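One way such a re-evaluation could check whether a recall gap is more than sampling noise is a percentile bootstrap over test examples. The sketch below is only one of several reasonable tests and is not drawn from the paper; `bootstrap_recall_gap`, its defaults, and the toy predictions are all assumptions for illustration.

```python
import random

def positive_recall(y_true, y_pred, positive):
    """Recall on the positive class, or None if no positives are present."""
    total = sum(1 for t in y_true if t == positive)
    if total == 0:
        return None
    hits = sum(1 for t, p in zip(y_true, y_pred)
               if t == positive and p == positive)
    return hits / total

def bootstrap_recall_gap(y_true, pred_a, pred_b, positive,
                         n_resamples=1000, seed=0):
    """95% percentile interval for recall(A) - recall(B) on one class."""
    rng = random.Random(seed)
    n = len(y_true)
    gaps = []
    for _ in range(n_resamples):
        # Resample test examples with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        truth = [y_true[i] for i in idx]
        ra = positive_recall(truth, [pred_a[i] for i in idx], positive)
        rb = positive_recall(truth, [pred_b[i] for i in idx], positive)
        if ra is None:
            continue  # this resample drew no positive examples
        gaps.append(ra - rb)
    if not gaps:
        raise ValueError("no resample contained a positive example")
    gaps.sort()
    lo = gaps[int(0.025 * len(gaps))]
    hi = gaps[min(len(gaps) - 1, int(0.975 * len(gaps)))]
    return lo, hi

# Toy check: a perfect model against one that never predicts positive.
y_true = ["pos"] * 10 + ["neg"] * 10
perfect = list(y_true)
never_pos = ["neg"] * 20
interval = bootstrap_recall_gap(y_true, perfect, never_pos, "pos")
```

An interval that excludes zero suggests the gap is not an artifact of which test examples were drawn. An interval that straddles zero would mean the "highest R" ranking is not settled by that dataset.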

Figures

Figures reproduced from arXiv: 1907.02046 by Deming Sheng, Donghang Pan, Jingling Yuan, Lin Li.

**Figure 1.** Classification model frame. The difference between different classification models is the network structure selected by the deep neural network layer. The underlying word embedding layer and the top-level softmax classification layer use the same structure. The word pretraining technique is used in t…

**Figure 2.** Bi-LSTM structure. C. CNN and DNN: Although CNN and artificial neural networks separate the complex relationships between elements, both can extract and synthesize features through complex network structures. CNN can capture local important information through convolution and pooling operations. Therefore it is also applied to related natural language processing tasks. Multi-layer fully connected neural net…

**Figure 4.** DNN structure. D. Classification model based on sentence attention mechanism: Compared to the LSTM model, the LSTM uses the update gate structure instead of the forget gate and input gate in the LSTM. The reset gate is used to control the degree of ignoring the status information of the previous moment. This design allows the LSTM to maintain the LSTM effect while streamlining the network structure, resulti…

**Figure 5.** Bi-LSTM based attention structure. The appropriate weights are assigned to the input by the method of the below formula, and a fully connected layer is added to the output part to realize the synthesis of the features. The input features are weighted differently by assigning weights, and a fully connected layer is added to the output part to realize the synthesis of the features. The core formula is shown …

Original abstract

The growing prosperity of social networks has brought great challenges to the sentimental tendency mining of users. As more and more researchers pay attention to the sentimental tendency of online users, rich research results have been obtained based on the sentiment classification of explicit texts. However, research on the implicit sentiment of users is still in its infancy. Aiming at the difficulty of implicit sentiment classification, a research on implicit sentiment classification model based on deep neural network is carried out. Classification models based on DNN, LSTM, Bi-LSTM and CNN were established to judge the tendency of the user's implicit sentiment text. Based on the Bi-LSTM model, the classification model of word-level attention mechanism is studied. The experimental results on the public dataset show that the established LSTM series classification model and CNN classification model can achieve good sentiment classification effect, and the classification effect is significantly better than the DNN model. The Bi-LSTM based attention mechanism classification model obtained the optimal R value in the positive category identification.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

Routine 2019-era neural models applied to implicit sentiment with no new methods and an undefined R metric in the claims.

This paper takes standard models like DNN, LSTM, Bi-LSTM, CNN, and Bi-LSTM with word-level attention and applies them to implicit sentiment classification on a public dataset. The headline result is that the attention version does best on positive cases via an R value. Nothing in the work introduces new architectures or derivations; it is a direct comparison of models already common by 2019.

The setup does cover a sensible range of baselines for the task and correctly notes that implicit sentiment was still underdeveloped compared to explicit cases at the time. That is the main thing it does well.

The soft spots are clear and central. The abstract supplies no definition or formula for the R value, no dataset size or balance details, no training protocol, and no statistical tests or error bars. The stress-test note is accurate on this point: without knowing what R measures or whether the public dataset has representative implicit examples, the optimality claim cannot be evaluated. The full text might add those elements, but the provided description leaves the evaluation ungrounded.

This is the sort of paper that could serve as a basic reference for someone first exploring neural nets on implicit sentiment. Most readers looking for advances or reproducible benchmarks will find little to use. I would not bring it to a reading group or cite it. It does not show enough rigor in reporting to merit peer review time.

Referee Report

2 major / 1 minor

Summary. The manuscript develops DNN, LSTM, Bi-LSTM, CNN, and Bi-LSTM-with-word-level-attention models for implicit sentiment classification and reports that LSTM-series and CNN models outperform DNN on a public dataset, with the Bi-LSTM attention model achieving the optimal R value for positive-category identification.

Significance. If the experimental protocol, metric definition, and dataset details were supplied and the optimality claim held under standard evaluation, the work would supply a useful empirical comparison of neural architectures for the under-studied task of implicit sentiment detection.

major comments (2)

[Abstract] Abstract: the metric denoted 'R value' is never defined, no numerical scores are reported, and no comparison to precision, recall, F1, or accuracy is supplied, rendering the central optimality claim for the Bi-LSTM attention model impossible to evaluate.
[Abstract] Abstract: the public dataset is referenced but never named or characterized (size, class balance, proportion of implicit cases, labeling procedure), and no training protocol, hyper-parameters, or statistical significance tests are described, so the comparative performance statements cannot be verified.

minor comments (1)

[Abstract] Abstract: the title refers to general sentiment analysis while the body focuses exclusively on implicit sentiment; a more precise title would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting deficiencies in the abstract. We will revise the manuscript to supply the missing definitions, numerical results, dataset characterization, and experimental details so that all claims become verifiable.

Referee: [Abstract] Abstract: the metric denoted 'R value' is never defined, no numerical scores are reported, and no comparison to precision, recall, F1, or accuracy is supplied, rendering the central optimality claim for the Bi-LSTM attention model impossible to evaluate.

Authors: We agree that the abstract omits the definition of the R value and does not report numerical scores or comparisons with standard metrics. In the revised manuscript we will (i) explicitly define the R value, (ii) tabulate the concrete numerical scores obtained by each model on the positive class, and (iii) provide the corresponding precision, recall, F1, and accuracy figures to substantiate the optimality claim.

revision: yes

Referee: [Abstract] Abstract: the public dataset is referenced but never named or characterized (size, class balance, proportion of implicit cases, labeling procedure), and no training protocol, hyper-parameters, or statistical significance tests are described, so the comparative performance statements cannot be verified.

Authors: We concur that these experimental details are absent from the abstract. The revised version will name the public dataset, report its size, class balance and proportion of implicit instances, describe the labeling procedure, specify the training protocol and hyper-parameters, and include statistical significance tests for the reported performance differences.

revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical model comparisons with no derivations or self-referential fits

The paper reports experimental results from training DNN, LSTM, Bi-LSTM, CNN and attention-augmented Bi-LSTM models on a public dataset for implicit sentiment classification. No equations, first-principles derivations, parameter-fitting steps presented as predictions, or uniqueness theorems appear in the provided text. Claims reduce to direct performance measurements (e.g., optimal R value) rather than any reduction of outputs to inputs by construction. Self-citation load-bearing, ansatz smuggling, or renaming of known results are absent. The work is self-contained as standard empirical ML evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no equations, parameters or modeling assumptions; ledger is empty by default.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 1 internal anchor

[1] Y. Wang, X. Lin, L. Wu, et al. Robust subspace clustering for multi-view data by exploiting correlation consensus. IEEE Transactions on Image Processing, 24(11):3939-3949, 2015

[2] Y. Wang, L. Wu, X. Lin, J. Gao. Multiview Spectral Clustering via Structured Low-Rank Matrix Factorization. IEEE Transactions on Neural Networks and Learning Systems 29 (10), 4833-4843, 2018

[3] Y. Wang, W. Zhang, L. Wu, et al. Iterative Views Agreement: An Iterative Low-Rank based Structured Optimization Method to Multi-View Spectral Clustering. IJCAI 2016

[4] Y. Wang, X. Lin, L. Wu, W. Zhang. Effective Multi-Query Expansions: Collaborative Deep Networks for Robust Landmark Retrieval. IEEE Transactions on Image Processing 26 (3), 1393-1404, 2017

[5] L. Wu, Y. Wang, X. Li, J. Gao. Deep Attention-based Spatially Recursive Networks for Fine-Grained Visual Recognition. IEEE Transactions on Cybernetics 49 (5), 1791-1802, 2019

[6] L. Wu, Y. Wang, X. Li, J. Gao. What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification. Pattern Recognition 76, 727-738, 2018

[7] L. Wu, Y. Wang, L. Shao. Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval. IEEE Transactions on Image Processing 28 (4), 1602-1612, 2019

[8] L. Wu, Y. Wang, J. Gao, X. Li. Where-and-When to Look: Deep Siamese Attention Networks for Video-based Person Re-identification. IEEE Transactions on Multimedia 21 (6), 1412-1424, 2019

[9] L. Wu, Y. Wang, J. Gao, X. Li. Deep Adaptive Feature Embedding with Local Sample Distributions for Person Re-identification. Pattern Recognition 73, 275-288, 2018

[10] Huang FL, Yu G, Zhang JL, Li CX, Yuan CA, Lu JL. Mining topic sentiment in micro-blogging based on microblogger social relation. Journal of Software, 2017, 28(3):694-707

[11] Huang FL, Feng S, Wang DL, Yu G. Mining topics sentiment in Microblogging based multi-feature fusion[J]. Chinese Journal of Computers, 2017, 40(4): 872-888

[12] Yuanyuan Qiu, Hongzheng Li, Shen Li, Yingdi Jiang, Renfen Hu, Lijiao Yang. Revisiting Correlations between Intrinsic and Extrinsic Evaluations of Word Embeddings. Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. Springer, Cham, 2018. 209-221

[13] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[J]. Computer Science(CS), 2013

[14] Le Q, Mikolov T. Distributed representations of sentences and documents[C]//International conference on machine learning. 2014: 1188-1196

[15] Zhang L, Qian GQ, Fan WG, Hua K, Zhang L. Sentiment analysis based on light reviews[J]. Journal of Software, 2014, 25(12):2790-2807

[16] Li S, Zhao Z, Hu R, et al. Analogical Reasoning on Chinese Morphological and Semantic Relations[J]. 2018

[17] Yu H, Hatzivassiloglou V. Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences[C]//Proceedings of the 2003 conference on Empirical methods in natural language processing. Association for Computational Linguistics(ACL), 2003: 129-136

[18] Wawre S V, Deshmukh S N. Sentiment classification using machine learning techniques[J]. International Journal of Science and Research (IJSR), 2016, 5(4): 819-821

[19] Griffiths T L, Steyvers M, Tenenbaum J B. Topics in semantic representation[J]. Psychological review(PR), 2007, 114(2): 211

[20] Liang B, Liu Q, Xu J, et al. Aspect-based sentiment analysis based on multi-attention CNN[J]. Journal of computer research and development, 2017, 54(8):1724-1735

[21] Chen K, Liang B, Ke WD, et al. Chinese micro-blog sentiment analysis based on multi-channels convolutional neural networks[J]. Journal of computer research and development, 2018, 55(5):945-957

[22] He YX, Sun ST, Niu FF, Li F. A deep-learning model enhanced with emotion semantics for microblog sentiment analysis[J]. Chinese journal of computers, 2017(4)

[23] Yu J, Jiang J. Learning sentence embeddings with auxiliary tasks for cross-domain sentiment classification[C]//Proceedings of the 2016 conference on empirical methods in natural language processing(NLP). 2016: 236-246

[24] Yang Z, Yang D, Dyer C, et al. Hierarchical attention networks for document classification[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016: 1480-1489

[25] Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958

[26] Kim Y. Convolutional neural networks for sentence classification[J]. arXiv preprint arXiv:1408.5882, 2014
