Citizens' Emotion on GST: A Spatio-Temporal Analysis over Twitter Data

Ankit Rai; Deepak Uniyal

arxiv: 1906.08693 · v1 · pith:CSNEH6PYnew · submitted 2019-06-20 · 💻 cs.IR · cs.SI

Pith reviewed 2026-05-25 19:10 UTC · model grok-4.3

classification 💻 cs.IR cs.SI

keywords GST · Twitter · emotion analysis · sentiment analysis · spatio-temporal analysis · NRC lexicon · public policy · India

The pith

More than 142,000 tweets, labeled with the NRC emotion lexicon, are used to map emotional responses to the GST rollout over time and space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper collects tweets about the Goods and Services Tax posted in India during July 2017 and applies the NRC emotion lexicon to label them for eight basic emotions plus positive and negative sentiment. It runs a temporal analysis across 142,508 tweets and a spatial analysis across 58,613 tweets, both gathered via the Twitter streaming API. The goal is to trace how public emotions shifted in the weeks after GST implementation. This produces a record of citizen reaction that can be examined for patterns by date and by location.

Core claim

We have performed temporal analysis and spatial analysis on 1,42,508 and 58,613 tweets respectively using the National Research Council Canada (NRC) emotion Lexicon for eight basic emotions and two sentiments on tweets posted during the post-GST implementation period from July 04, 2017 to July 25, 2017.

What carries the argument

NRC emotion Lexicon applied to tweets collected via Twitter streaming API to assign scores for joy, trust, anticipation, surprise, fear, sadness, anger, disgust, positive, and negative.
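A lexicon lookup of this kind can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `TOY_LEXICON` entries, the `tokenize` helper, and the count-per-category scoring rule are all assumptions made for the example, and the entries are invented rather than taken from the NRC lexicon.

```python
import re

# The eight emotions and two sentiments named in the paper.
CATEGORIES = ("joy", "trust", "anticipation", "surprise", "fear",
              "sadness", "anger", "disgust", "positive", "negative")

# Hypothetical stand-in for the lexicon: word -> associated categories.
# These entries are illustrative only and are NOT real NRC entries.
TOY_LEXICON = {
    "happy": {"joy", "positive"},
    "confusion": {"fear", "negative"},
    "angry": {"anger", "negative"},
    "hope": {"anticipation", "trust", "positive"},
}


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def score_tweet(text, lexicon=TOY_LEXICON):
    """Count, per category, how many tokens of the tweet the lexicon links to it."""
    scores = dict.fromkeys(CATEGORIES, 0)
    for token in tokenize(text):
        for category in lexicon.get(token, ()):
            scores[category] += 1
    return scores


scores = score_tweet("Angry about GST confusion, but I hope it helps #GST")
```

Words absent from the lexicon contribute nothing, so a tweet written mostly in out-of-vocabulary terms scores zero on every category.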

If this is right

Policy makers obtain a dated record of emotional reaction that can be checked against specific GST rule changes.
Regional differences in emotion scores become visible when tweets are grouped by location.
The same lexicon pipeline can be rerun on later periods to measure whether emotions stabilized after the initial rollout.
Negative emotions such as anger or disgust can be tracked as early indicators of public resistance to the tax.
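The dated record described in the points above could be built by averaging per-tweet scores within each calendar day. The sketch below assumes each record is an ISO-8601 timestamp paired with a dictionary of category counts; the function name `daily_mean_scores` and the record layout are hypothetical, not taken from the paper.

```python
from collections import defaultdict
from datetime import datetime


def daily_mean_scores(records):
    """Average category scores per calendar day.

    records: iterable of (iso_timestamp, {category: count}) pairs (assumed layout).
    Returns {"YYYY-MM-DD": {category: mean_count}} ordered by date.
    """
    totals = defaultdict(lambda: defaultdict(float))
    tweets_per_day = defaultdict(int)
    for timestamp, scores in records:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        tweets_per_day[day] += 1
        for category, value in scores.items():
            totals[day][category] += value
    return {
        day: {cat: total / tweets_per_day[day] for cat, total in cats.items()}
        for day, cats in sorted(totals.items())
    }


records = [
    ("2017-07-04T10:00:00", {"anger": 1, "joy": 0}),
    ("2017-07-04T18:30:00", {"anger": 3, "joy": 1}),
    ("2017-07-05T09:00:00", {"anger": 0, "joy": 2}),
]
daily = daily_mean_scores(records)
```

Averaging per tweet rather than summing keeps days with heavy tweet volume from dominating the curve; whether the paper normalizes this way is not stated in the abstract.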

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be applied to other policy events mentioned in the abstract, such as demonetization, to compare emotional signatures.
Location metadata in the spatial subset allows testing whether urban versus rural areas showed different emotion distributions.
If lexicon accuracy proves low on informal text, replacing it with a domain-specific emotion dictionary would be a direct next step.
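A group comparison such as urban versus rural could start from each group's share of emotion counts. The sketch below assumes every tweet already carries a group label and a dictionary of category counts; the labels, the function name `emotion_shares_by_group`, and the share-of-total normalization are assumptions for illustration only.

```python
from collections import defaultdict


def emotion_shares_by_group(records):
    """Return each category's share of all emotion counts within a group.

    records: iterable of (group_label, {category: count}) pairs (assumed layout).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for group, scores in records:
        for category, value in scores.items():
            totals[group][category] += value
    shares = {}
    for group, cats in totals.items():
        denominator = sum(cats.values())
        shares[group] = {
            cat: (value / denominator if denominator else 0.0)
            for cat, value in cats.items()
        }
    return shares


records = [
    ("urban", {"anger": 2, "joy": 2}),
    ("urban", {"anger": 1, "joy": 3}),
    ("rural", {"anger": 3, "joy": 1}),
]
shares = emotion_shares_by_group(records)
```

Shares make groups with very different tweet volumes comparable, though they say nothing about whether a difference is statistically meaningful.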

Load-bearing premise

The NRC lexicon, developed on general text, correctly identifies the emotions expressed in short, informal tweets about a specific Indian tax policy, and the collected tweets represent the broader public's views.

What would settle it

A random sample of several hundred GST tweets manually labeled for the same eight emotions and two sentiments shows low agreement with the NRC lexicon outputs.
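One way to quantify that agreement is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below simplifies the setting by assuming a single dominant label per tweet from both the annotator and the lexicon; the sample labels are invented for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


human = ["anger", "anger", "joy", "joy"]      # invented manual labels
lexicon = ["anger", "joy", "joy", "joy"]      # invented lexicon outputs
kappa = cohen_kappa(human, lexicon)
```

A kappa near zero would indicate agreement no better than chance, which is the outcome that would undermine the lexicon-based curves and maps.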

Figures

Figures reproduced from arXiv: 1906.08693 by Ankit Rai, Deepak Uniyal.

**Figure 2.** Showing the varying Sentiments, Emotions and Hourly Frequency of Tweets over

**Figure 3.** Showing the Word Cloud for Top 40 Hashtags and Top 40 Mentions

**Figure 4.** Showing the Variation of Sentiments and Emotions On Tweets Addressed To Mr.

Original abstract

People might not be close-at-hand but they still are - by virtue of the social network. The social network has transformed lives in many ways. People can express their views, opinions and life experiences on various platforms be it Twitter, Facebook or any other medium there is. Such events constitute of reviewing a product or service, conveying views on political banters, predicting share prices or giving feedback on the government policies like Demonetization or GST. These social platforms can be used to investigate the insights of the emotional curve that the general public is generating. This kind of analysis can help make a product better, predict the future prospects and also to implement the public policies in a better way. Such kind of research on sentiment analysis is increasing rapidly. In this research paper, we have performed temporal analysis and spatial analysis on 1,42,508 and 58,613 tweets respectively and these tweets were posted during the post-GST implementation period from July 04, 2017 to July 25, 2017. The tweets were collected using the Twitter streaming API. A well-known lexicon, National Research Council Canada (NRC) emotion Lexicon is used for opinion mining that exhibits a blend of eight basic emotions i.e. joy, trust, anticipation, surprise, fear sadness, anger, disgust and two sentiments i.e. positive and negative for 6,554 words.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

Routine NRC lexicon run on GST tweets with no validation, results, or domain checks shown.

The paper gathers tweets about India's GST rollout and tags them with the NRC emotion lexicon for temporal and spatial views. They pulled 142,508 tweets over three weeks for the time series and 58,613 for the maps, all via the streaming API, then assigned the eight basic emotions plus positive/negative labels using the existing 6,554-word lexicon. That is the entire contribution. The collection itself is a modest, concrete step that someone else could replicate with the same API calls. Beyond that, nothing new is introduced in method or insight.

The abstract states the analyses were done but supplies no output tables, no emotion curves, no maps, no accuracy numbers, and no comparison to any ground truth or alternative lexicon. The NRC resource was built on general English text; the paper gives no evidence it was adapted for short tweets, Hinglish, sarcasm, or policy-specific language. Without a held-out human-labeled sample or error analysis, the reported emotion distributions rest on an untested assumption. Twitter sampling bias is also unaddressed.

The work is therefore a descriptive exercise rather than a supported finding. It might be of passing interest to someone running a quick case study on Indian policy sentiment, but it does not advance sentiment-analysis techniques or provide reliable evidence on public reaction to GST. I would not bring it to a reading group, would not cite it, and would not send it to referees in its current form.

Referee Report

2 major / 2 minor

Summary. The manuscript collects tweets posted between 4 and 25 July 2017 using the Twitter streaming API, 142,508 for the temporal analysis and 58,613 for the spatial analysis, and applies the NRC emotion lexicon to extract eight basic emotions (joy, trust, anticipation, surprise, fear, sadness, anger, disgust) plus positive/negative sentiment for a spatio-temporal analysis of public reaction to GST implementation.

Significance. If the lexicon outputs were shown to be reliable on this corpus, the work would supply a concrete, large-scale example of lexicon-based emotion tracking on policy-related social media, potentially useful for monitoring public response to fiscal reforms. The scale of the tweet collection is a modest strength, but the absence of any reported results, validation, or error analysis means the manuscript currently contributes only a methods sketch rather than a supported empirical finding.

major comments (2)

[Abstract] The text states that temporal and spatial analyses were performed on the cited tweet volumes, yet it supplies no quantitative results, no emotion time-series, no spatial maps, no summary statistics, and no comparison to any baseline or ground truth. The central claim therefore reduces to a description of data collection and lexicon choice rather than a demonstrated outcome.
[Abstract / Methods (lexicon application)] The NRC lexicon was constructed on general English text; the manuscript provides no domain adaptation, no held-out accuracy evaluation against human labels on GST tweets, no handling of tweet-specific artifacts (hashtags, abbreviations, Hinglish, sarcasm), and no error analysis. Because the temporal curves and spatial maps rest entirely on these unvalidated labels, any systematic mismatch between lexicon and domain would render the reported patterns indistinguishable from noise.
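Handling of tweet-specific artifacts before the lexicon lookup might look like the sketch below. It is an assumed preprocessing step, not something the manuscript describes: the `ABBREVIATIONS` map is hypothetical, and the rules (drop URLs and mentions, split camel-case hashtags, expand abbreviations) are one plausible choice among many. It does not address Hinglish or sarcasm.

```python
import re

# Hypothetical abbreviation map; a real one would be curated from the corpus.
ABBREVIATIONS = {"govt": "government", "pls": "please", "u": "you"}


def _split_hashtag(match):
    """Turn '#TaxReform' into 'Tax Reform' by splitting at lower-to-upper boundaries."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", match.group(1))


def normalize_tweet(text, abbreviations=ABBREVIATIONS):
    """Return a list of lowercase tokens with common tweet artifacts removed."""
    text = re.sub(r"https?://\S+", " ", text)      # drop URLs
    text = re.sub(r"@\w+", " ", text)              # drop user mentions
    text = re.sub(r"#(\w+)", _split_hashtag, text)  # unpack hashtags
    tokens = re.findall(r"[a-z]+", text.lower())
    return [abbreviations.get(token, token) for token in tokens]


tokens = normalize_tweet("@someone pls fix #TaxReform https://example.com/x govt!")
```

Splitting hashtags matters for a word-level lexicon because a fused tag such as `#TaxReform` would otherwise never match any entry.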

minor comments (2)

[Abstract] Indian-style digit grouping (1,42,508) appears alongside figures whose grouping is identical in both systems (58,613; 6,554); adopt international notation (142,508) consistently throughout.
[Abstract] The sentence describing the NRC lexicon ends abruptly after '6,554 words' without stating how many of those words actually appear in the collected tweets or how ties/zero-count words are handled.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments accurately identify that the submitted manuscript describes data collection and lexicon application but does not present the analysis results or any validation of the emotion labels. We will revise the manuscript accordingly to strengthen the empirical contribution.

Referee: [Abstract] The text states that temporal and spatial analyses were performed on the cited tweet volumes, yet it supplies no quantitative results, no emotion time-series, no spatial maps, no summary statistics, and no comparison to any baseline or ground truth. The central claim therefore reduces to a description of data collection and lexicon choice rather than a demonstrated outcome.

Authors: We agree with the observation. The abstract and body state that temporal and spatial analyses were performed on the collected tweets, yet the submitted manuscript contains no quantitative results, time-series, maps, statistics, or baseline comparisons. This was an omission during preparation. In the revised manuscript we will add a dedicated results section containing the emotion time-series, spatial distribution maps, summary statistics on emotion frequencies, and any feasible comparisons to baselines or prior work.
revision: yes

Referee: [Abstract / Methods (lexicon application)] The NRC lexicon was constructed on general English text; the manuscript provides no domain adaptation, no held-out accuracy evaluation against human labels on GST tweets, no handling of tweet-specific artifacts (hashtags, abbreviations, Hinglish, sarcasm), and no error analysis. Because the temporal curves and spatial maps rest entirely on these unvalidated labels, any systematic mismatch between lexicon and domain would render the reported patterns indistinguishable from noise.

Authors: The referee correctly identifies a core limitation. The NRC lexicon was applied without domain adaptation, without accuracy evaluation on GST tweets, and without explicit handling of tweet artifacts or error analysis. We will revise the methods and add a new evaluation subsection that reports results from manual annotation of a random sample of tweets (e.g., precision/recall against human labels) and a discussion of limitations arising from Hinglish, sarcasm, and abbreviations. Simple preprocessing steps for common hashtags and abbreviations will also be described.
revision: yes
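The precision/recall evaluation mentioned in the response could be computed per emotion along the following lines. This is a sketch under assumptions: each tweet is represented by the set of categories assigned to it, once by a human annotator and once by the lexicon, and the sample data are invented.

```python
def precision_recall(gold, predicted, category):
    """Precision and recall of one category, treating each tweet as a label set.

    gold, predicted: equal-length sequences of sets of category names.
    Returns (precision, recall); a metric with a zero denominator is reported as 0.0.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(category in g and category in p for g, p in pairs)
    fp = sum(category not in g and category in p for g, p in pairs)
    fn = sum(category in g and category not in p for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


gold = [{"anger"}, {"anger", "fear"}, {"joy"}, set()]    # invented human labels
predicted = [{"anger"}, {"fear"}, {"anger"}, {"anger"}]  # invented lexicon labels
p, r = precision_recall(gold, predicted, "anger")
```

Reporting the two numbers per category, rather than one overall accuracy, shows which emotions the lexicon over-assigns and which it misses.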

Circularity Check

0 steps flagged

No circularity: purely descriptive application of external lexicon

The paper performs temporal and spatial analysis by applying the pre-existing NRC emotion lexicon (an external resource developed independently) to a collected set of tweets. No parameters are fitted, no predictions are generated from the data itself, no self-citations form the load-bearing justification, and no derivations reduce to the inputs by construction. The work is an application of an off-the-shelf tool to new data, with all core steps (lexicon lookup, aggregation over time/space) independent of the target results.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The analysis rests on the untested assumption that the NRC lexicon transfers accurately to informal tweets about GST and that the sampled tweets represent public opinion. No free parameters or invented entities are introduced.

axioms (2)

domain assumption: The NRC emotion lexicon accurately captures emotions in short, informal tweets about Indian tax policy.
The lexicon is applied directly; the abstract mentions no domain-specific validation or adaptation.
domain assumption: Tweets collected via the Twitter streaming API during the stated period are representative of citizens' emotions on GST.
The abstract states the collection method and counts but offers no sampling-bias discussion.

pith-pipeline@v0.9.0 · 5779 in / 1310 out tokens · 21397 ms · 2026-05-25T19:10:43.784034+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "A well-known lexicon, National Research Council Canada (NRC) emotion Lexicon is used for opinion mining that exhibits a blend of eight basic emotions i.e. joy, trust, anticipation, surprise, fear sadness, anger, disgust and two sentiments"

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Paper passage: "temporal analysis and spatial analysis on 1,42,508 and 58,613 tweets respectively"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

