arXiv: 2402.01720 · v4 · submitted 2024-01-26 · 💻 cs.CY · cs.AI · cs.CL · cs.LG

Deep Learning Based Amharic Chatbot for FAQs in Universities

Pith reviewed 2026-05-24 04:42 UTC · model grok-4.3

classification 💻 cs.CY · cs.AI · cs.CL · cs.LG
keywords Amharic chatbot · deep learning · FAQ answering system · Amharic natural language processing · university student support · text classification · low-resource language chatbot

The pith

A deep neural network classifies Amharic university FAQ sentences at 91.55 percent accuracy after standard text preprocessing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper builds a chatbot that answers common university questions written in Amharic by turning input sentences into tokens, normalizing them, removing stop words, and stemming before classification. Three algorithms were tested, and the deep neural network trained with TensorFlow and Keras produced the highest accuracy along with low validation loss. The finished system runs on Facebook Messenger through a Heroku server so students can get answers at any hour. Readers would care because the work shows a concrete way to reduce repetitive administrative load for both students and staff in a language that presents script and morphology difficulties.
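
The four preprocessing steps named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the homophone table, the three-word stop list, the single plural suffix, and every function name are assumptions chosen for the example, since the paper's actual rules are not given here.

```python
import re

# Assumption: homophone Fidel series are folded onto one canonical series.
# Each pair is (variant base character, canonical base character).
HOMOPHONE_SERIES = [("ሐ", "ሀ"), ("ኀ", "ሀ"), ("ሠ", "ሰ"), ("ዐ", "አ"), ("ፀ", "ጸ")]

# Translation table covering the seven vowel orders of each series.
NORMALIZATION_TABLE = {
    ord(variant) + order: ord(canonical) + order
    for variant, canonical in HOMOPHONE_SERIES
    for order in range(7)
}

# Hypothetical, deliberately tiny resources; a real system needs far larger lists.
STOP_WORDS = {"ነው", "እና", "ላይ"}
SUFFIXES = ("ዎች",)  # plural marker only

# Split on whitespace, Ethiopic punctuation, and ASCII ? and !
TOKEN_BOUNDARY = re.compile(r"[\s።፣፤፥፡?!]+")


def normalize(text):
    """Replace variant Fidel characters with their canonical homophones."""
    return text.translate(NORMALIZATION_TABLE)


def tokenize(text):
    """Split text into tokens, dropping empty strings."""
    return [token for token in TOKEN_BOUNDARY.split(text) if token]


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop list."""
    return [token for token in tokens if token not in STOP_WORDS]


def stem(token):
    """Strip one known suffix, keeping at least two characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 1:
            return token[: -len(suffix)]
    return token


def preprocess(sentence):
    """Normalize, tokenize, remove stop words, then stem."""
    tokens = remove_stop_words(tokenize(normalize(sentence)))
    return [stem(token) for token in tokens]
```

Normalizing before tokenizing means the stop list and suffix list only need canonical spellings. Suffix stripping on a syllabic script is crude, so a production stemmer would likely need morphological rules well beyond this sketch.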

Core claim

The authors show that a deep neural network using Adam optimization and SoftMax activation, applied after tokenization, normalization, stop-word removal, and stemming, classifies Amharic FAQ inputs with 91.55 percent accuracy and 0.3548 validation loss, outperforming support vector machines and multinomial naive Bayes while handling Fidel script variation and morphological complexity sufficiently for deployment.

What carries the argument

Deep neural network classifier with Adam optimizer and SoftMax activation, trained on preprocessed Amharic FAQ tokens to map inputs to fixed responses.
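
For readers unfamiliar with the two named components, the sketch below shows a softmax function and a single-parameter Adam update in plain Python. It is illustrative only: the paper trains with TensorFlow and Keras, and the helper names, the default hyperparameters, and the toy quadratic objective are assumptions made for this example.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adam_step(param, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter and return the new value."""
    state["t"] += 1
    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


def minimize_quadratic(steps=500, lr=0.1):
    """Toy objective (x - 3)^2, used only to show the update loop."""
    x = 0.0
    state = {"t": 0, "m": 0.0, "v": 0.0}
    for _ in range(steps):
        grad = 2.0 * (x - 3.0)
        x = adam_step(x, grad, state, lr=lr)
    return x
```

In the classifier described here, softmax would sit on the output layer so that each FAQ class receives a probability, and Adam would update every network weight in the same way the scalar is updated above.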

If this is right

  • The chatbot can run continuously on a public messaging platform without human staff for routine queries.
  • Amharic-specific text issues can be managed well enough for practical FAQ use with standard NLP steps.
  • Adding resources such as Amharic WordNet would allow the same architecture to address questions beyond the current FAQ set.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same preprocessing-plus-deep-network pattern could be tried on other languages that use non-Latin scripts and rich morphology.
  • Live deployment data could be used to retrain the model periodically and close gaps that appear only in actual use.
  • Extending the input scope from single-sentence FAQs to short multi-turn dialogues would test whether the current accuracy holds.

Load-bearing premise

The collected FAQ dataset plus the chosen preprocessing steps are taken to represent typical student questions and to handle Amharic script and word-form variations without introducing bias that inflates the reported accuracy.

What would settle it

Accuracy falling below 80 percent on a fresh collection of real student questions gathered independently from the original training set would show the model does not generalize.
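
A check of this kind could be scripted roughly as follows. The sketch assumes predictions and gold labels are available as parallel lists. The 95 percent Wilson score interval is an addition of this example, not part of the paper or the criterion above, included so that a small fresh sample is not over-read. All function names are hypothetical.

```python
import math


def accuracy(predicted, expected):
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def wilson_interval(acc, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    denom = 1 + z * z / n
    center = (acc + z * z / (2 * n)) / denom
    half = z * math.sqrt(acc * (1 - acc) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def verdict(predicted, expected, threshold=0.80):
    """Compare fresh-set accuracy with the threshold, allowing for sample size."""
    acc = accuracy(predicted, expected)
    low, high = wilson_interval(acc, len(expected))
    if high < threshold:
        return "fails to generalize"
    if low >= threshold:
        return "holds up"
    return "inconclusive"
```

Returning "inconclusive" when the interval straddles the threshold is a design choice of this sketch: it signals that more fresh questions are needed before drawing a conclusion either way.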

Figures

Figures reproduced from arXiv: 2402.01720 by Goitom Ybrah Hailu, Hadush Hailu, Shishay Welay.

Figure 1: Conceptual framework of FAQ chatbot. The user part defines the end-users or students who need answers and information about their studying and other related questions in the university. The user interface (UI) part describes how a user communicates and interacts with the chatbot. It is a series of elements of natural languages that allow for interaction between user-chatbot models. This means users can comm…
Original abstract

University students often spend a considerable amount of time seeking answers to common questions from administrators or teachers. This can become tedious for both parties, leading to a need for a solution. In response, this paper proposes a chatbot model that utilizes natural language processing and deep learning techniques to answer frequently asked questions (FAQs) in the Amharic language. Chatbots are computer programs that simulate human conversation through the use of artificial intelligence (AI), acting as a virtual assistant to handle questions and other tasks. The proposed chatbot program employs tokenization, normalization, stop word removal, and stemming to analyze and categorize Amharic input sentences. Three machine learning model algorithms were used to classify tokens and retrieve appropriate responses: Support Vector Machine (SVM), Multinomial Na\"ive Bayes, and deep neural networks implemented through TensorFlow, Keras, and NLTK. The deep learning model achieved the best results with 91.55% accuracy and a validation loss of 0.3548 using an Adam optimizer and SoftMax activation function. The chatbot model was integrated with Facebook Messenger and deployed on a Heroku server for 24-hour accessibility. The experimental results demonstrate that the chatbot framework achieved its objectives and effectively addressed challenges such as Amharic Fidel variation, morphological variation, and lexical gaps. Future research could explore the integration of Amharic WordNet to narrow the lexical gap and support more complex questions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes an Amharic-language FAQ chatbot for university students that applies tokenization, normalization, stop-word removal, and stemming, then classifies inputs with SVM, Multinomial Naive Bayes, or a TensorFlow/Keras DNN; the DNN is reported to reach 91.55% accuracy (validation loss 0.3548) with Adam and softmax, and the system is deployed on Facebook Messenger via Heroku. The authors claim the pipeline successfully addresses Amharic Fidel variation, morphology, and lexical gaps.

Significance. If the experimental claims are substantiated, the work would supply a practical, always-available FAQ service for Amharic-speaking students and would constitute one of the few documented end-to-end deployments of a DNN classifier for this language in an educational setting. The explicit comparison of three model families and the public deployment on a widely used messaging platform are concrete strengths.

major comments (3)
  1. [Abstract] Abstract: the headline claim that the DNN 'achieved the best results with 91.55% accuracy' is presented without any information on dataset cardinality, number of FAQ classes, train/validation/test split sizes or ratios, or number of held-out examples. Without these quantities the reported scalar cannot be interpreted as evidence of generalization over morphological or Fidel variation.
  2. [Abstract] Abstract / Results: no multiple-run statistics, no per-class or error analysis, and no description of the feature representations or hyper-parameter settings used for the SVM and Naive Bayes baselines are supplied. Consequently the assertion that the DNN is superior cannot be evaluated and the central performance claim remains unverifiable.
  3. [Methodology] Methodology: the preprocessing steps are listed at a high level, yet no concrete implementation details (e.g., how Fidel-script normalization was performed, size of the stop-word list, or stemming rules) or ablation results showing their contribution to the final accuracy are given. This information is load-bearing for the claim that the system overcomes Amharic-specific linguistic challenges.
minor comments (2)
  1. [Abstract] Abstract: 'Naïve' appears as 'Na\"ive', with an unrendered LaTeX escape; it should read 'Naïve Bayes' (or 'Naive Bayes').
  2. [Abstract] The final sentence asserts that 'the experimental results demonstrate that the chatbot framework achieved its objectives' without additional quantitative support beyond the single accuracy figure.
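
The dataset quantities requested in major comment 1 are cheap to produce once the split is scripted. The sketch below is one possible way to do it, assuming the data is a list of (question, label) pairs. The 80/10/10 ratios, the fixed seed, and the function names are assumptions of this example, not details reported by the paper.

```python
import random
from collections import Counter, defaultdict


def stratified_split(examples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split (text, label) pairs per class into train/validation/test lists."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    splits = {"train": [], "validation": [], "test": []}
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = int(len(items) * ratios[0])
        n_val = int(len(items) * ratios[1])
        splits["train"] += items[:n_train]
        splits["validation"] += items[n_train:n_train + n_val]
        splits["test"] += items[n_train + n_val:]  # remainder goes to test
    return splits


def split_report(splits):
    """Summarize each split: total size and number of examples per class."""
    return {
        name: {
            "size": len(rows),
            "per_class": dict(Counter(label for _, label in rows)),
        }
        for name, rows in splits.items()
    }
```

Publishing the output of `split_report` alongside the accuracy figure would let a reader judge how many held-out examples per class stand behind the reported number.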

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important gaps in reporting that affect the verifiability of our results. We agree that additional details are needed and will revise the manuscript accordingly to address each point.

  1. Referee: [Abstract] Abstract: the headline claim that the DNN 'achieved the best results with 91.55% accuracy' is presented without any information on dataset cardinality, number of FAQ classes, train/validation/test split sizes or ratios, or number of held-out examples. Without these quantities the reported scalar cannot be interpreted as evidence of generalization over morphological or Fidel variation.

    Authors: We agree that these dataset details are essential for interpreting the accuracy figure. In the revised manuscript we will expand both the abstract and a new 'Dataset' subsection to report the total number of samples, number of FAQ classes, the exact train/validation/test split sizes and ratios, and the number of held-out examples used for evaluation.
    revision: yes

  2. Referee: [Abstract] Abstract / Results: no multiple-run statistics, no per-class or error analysis, and no description of the feature representations or hyper-parameter settings used for the SVM and Naive Bayes baselines are supplied. Consequently the assertion that the DNN is superior cannot be evaluated and the central performance claim remains unverifiable.

    Authors: We acknowledge that the absence of these elements prevents proper evaluation of the DNN's superiority. The revision will add multiple-run statistics (mean and standard deviation across runs where available), per-class metrics with error analysis, and complete descriptions of feature representations (e.g., TF-IDF vectors) together with the hyper-parameter settings employed for the SVM and Multinomial Naive Bayes baselines.
    revision: yes

  3. Referee: [Methodology] Methodology: the preprocessing steps are listed at a high level, yet no concrete implementation details (e.g., how Fidel-script normalization was performed, size of the stop-word list, or stemming rules) or ablation results showing their contribution to the final accuracy are given. This information is load-bearing for the claim that the system overcomes Amharic-specific linguistic challenges.

    Authors: We agree that concrete implementation details and ablation results are necessary to substantiate the linguistic claims. The revised methodology section will specify the exact Fidel-script normalization procedure, the size and composition of the stop-word list, and the stemming rules applied. We will also include ablation experiments quantifying the accuracy contribution of each preprocessing step.
    revision: yes
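
The multiple-run statistics and ablation experiments promised in these responses could be organized along the following lines. This is a sketch under stated assumptions: `evaluate` stands for any caller-supplied function that trains and scores a model for a given set of enabled preprocessing steps and a random seed, and `placeholder_evaluate` returns synthetic scores that carry no meaning and are not results from the paper.

```python
import statistics

# Assumption: tokenization is always on; the other three steps are toggled.
STEPS = ("normalization", "stop_word_removal", "stemming")


def ablation_table(evaluate, seeds=(0, 1, 2, 3, 4)):
    """Return {configuration: (mean score, sample std dev)} across seeds."""
    configs = {"all steps": set(STEPS)}
    for step in STEPS:
        # Leave-one-out: disable exactly one preprocessing step.
        configs[f"without {step}"] = set(STEPS) - {step}

    table = {}
    for name, enabled in configs.items():
        scores = [evaluate(enabled, seed) for seed in seeds]
        table[name] = (statistics.mean(scores), statistics.stdev(scores))
    return table


def placeholder_evaluate(enabled_steps, seed):
    """Synthetic stand-in for a real train-and-score run (NOT paper data)."""
    return float(len(enabled_steps)) + 0.1 * seed
```

Calling `ablation_table(placeholder_evaluate)` exercises the loop. In practice the placeholder would be replaced by a function that retrains the classifier and returns its held-out accuracy.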

Circularity Check

0 steps flagged

No circularity: empirical accuracy on held-out data


The paper reports an empirical result (91.55% accuracy, 0.3548 validation loss) obtained by training and evaluating standard classifiers (SVM, Naive Bayes, DNN) on a collected Amharic FAQ dataset after standard preprocessing steps. No equations, fitted parameters renamed as predictions, self-citations used as load-bearing uniqueness theorems, or ansatzes smuggled via prior work appear in the provided text. The central claim is a direct performance measurement rather than a derivation that reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on standard supervised classification assumptions and the adequacy of generic NLP preprocessing for Amharic; no new free parameters, axioms, or invented entities are introduced beyond the choice of three off-the-shelf algorithms.

axioms (1)
  • domain assumption: Training and validation data are drawn from the same distribution and the validation accuracy reflects real-world performance.
    Implicit in reporting a single validation accuracy as the primary result without further qualification.
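
One crude way to probe this assumption is to measure how many tokens in live questions never occurred in the training data. The sketch below assumes both sets are already preprocessed into token lists. The function names are hypothetical, and the out-of-vocabulary rate is only a rough proxy for distribution shift, not a test the paper performs.

```python
def vocabulary(token_lists):
    """Collect the set of distinct tokens across all sentences."""
    return {token for tokens in token_lists for token in tokens}


def oov_rate(train_token_lists, live_token_lists):
    """Share of live tokens that were never seen in the training data."""
    vocab = vocabulary(train_token_lists)
    live_tokens = [token for tokens in live_token_lists for token in tokens]
    if not live_tokens:
        raise ValueError("live data contains no tokens")
    unseen = sum(token not in vocab for token in live_tokens)
    return unseen / len(live_tokens)
```

A rate that climbs over time would suggest that student questions are drifting away from the collected FAQ set, at which point validation accuracy says less about real-world behavior.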



