arxiv: 1907.03202 · v1 · pith:D3F55QBVnew · submitted 2019-07-06 · 💻 cs.CL · cs.NE

Evolutionary Algorithm for Sinhala to English Translation

Pith reviewed 2026-05-25 01:19 UTC · model grok-4.3

keywords machine translation · evolutionary algorithm · Sinhala language · low-resource translation · natural language processing · grammar correction

The pith

An evolutionary algorithm searches for the correct English meaning of a Sinhala sentence, and a separate grammar-correction step then cleans up the output; the paper reports accurate translations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Sinhala has limited digital text and complex grammar rules that make standard statistical or neural machine translation methods impractical. This paper applies an evolutionary algorithm to search for the right meaning of a Sinhala sentence, generate an English translation from that meaning, and then correct the grammar of the output. The authors state that the combined process yields accurate results while avoiding the need for large training corpora or hand-crafted linguistic rules.

Core claim

The paper claims that an evolutionary algorithm identifies the correct meaning of Sinhala text, carries out the translation to English, and passes the result to a grammar-correction step, achieving accurate translations.

What carries the argument

The evolutionary algorithm that searches for the correct English meaning of Sinhala input sentences.
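
The paper does not specify its representation, fitness function, or operators, so the following is only a minimal sketch of what an evolutionary search over candidate meanings could look like, not the authors' method. Everything in it is an assumption made for illustration: the placeholder source tokens (`src_1` and so on), the toy `LEXICON` of candidate English glosses, the `PLAUSIBLE_PAIRS` set used as a fitness signal, and the choice of tournament selection, one-point crossover, and per-gene mutation.

```python
import random

# Hypothetical toy lexicon: each placeholder source token maps to candidate
# English glosses. None of this data comes from the paper.
LEXICON = {
    "src_1": ["I", "me"],
    "src_2": ["rice", "cooked", "meal"],
    "src_3": ["eat", "bite", "consume"],
}

# Hypothetical fitness resource: adjacent gloss pairs treated as plausible.
PLAUSIBLE_PAIRS = {("I", "rice"), ("rice", "eat")}


def decode(chromosome, tokens):
    """Map a chromosome (one gloss index per token) to a list of glosses."""
    return [LEXICON[tok][gene] for tok, gene in zip(tokens, chromosome)]


def fitness(chromosome, tokens):
    """Count adjacent gloss pairs that appear in PLAUSIBLE_PAIRS."""
    words = decode(chromosome, tokens)
    return sum((a, b) in PLAUSIBLE_PAIRS for a, b in zip(words, words[1:]))


def random_chromosome(tokens, rng):
    """Pick a random gloss index for every token."""
    return [rng.randrange(len(LEXICON[tok])) for tok in tokens]


def mutate(chromosome, tokens, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return [
        rng.randrange(len(LEXICON[tok])) if rng.random() < rate else gene
        for tok, gene in zip(tokens, chromosome)
    ]


def crossover(a, b, rng):
    """One-point crossover; single-gene chromosomes are copied unchanged."""
    if len(a) < 2:
        return list(a)
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(tokens, pop_size=20, generations=30, seed=0):
    """Return the best gloss sequence seen during the search."""
    rng = random.Random(seed)
    population = [random_chromosome(tokens, rng) for _ in range(pop_size)]
    best = max(population, key=lambda c: fitness(c, tokens))

    def pick():
        # Tournament selection: best of three randomly sampled individuals.
        return max(rng.sample(population, 3), key=lambda c: fitness(c, tokens))

    for _ in range(generations):
        population = [
            mutate(crossover(pick(), pick(), rng), tokens, rng)
            for _ in range(pop_size)
        ]
        candidate = max(population, key=lambda c: fitness(c, tokens))
        if fitness(candidate, tokens) > fitness(best, tokens):
            best = candidate
    return decode(best, tokens)
```

In this sketch the glosses stay in source-token order, so any reordering into English word order would be left to a later step such as the grammar correction the paper mentions. Note also that even this toy fitness function depends on an external resource (`PLAUSIBLE_PAIRS`), which is exactly the kind of dependency the paper would need to account for.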

If this is right

  • Translation succeeds without requiring large amounts of parallel training data.
  • Complex Sinhala grammar is navigated through evolutionary search rather than explicit statistical rules.
  • A separate grammar-correction step is applied after the evolutionary translation.
  • Accurate English output is reported for the Sinhala-to-English task.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same evolutionary search could be applied to other low-resource languages that have sparse digital text.
  • Performance might improve if the evolutionary fitness function were augmented with modern embedding-based similarity measures.
  • The approach could be tested on longer or more syntactically varied Sinhala sentences to check scalability.
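
To make the embedding-based suggestion above concrete, here is a rough sketch of a cosine-similarity score that could serve as one term in a fitness function. It is not from the paper. The three-dimensional `TOY_VECTORS` are invented for illustration (real vectors would come from a trained embedding model), and the choice of what a candidate is compared against (`target_words` here) is itself an open assumption.

```python
import math

# Hypothetical toy vectors; real embeddings would come from a trained model.
TOY_VECTORS = {
    "eat": [1.0, 0.0, 0.0],
    "consume": [0.9, 0.1, 0.0],
    "rice": [0.0, 1.0, 0.0],
    "bank": [0.0, 0.0, 1.0],
}


def mean_vector(words):
    """Average the vectors of the known words; None if no word is known."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embedding_fitness(candidate_words, target_words):
    """Score a candidate by cosine similarity of mean word vectors."""
    candidate = mean_vector(candidate_words)
    target = mean_vector(target_words)
    if candidate is None or target is None:
        return 0.0
    return cosine(candidate, target)
```

With these toy vectors, `embedding_fitness(["eat"], ["consume"])` scores higher than `embedding_fitness(["bank"], ["consume"])`, which is the ranking behaviour a fitness term of this kind would need.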

Load-bearing premise

An evolutionary algorithm can reliably discover the correct English meaning of Sinhala sentences without large training data or explicit linguistic rules.

What would settle it

A test collection of Sinhala sentences with known correct English translations on which the evolutionary algorithm consistently produces semantically wrong or ungrammatical output.
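
A test of this kind could be run with a small harness like the one sketched below. It is an illustrative assumption, not something the paper provides: the `translate` callable stands in for the system under test, `unigram_precision` is a deliberately simple stand-in for an established metric such as BLEU, and the `threshold` value is arbitrary.

```python
from collections import Counter


def unigram_precision(hypothesis, reference):
    """Fraction of hypothesis tokens matched in the reference (clipped counts)."""
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    matched = sum(
        min(count, ref_counts[word])
        for word, count in Counter(hyp_tokens).items()
    )
    return matched / len(hyp_tokens)


def evaluate(translate, test_set, threshold=0.5):
    """Score every (source, reference) pair.

    Returns the mean score and the list of (source, hypothesis, reference)
    triples whose score fell below `threshold`.
    """
    scores = []
    failures = []
    for source, reference in test_set:
        hypothesis = translate(source)
        score = unigram_precision(hypothesis, reference)
        scores.append(score)
        if score < threshold:
            failures.append((source, hypothesis, reference))
    mean_score = sum(scores) / len(scores) if scores else 0.0
    return mean_score, failures
```

A consistently low mean score, or a failure list covering most of the test set, would be the kind of evidence that counts against the claim. Token overlap does not capture grammaticality, so a real evaluation would also need a fluency check or human judgment.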

Original abstract

Machine Translation (MT) is an area in natural language processing, which focus on translating from one language to another. Many approaches ranging from statistical methods to deep learning approaches are used in order to achieve MT. However, these methods either require a large number of data or a clear understanding about the language. Sinhala language has less digital text which could be used to train a deep neural network. Furthermore, Sinhala has complex rules therefore, it is harder to create statistical rules in order to apply statistical methods in MT. This research focuses on Sinhala to English translation using an Evolutionary Algorithm (EA). EA is used to identifying the correct meaning of Sinhala text and to translate it to English. The Sinhala text is passed to identify the meaning in order to get the correct meaning of the sentence. With the use of the EA the translation is carried out. The translated text is passed on to grammatically correct the sentence. This has shown to achieve accurate results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes using an evolutionary algorithm (EA) for Sinhala-to-English machine translation. It notes that Sinhala has limited digital text and complex rules, making data-intensive or rule-based methods impractical, and claims that an EA identifies correct sentence meanings, performs the translation, and applies grammatical correction to yield accurate results.

Significance. A working EA-based MT system that requires neither large parallel corpora nor explicit linguistic rules would be a notable contribution for low-resource languages. However, the manuscript supplies no implementation details, fitness function, representation scheme, or evaluation data, so no assessment of significance is possible.

major comments (3)
  1. [Abstract] Abstract: the assertion that the method 'has shown to achieve accurate results' is unsupported by any metrics (e.g., BLEU, accuracy), test-set description, baseline comparisons, or even the number of sentences evaluated.
  2. [Abstract] Abstract / method (entirely absent): the fitness function that would allow the EA to rank candidate English translations for semantic correctness is never defined, nor are the chromosome representation, population initialization, or selection operators described. This directly undermines the claim that the approach operates without large data or explicit rules.
  3. [Abstract] Abstract: the two-stage pipeline (EA translation followed by separate grammatical correction) is stated without any indication of how the grammatical corrector is implemented or whether it relies on the very linguistic resources the EA is meant to avoid.
minor comments (1)
  1. [Abstract] Abstract contains several grammatical issues ('which focus' should be 'which focuses'; 'is used to identifying the correct meaning' should be 'is used to identify the correct meaning').

Simulated Author's Rebuttal

3 responses · 2 unresolved

We thank the referee for their comments. The manuscript is indeed limited to a high-level description without implementation specifics or evaluation results. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the assertion that the method 'has shown to achieve accurate results' is unsupported by any metrics (e.g., BLEU, accuracy), test-set description, baseline comparisons, or even the number of sentences evaluated.

    Authors: We agree that the claim of achieving accurate results lacks any supporting metrics, test-set details, baselines, or evaluation count. The manuscript provides no such evidence. We will revise the abstract to remove or qualify this unsupported statement.
    Revision: yes

  2. Referee: [Abstract] Abstract / method (entirely absent): the fitness function that would allow the EA to rank candidate English translations for semantic correctness is never defined, nor are the chromosome representation, population initialization, or selection operators described. This directly undermines the claim that the approach operates without large data or explicit rules.

    Authors: The manuscript contains no definition of the fitness function, chromosome representation, population initialization, or selection operators. These elements are entirely absent, so we cannot supply them or demonstrate how the approach avoids data or rules.
    Revision: no

  3. Referee: [Abstract] Abstract: the two-stage pipeline (EA translation followed by separate grammatical correction) is stated without any indication of how the grammatical corrector is implemented or whether it relies on the very linguistic resources the EA is meant to avoid.

    Authors: The manuscript states that translated text is passed for grammatical correction but gives no implementation details for the corrector or its resource requirements. We cannot clarify this aspect as the information is not present in the work.
    Revision: no

standing simulated objections not resolved
  • No evaluation data, metrics, or test sentences exist in the manuscript to support accuracy claims.
  • No EA implementation details (fitness function, representation, operators) are available to describe.

Circularity Check

0 steps flagged

No circularity: purely empirical description with no derivations or self-referential steps

Full rationale

The paper contains no equations, no parameter-fitting steps, and no mathematical derivations. Its central claim is an empirical assertion that an EA plus post-processing produces accurate Sinhala-to-English translations. No load-bearing step reduces to a self-definition, a fitted input renamed as prediction, or a self-citation chain; the text simply describes the intended workflow without any formal reduction that could be circular. The absence of any claimed 'first-principles result' or uniqueness theorem means the circularity patterns do not apply.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical parameters, axioms, or new entities are described in the abstract.

pith-pipeline@v0.9.0 · 5707 in / 969 out tokens · 30058 ms · 2026-05-25T01:19:40.667543+00:00 · methodology

