Evolutionary Algorithm for Sinhala to English Translation

A. Nugaliyadde; J.K. Joseph; W.M.T. Chathurika; Y. Mallawarachchi

arxiv: 1907.03202 · v1 · pith:D3F55QBVnew · submitted 2019-07-06 · 💻 cs.CL · cs.NE

Pith reviewed 2026-05-25 01:19 UTC · model grok-4.3

keywords: machine translation, evolutionary algorithm, Sinhala language, low-resource translation, natural language processing, grammar correction

The pith

An evolutionary algorithm searches for the correct English meaning of a Sinhala sentence, and a grammar-correction step is then applied to produce what the authors report as accurate translations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Sinhala has limited digital text and complex grammar rules that make standard statistical or neural machine translation methods impractical. This paper applies an evolutionary algorithm to search for the right meaning of a Sinhala sentence, generate an English translation from that meaning, and then correct the grammar of the output. The authors state that the combined process yields accurate results while avoiding the need for large training corpora or hand-crafted linguistic rules.

Core claim

The paper claims that an evolutionary algorithm identifies the correct meaning of Sinhala text, carries out the translation to English, and passes the result to a grammar-correction step, achieving accurate translations.

What carries the argument

The evolutionary algorithm that searches for the correct English meaning of Sinhala input sentences.
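The paper does not say how candidates are represented or scored, so the following is only a minimal sketch of what a generic evolutionary search over word-level glosses could look like. Everything in it is an assumption made for illustration: the placeholder source tokens (`w1`, `w2`, `w3`), the toy `LEXICON`, the `BIGRAM_SCORE` table standing in for a fitness function, and the choice of truncation selection with one-point crossover. It is not the authors' method.

```python
import random

# Hypothetical lexicon: placeholder source tokens mapped to candidate English glosses.
LEXICON = {
    "w1": ["I", "me"],
    "w2": ["read", "recite"],
    "w3": ["book", "volume"],
}

# Hypothetical bigram scores standing in for whatever fitness signal a real system would use.
BIGRAM_SCORE = {
    ("I", "read"): 2.0,
    ("read", "book"): 2.0,
    ("me", "read"): 0.5,
    ("recite", "book"): 0.5,
}

def fitness(candidate):
    """Sum the toy scores of adjacent gloss pairs; unseen pairs score 0."""
    return sum(BIGRAM_SCORE.get(pair, 0.0) for pair in zip(candidate, candidate[1:]))

def random_candidate(source, rng):
    """Pick one gloss per source token uniformly at random."""
    return [rng.choice(LEXICON[tok]) for tok in source]

def mutate(candidate, source, rng, rate=0.3):
    """Resample each gloss from the lexicon with probability `rate`."""
    return [rng.choice(LEXICON[tok]) if rng.random() < rate else gloss
            for tok, gloss in zip(source, candidate)]

def crossover(a, b, rng):
    """One-point crossover; sentences shorter than two tokens are copied."""
    if len(a) < 2:
        return list(a)
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def evolve(source, generations=30, pop_size=12, seed=0):
    """Run a small elitist EA and return the fittest candidate found."""
    rng = random.Random(seed)
    population = [random_candidate(source, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection keeps the top half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), source, rng))
        population = parents + children
    return max(population, key=fitness)

best = evolve(["w1", "w2", "w3"])
print(" ".join(best), fitness(best))
```

Because the parents survive each generation unchanged, the best fitness in the population never decreases. Note that even this toy version needs a lexicon and a scoring table, which is exactly the kind of resource the review asks the authors to account for.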

If this is right

Translation succeeds without requiring large amounts of parallel training data.
Complex Sinhala grammar is navigated through evolutionary search rather than explicit statistical rules.
A separate grammar-correction step is applied after the evolutionary translation.
Accurate English output is reported for the Sinhala-to-English task.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same evolutionary search could be applied to other low-resource languages that have sparse digital text.
Performance might improve if the evolutionary fitness function were augmented with modern embedding-based similarity measures.
The approach could be tested on longer or more syntactically varied Sinhala sentences to check scalability.
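The embedding-based extension mentioned above could, as a rough sketch, score a candidate by the cosine similarity between its averaged word vectors and a target meaning vector. The three-dimensional `VECTORS` table is invented for illustration, and the sketch assumes a target vector for the Sinhala input is somehow available (for example from a cross-lingual encoder), which the paper does not discuss.

```python
import math

# Hypothetical 3-dimensional word vectors; real embeddings would come from a trained model.
VECTORS = {
    "I": (0.9, 0.1, 0.0),
    "read": (0.1, 0.9, 0.2),
    "book": (0.0, 0.3, 0.9),
    "recite": (0.2, 0.7, 0.1),
}

def sentence_vector(tokens):
    """Average the vectors of known tokens; return a zero vector if none are known."""
    known = [VECTORS[t] for t in tokens if t in VECTORS]
    if not known:
        return (0.0, 0.0, 0.0)
    return tuple(sum(dim) / len(known) for dim in zip(*known))

def cosine(u, v):
    """Cosine similarity, defined as 0 when either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def embedding_fitness(candidate, target_vector):
    """Score a candidate translation by similarity to the assumed target meaning vector."""
    return cosine(sentence_vector(candidate), target_vector)

# Stand-in target: in practice this would be derived from the source sentence, not from English.
target = sentence_vector(["I", "read", "book"])
print(embedding_fitness(["I", "read", "book"], target))
print(embedding_fitness(["I", "recite", "book"], target))
```

A function like `embedding_fitness` could replace a hand-built scoring table in an evolutionary loop, at the cost of depending on pretrained embeddings, which are themselves scarce for low-resource languages.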

Load-bearing premise

An evolutionary algorithm can reliably discover the correct English meaning of Sinhala sentences without large training data or explicit linguistic rules.

What would settle it

A test collection of Sinhala sentences with known correct English translations on which the evolutionary algorithm consistently produces semantically wrong or ungrammatical output.

Original abstract

Machine Translation (MT) is an area in natural language processing, which focus on translating from one language to another. Many approaches ranging from statistical methods to deep learning approaches are used in order to achieve MT. However, these methods either require a large number of data or a clear understanding about the language. Sinhala language has less digital text which could be used to train a deep neural network. Furthermore, Sinhala has complex rules therefore, it is harder to create statistical rules in order to apply statistical methods in MT. This research focuses on Sinhala to English translation using an Evolutionary Algorithm (EA). EA is used to identifying the correct meaning of Sinhala text and to translate it to English. The Sinhala text is passed to identify the meaning in order to get the correct meaning of the sentence. With the use of the EA the translation is carried out. The translated text is passed on to grammatically correct the sentence. This has shown to achieve accurate results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper claims an evolutionary algorithm delivers accurate Sinhala-English translations without large data or rules, but never describes the fitness function or shows any evaluation.

The one thing to know is that the work applies an evolutionary algorithm to Sinhala-English machine translation and states that it achieves accurate results, yet the description never explains how candidate translations are represented, how the population evolves, or what the fitness function actually scores. Without that piece the central claim cannot be checked against the motivating premise that the method avoids both large parallel corpora and explicit linguistic rules.

The authors rightly flag that Sinhala has sparse digital text and complex grammar, so standard statistical or neural routes are hard to apply; that diagnosis is fair. The paper also mentions a post-translation grammar-correction step. Those points are reasonable observations about the language pair.

Beyond that the contribution is thin. Evolutionary algorithms have been tried on machine translation before, and nothing here demonstrates a new representation, operator, or fitness design that would distinguish this instance. The results statement stands alone with no test set, no baseline, no automatic metric, and no human judgment reported. If the full manuscript contains those details they are not visible in the supplied text.

This kind of note might interest a narrow group working specifically on Sinhala or other extremely low-resource languages who are collecting any alternative idea they can find. For anyone else the absence of method and evidence makes it hard to engage. I would not bring it to a reading group or cite it. It does not look ready for peer review; the algorithmic core and the empirical support both need to be supplied before a referee could usefully assess it.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes using an evolutionary algorithm (EA) for Sinhala-to-English machine translation. It notes that Sinhala has limited digital text and complex rules, making data-intensive or rule-based methods impractical, and claims that an EA identifies correct sentence meanings, performs the translation, and applies grammatical correction to yield accurate results.

Significance. A working EA-based MT system that requires neither large parallel corpora nor explicit linguistic rules would be a notable contribution for low-resource languages. However, the manuscript supplies no implementation details, fitness function, representation scheme, or evaluation data, so no assessment of significance is possible.

major comments (3)

[Abstract] Abstract: the assertion that the method 'has shown to achieve accurate results' is unsupported by any metrics (e.g., BLEU, accuracy), test-set description, baseline comparisons, or even the number of sentences evaluated.
[Abstract] Abstract / method (entirely absent): the fitness function that would allow the EA to rank candidate English translations for semantic correctness is never defined, nor are the chromosome representation, population initialization, or selection operators described. This directly undermines the claim that the approach operates without large data or explicit rules.
[Abstract] Abstract: the two-stage pipeline (EA translation followed by separate grammatical correction) is stated without any indication of how the grammatical corrector is implemented or whether it relies on the very linguistic resources the EA is meant to avoid.
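To make the first comment concrete, the kind of evidence being asked for can be as simple as a clipped n-gram precision computed over a held-out test set. The sketch below implements only the modified n-gram precision component used in BLEU, averaged per sentence, without the brevity penalty or geometric mean. The English sentence pairs in `PAIRS` are invented placeholders, not data from the paper.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(hypothesis, reference, n):
    """Fraction of hypothesis n-grams found in the reference, with counts clipped."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total

def mean_precisions(pairs, max_n=2):
    """Average sentence-level clipped precision for each n-gram order up to max_n."""
    report = {}
    for n in range(1, max_n + 1):
        scores = [clipped_precision(h.split(), r.split(), n) for h, r in pairs]
        report[n] = sum(scores) / len(scores)
    return report

# Hypothetical (system output, reference translation) pairs.
PAIRS = [
    ("the cat sat", "the cat sat down"),
    ("a dog", "the dog"),
]
print(mean_precisions(PAIRS))
```

Reporting numbers like these alongside the test-set size and at least one baseline would let a reader check the accuracy claim; a full evaluation would normally use an established BLEU implementation rather than a hand-rolled one.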

minor comments (1)

[Abstract] Abstract contains several grammatical issues ('which focus' should be 'which focuses'; 'identifying the correct meaning' should be 'to identify the correct meaning').

Simulated Author's Rebuttal

3 responses · 2 unresolved

We thank the referee for their comments. The manuscript is indeed limited to a high-level description without implementation specifics or evaluation results. We address each major comment below.

Referee: [Abstract] Abstract: the assertion that the method 'has shown to achieve accurate results' is unsupported by any metrics (e.g., BLEU, accuracy), test-set description, baseline comparisons, or even the number of sentences evaluated.

Authors: We agree that the claim of achieving accurate results lacks any supporting metrics, test-set details, baselines, or evaluation count. The manuscript provides no such evidence. We will revise the abstract to remove or qualify this unsupported statement.
revision: yes

Referee: [Abstract] Abstract / method (entirely absent): the fitness function that would allow the EA to rank candidate English translations for semantic correctness is never defined, nor are the chromosome representation, population initialization, or selection operators described. This directly undermines the claim that the approach operates without large data or explicit rules.

Authors: The manuscript contains no definition of the fitness function, chromosome representation, population initialization, or selection operators. These elements are entirely absent, so we cannot supply them or demonstrate how the approach avoids data or rules.
revision: no

Referee: [Abstract] Abstract: the two-stage pipeline (EA translation followed by separate grammatical correction) is stated without any indication of how the grammatical corrector is implemented or whether it relies on the very linguistic resources the EA is meant to avoid.

Authors: The manuscript states that translated text is passed for grammatical correction but gives no implementation details for the corrector or its resource requirements. We cannot clarify this aspect as the information is not present in the work.
revision: no

standing simulated objections not resolved

No evaluation data, metrics, or test sentences exist in the manuscript to support accuracy claims.
No EA implementation details (fitness function, representation, operators) are available to describe.

Circularity Check

0 steps flagged

No circularity: purely empirical description with no derivations or self-referential steps

The paper contains no equations, no parameter-fitting steps, and no mathematical derivations. Its central claim is an empirical assertion that an EA plus post-processing produces accurate Sinhala-to-English translations. No load-bearing step reduces to a self-definition, a fitted input renamed as prediction, or a self-citation chain; the text simply describes the intended workflow without any formal reduction that could be circular. The absence of any claimed 'first-principles result' or uniqueness theorem means the circularity patterns do not apply.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical parameters, axioms, or new entities are described in the abstract.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · 5 internal anchors

[1]

Advances in natural language processing,

J. Hirschberg and C. D. Manning, "Advances in natural language processing," Science, vol. 349, pp. 261-266, 2015

work page 2015
[2]

Conditional Random Fields based named entity recognition for sinhala,

K. Senevirathne, N. Attanayake, A. Dhananjanie, W. Weragoda, A. Nugaliyadde, and S. Thelijjagoda, "Conditional Random Fields based named entity recognition for sinhala," in 2015 IEEE 10th International Conference on Industrial and Information Systems (ICIIS), 2015, pp. 302-307

work page 2015
[3]

Jumping NLP curves: a review of natural language processing research [review article],

E. Cambria and B. White, "Jumping NLP curves: a review of natural language processing research [review article]," IEEE Computational Intelligence Magazine, vol. 9, pp. 48-57, 2014

work page 2014
[4]

C. D. Manning and H. Schütze, Foundations of statistical natural language processing: MIT press, 1999

work page 1999
[5]

“Mahoshadha”, the Sinhala Tagged Corpus Based Question Answering System,

J. Jayakody, T. Gamlath, W. Lasantha, K. Premachandra, A. Nugaliyadde, and Y. Mallawarachchi, "“Mahoshadha”, the Sinhala Tagged Corpus Based Question Answering System," in Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems: Volume 1, 2016, pp. 313-322

work page 2016
[6]

A Morphological analyzer to enable English to Sinhala Machine Translation,

B. Hettige and A. S. Karunananda, "A Morphological analyzer to enable English to Sinhala Machine Translation," in 2006 International Conference on Information and Automation, 2006, pp. 21-26

work page 2006
[7]

Machine translation approaches and survey for Indian languages,

P. Antony, "Machine translation approaches and survey for Indian languages," International Journal of Computational Linguistics & Chinese Language Processing, vol. 18, no. 1, March 2013

work page 2013
[8]

Developing lexicon databases for English to Sinhala machine translation,

B. Hettige and A. Karunananda, "Developing lexicon databases for English to Sinhala machine translation," in 2007 International Conference on Industrial and Information Systems, 2007, pp. 215-220

work page 2007
[9]

Japanese-Sinhalese machine translation system Jaw/Sinhalese,

Y. I. Samantha Thelijjagoda and T. Ikeda, "Japanese-Sinhalese machine translation system Jaw/Sinhalese," Journal of the National Science Foundation of Sri Lanka, vol. 35, p. 2, 2007

work page 2007
[10]

A statistical machine translation approach to sinhala-tamil language translation,

R. Weerasinghe, "A statistical machine translation approach to sinhala-tamil language translation," Towards an ICT enabled Society, p. 136, 2003

work page 2003
[11]

Statistical machine translation of systems for Sinhala-Tamil,

S. Sripirakas, A. Weerasinghe, and D. L. Herath, "Statistical machine translation of systems for Sinhala-Tamil," in 2010 International Conference on Advances in ICT for Emerging Regions (ICTer), 2010, pp. 62-68

work page 2010
[12]

Deep learning,

Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, pp. 436-444, 2015

work page 2015
[13]

Reinforced memory network for question answering,

A. Nugaliyadde, K. W. Wong, F. Sohel, and H. Xie, "Reinforced memory network for question answering," in International Conference on Neural Information Processing, 2017, pp. 482-490

work page 2017
[14]

Integration of Bilingual Lists for Domain-Specific Statistical Machine Translation for Sinhala-Tamil,

F. Farhath, S. Ranathunga, S. Jayasena, and G. Dias, "Integration of Bilingual Lists for Domain-Specific Statistical Machine Translation for Sinhala-Tamil," in 2018 Moratuwa Engineering Research Conference (MERCon), 2018, pp. 538-543

work page 2018
[15]

Memory Networks

J. Weston, S. Chopra, and A. Bordes, "Memory networks," arXiv preprint arXiv:1410.3916, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[16]

Mnemonic Reader: Machine Comprehension with Iterative Aligning and Multi-hop Answer Pointing,

M. Hu, Y. Peng, and X. Qiu, "Mnemonic Reader: Machine Comprehension with Iterative Aligning and Multi-hop Answer Pointing," 2017

work page 2017
[17]

Towards Neural Network-based Reasoning

B. Peng, Z. Lu, H. Li, and K.-F. Wong, "Towards neural network-based reasoning," arXiv preprint arXiv:1508.05508, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[18]

Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, et al., "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[19]

Neural Machine Translation by Jointly Learning to Align and Translate

D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[20]

Language Modeling through Long Term Memory Network

A. Nugaliyadde, K. W. Wong, F. Sohel, and H. Xie, "Language Modeling through Long Term Memory Network," arXiv preprint arXiv:1904.08936, 2019

work page internal anchor Pith review Pith/arXiv arXiv 2019
[21]

Modeling server workloads for campus email traffic using recurrent neural networks,

S. Boukoros, A. Nugaliyadde, A. Marnerides, C. Vassilakis, P. Koutsakis, and K. W. Wong, "Modeling server workloads for campus email traffic using recurrent neural networks," in International Conference on Neural Information Processing, 2017, pp. 57-66

work page 2017
[22]

C. A. C. Coello, G. B. Lamont, and D. A. Van Veldhuizen, Evolutionary algorithms for solving multi-objective problems vol. 5: Springer, 2007

work page 2007
[23]

Part-of-speech tagging with evolutionary algorithms,

L. Araujo, "Part-of-speech tagging with evolutionary algorithms," in International Conference on Intelligent Text Processing and Computational Linguistics, 2002, pp. 230-239

work page 2002
