Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay
Pith reviewed 2026-06-27 09:49 UTC · model grok-4.3
The pith
Continual instruction tuning with dictionary-derived features improves Kupang Malay translation without large parallel datasets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By constructing instructions that embed explicit lexical and semantic features extracted from a bilingual dictionary and training via Continual Instruction Tuning, the Lius model delivers measurable gains in Kupang Malay translation accuracy over standard instruction-tuned, neural machine translation, and multilingual LLM baselines while avoiding dependence on large-scale parallel data.
What carries the argument
Continual Instruction Tuning (CIT), an iterative training loop that repeatedly applies dictionary-derived instructions to adapt an LLM for a target low-resource language pair.
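The abstract does not specify how CIT stages are ordered or trained, so the following is only a minimal sketch of the general idea: each stage resumes from the checkpoint left by the previous one instead of restarting from the base model. The `toy_update` function and the checkpoint dictionary are hypothetical stand-ins for a real fine-tuning step and real model weights, not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]        # (instruction prompt, expected output)
Checkpoint = Dict[str, int]      # toy stand-in for model weights (assumption)


def toy_update(ckpt: Checkpoint, stage: List[Example]) -> Checkpoint:
    """Hypothetical placeholder for one fine-tuning pass over a stage."""
    return {
        "stages_done": ckpt["stages_done"] + 1,
        "examples_seen": ckpt["examples_seen"] + len(stage),
    }


def continual_instruction_tuning(
    base: Checkpoint,
    stages: List[List[Example]],
    update: Callable[[Checkpoint, List[Example]], Checkpoint] = toy_update,
) -> Checkpoint:
    """Apply instruction stages one after another, carrying the checkpoint forward."""
    ckpt = dict(base)
    for stage in stages:
        # Each stage starts from the result of the previous stage,
        # which is what makes the loop continual and not one-shot.
        ckpt = update(ckpt, stage)
    return ckpt


# Illustrative stages; the prompts are made up for this sketch.
stages = [
    [("Translate the word 'beta'.", "saya")],
    [("Translate the word 'sonde'.", "tidak"),
     ("Give the part of speech of 'beta'.", "pronoun")],
]
final = continual_instruction_tuning({"stages_done": 0, "examples_seen": 0}, stages)
```

Passing the update step as a parameter keeps the loop independent of any particular training library, which seems the safest framing given how little the abstract reveals.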
If this is right
- Low-resource translation can proceed with far smaller parallel corpora when instructions encode dictionary features.
- Instruction-tuned models gain measurable accuracy from iterative rather than one-shot training on language-specific instructions.
- Performance advantages over both dedicated NMT systems and general multilingual LLMs become attainable through the same dictionary-plus-CIT pipeline.
- The approach supplies a concrete route to reduce data-collection costs for additional low-resource language pairs.
Where Pith is reading between the lines
- The same dictionary-to-instruction pipeline could be tested on other Austronesian or creole languages that possess modest bilingual resources.
- Automating the extraction of lexical and semantic features might allow the method to scale without manual dictionary curation.
- Combining CIT with existing multilingual models could narrow the gap between general-purpose LLMs and language-specific systems at lower compute cost.
Load-bearing premise
Explicit lexical and semantic features taken from a bilingual dictionary are enough to produce instructions that let continual tuning succeed where large parallel corpora are unavailable.
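To make the premise concrete, here is a rough sketch of how a single dictionary entry might be expanded into instruction records. The entry fields (`headword`, `gloss`, `pos`), the prompt wording, and the `target_lang` parameter are all assumptions for illustration; the paper's actual templates and feature set are not given in the abstract. The sample entry uses the Kupang Malay pronoun "beta" with its Indonesian equivalent "saya" and is not drawn from the paper's dictionary.

```python
from typing import Dict, List


def entry_to_instructions(entry: Dict[str, str], target_lang: str) -> List[Dict[str, str]]:
    """Turn one bilingual dictionary entry into instruction/output pairs.

    The field names and templates are hypothetical, chosen only to show
    a lexical feature (the gloss) and a grammatical feature (part of speech).
    """
    word, gloss, pos = entry["headword"], entry["gloss"], entry["pos"]
    return [
        {
            "instruction": f"Translate the Kupang Malay word '{word}' into {target_lang}.",
            "output": gloss,
        },
        {
            "instruction": f"What is the part of speech of the Kupang Malay word '{word}'?",
            "output": pos,
        },
    ]


# Illustrative entry, not taken from the paper's data.
sample = {"headword": "beta", "gloss": "saya", "pos": "pronoun"}
records = entry_to_instructions(sample, "Indonesian")
```

Whether records like these are sufficient without parallel sentences is exactly the premise the review flags as load-bearing.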
What would settle it
Evaluate the Lius model on another low-resource language that has no high-quality bilingual dictionary and check whether the reported gains over the same baselines (4-6 points over instruction-tuned models, 10-13 points over NMT and multilingual LLMs) disappear.
Original abstract
Large Language Models (LLMs) offer new potential for translation tasks but often experience performance degradation when handling low-resource languages. To address this limitation, we propose an approach for fine-tuning LLMs on a low-resource language, Kupang Malay. Our approach involves designing a set of instructions by leveraging explicit lexical and semantic features from a bilingual dictionary, and introducing Continual Instruction Tuning (CIT), a training paradigm that enables iterative instruction-based training. Experimental results demonstrate that our model, named Lius, yields notable improvements over standard instruction-tuned models by outperforming 4-6 points, and surpassing both Neural Machine Translation (NMT) and Multilingual LLM models by 10-13 points on several evaluation metrics. These findings highlight the potential of our approach to mitigate the reliance on large-scale parallel data in low-resource language translation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a model named Lius for low-resource machine translation involving Kupang Malay. It designs instructions by extracting explicit lexical and semantic features from a bilingual dictionary and introduces Continual Instruction Tuning (CIT) as an iterative fine-tuning paradigm for LLMs. The central claim is that Lius yields 4-6 point gains over standard instruction-tuned models and 10-13 point gains over NMT and multilingual LLM baselines on several evaluation metrics, thereby reducing reliance on large-scale parallel data.
Significance. If the experimental claims hold under proper controls, the work could offer a practical route for low-resource translation by showing how dictionary-derived instructions enable effective CIT. This addresses a genuine need in the field for methods that operate with limited parallel corpora.
major comments (2)
- [Abstract] The headline result (4-6 pt gains over instruction-tuned models; 10-13 pt over NMT/multilingual LLMs) is stated without any information on the evaluation metrics, test sets, statistical significance, baseline implementations, or experimental controls. This absence prevents verification that the data support the claims.
- [§4, Experiments] The central claim that bilingual-dictionary-derived instructions suffice for CIT gains rests on an untested assumption; the section supplies no details on dictionary size/coverage, the mapping from entries to instruction templates, the base model, volume of any parallel data still used, number of CIT stages, or ablations isolating the dictionary component. Without these, the reported deltas cannot be attributed to the proposed mechanism.
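One of the missing details, dictionary coverage, could be reported with a simple token-level statistic. The sketch below assumes whitespace tokenization and exact lowercase matching against headwords, which is a simplification (it ignores morphology and multiword entries) and is not the procedure used in the paper. The function name and the sample data are hypothetical.

```python
from typing import Iterable, Set


def token_coverage(sentences: Iterable[str], headwords: Set[str]) -> float:
    """Share of test-set tokens that appear as dictionary headwords.

    Assumes whitespace tokenization and case-insensitive exact match.
    """
    tokens = [tok for sent in sentences for tok in sent.lower().split()]
    if not tokens:
        return 0.0  # avoid division by zero on an empty test set
    covered = sum(1 for tok in tokens if tok in headwords)
    return covered / len(tokens)


# Illustrative data only: two of the three tokens are in the toy headword set.
coverage = token_coverage(["Beta sonde tau"], {"beta", "sonde"})
```

Reporting a figure of this kind alongside the dictionary's entry count would help readers judge how much of the test vocabulary the instructions could have exposed to the model.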
minor comments (1)
- [Title] Title: 'Lingustic' is a typographical error and should read 'Linguistic'.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. The comments highlight areas where additional detail will improve verifiability of the claims. We address each point below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [Abstract] The headline result (4-6 pt gains over instruction-tuned models; 10-13 pt over NMT/multilingual LLMs) is stated without any information on the evaluation metrics, test sets, statistical significance, baseline implementations, or experimental controls. This absence prevents verification that the data support the claims.
Authors: We agree that the abstract would benefit from greater specificity. In the revised manuscript we will expand the abstract to name the primary metrics (BLEU and chrF), identify the test sets, note that reported gains include statistical significance testing, and briefly characterize the baseline implementations and controls. These additions will make the headline claims directly verifiable from the abstract. revision: yes
- Referee: [§4, Experiments] The central claim that bilingual-dictionary-derived instructions suffice for CIT gains rests on an untested assumption; the section supplies no details on dictionary size/coverage, the mapping from entries to instruction templates, the base model, volume of any parallel data still used, number of CIT stages, or ablations isolating the dictionary component. Without these, the reported deltas cannot be attributed to the proposed mechanism.
Authors: We accept that §4 currently omits several implementation details required to attribute gains to the dictionary-derived instructions and CIT procedure. We will revise the section to report dictionary size and coverage statistics, the exact template-mapping procedure, the base LLM, the quantity of parallel data retained, the number of CIT stages, and ablation experiments that isolate the dictionary component. These additions will allow readers to evaluate the contribution of the proposed mechanism. revision: yes
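The statistical significance testing mentioned in the first response is often done in machine translation with paired bootstrap resampling. The sketch below is a simplified variant that assumes per-sentence scores are already available for two systems on the same test set and compares their sums on each resample; a full version would recompute a corpus-level metric such as BLEU or chrF per resample. The function name, defaults, and sample scores are hypothetical, and nothing here is confirmed as the authors' procedure.

```python
import random
from typing import Sequence


def paired_bootstrap_win_rate(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    n_resamples: int = 1000,
    seed: int = 0,
) -> float:
    """Fraction of bootstrap resamples in which system A outscores system B.

    Both score lists must be aligned sentence by sentence on one test set.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        # Draw the same sentence indices for both systems (the "paired" part).
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] for i in idx) > sum(scores_b[i] for i in idx):
            wins += 1
    return wins / n_resamples


# Made-up scores where A is better on every sentence.
win_rate = paired_bootstrap_win_rate([0.9] * 10, [0.1] * 10, n_resamples=200)
```

A win rate close to 1.0 suggests the gap is unlikely to be an artifact of which test sentences happened to be sampled; reporting it next to each headline delta would address part of the referee's first comment.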
Circularity Check
No circularity: empirical method proposal with no derivation chain
Full rationale
The paper describes an empirical NLP approach: designing instructions from bilingual dictionary lexical/semantic features, then applying Continual Instruction Tuning (CIT) to fine-tune an LLM for Kupang Malay translation. No equations, parameters, or mathematical derivations are present in the provided abstract or described claims. Experimental deltas (4-6 pts over instruction tuning, 10-13 over NMT/LLMs) are reported as outcomes, not as quantities forced by construction from fitted inputs. No self-citations, uniqueness theorems, or ansatzes are invoked in the given text. This matches the reader's assessment of score 1.0; absence of detail on dictionary size or ablations is a reproducibility concern, not circularity. The derivation chain is empty, so no reduction to inputs occurs.