How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation

Amr Hendy, Mohamed Abdelrehim, Amr Sharaf, Vikas Raunak, Mohamed Gabr, Hitokazu Matsushita, Young Jin Kim, Mohamed Afify, Hany Hassan Awadalla · 2023 · arXiv 2302.09210

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Nsanku: Evaluating Zero-Shot Translation Performance of LLMs for Ghanaian Languages

cs.CL · 2026-05-05 · accept · novelty 7.0

Nsanku benchmark shows current LLMs achieve only modest zero-shot translation scores on 43 Ghanaian languages, with no model reaching both high average performance and high cross-language consistency.

The Prompt Report: A Systematic Survey of Prompt Engineering Techniques

cs.CL · 2024-06-06 · accept · novelty 7.0

This systematic survey organizes prompt engineering into a taxonomy of 58 LLM techniques and 40 others, supplies a shared vocabulary, and offers guidelines for state-of-the-art models.

Evaluating Chinese Ambiguity Understanding in Large Language Models

cs.CL · 2026-05-15 · unverdicted · novelty 6.0

Introduces the CHA-Gen dataset for Chinese ambiguity based on Potential Ambiguity Theory and shows LLMs struggle to detect ambiguity, exhibiting specific failure modes and overconfidence after instruction tuning.

RouteLMT: Learned Sample Routing for Hybrid LLM Translation Deployment

cs.CL · 2026-04-24 · unverdicted · novelty 6.0

RouteLMT learns to route MT requests to large or small LLMs by predicting marginal quality gain from small-model token representations, yielding a better quality-budget Pareto frontier than baselines.

Low-Resource Languages Jailbreak GPT-4

cs.CL · 2023-10-03 · conditional · novelty 6.0

Translating unsafe inputs to low-resource languages jailbreaks GPT-4 at rates on par with or exceeding state-of-the-art attacks.

When Does Data Augmentation Help? Evaluating LLM and Back-Translation Methods for Hausa and Fongbe NLP

cs.CL · 2026-04-14 · unverdicted · novelty 5.0

Data augmentation via LLMs and back-translation produces task-specific effects on NER and POS tagging for Hausa and Fongbe, with no consistent gains over baseline and opposite outcomes across tasks for the same synthetic data.

Mining Large Language Models for Low-Resource Language Data: Comparing Elicitation Strategies for Hausa and Fongbe

cs.CL · 2026-04-14 · unverdicted · novelty 5.0

GPT-4o Mini extracts 6-41 times more usable Hausa and Fongbe text per API call than Gemini 2.5 Flash, with optimal elicitation strategies differing by language.

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

cs.CL · 2023-05-30 · conditional · novelty 5.0

Multi-agent debate with tit-for-tat arguments and a judge LLM improves reasoning by preventing LLMs from locking into incorrect initial solutions.

Benchmark Data Contamination of Large Language Models: A Survey

cs.CL · 2024-06-06 · unverdicted · novelty 3.0

A survey reviewing benchmark data contamination in LLMs, its impact on evaluation, and alternative assessment approaches.

citing papers explorer

Showing 9 of 9 citing papers.

Nsanku: Evaluating Zero-Shot Translation Performance of LLMs for Ghanaian Languages · cs.CL · 2026-05-05 · accept · none · ref 22
The Prompt Report: A Systematic Survey of Prompt Engineering Techniques · cs.CL · 2024-06-06 · accept · none · ref 9
Evaluating Chinese Ambiguity Understanding in Large Language Models · cs.CL · 2026-05-15 · unverdicted · none · ref 24
RouteLMT: Learned Sample Routing for Hybrid LLM Translation Deployment · cs.CL · 2026-04-24 · unverdicted · none · ref 11
Low-Resource Languages Jailbreak GPT-4 · cs.CL · 2023-10-03 · conditional · none · ref 20
When Does Data Augmentation Help? Evaluating LLM and Back-Translation Methods for Hausa and Fongbe NLP · cs.CL · 2026-04-14 · unverdicted · none · ref 11
Mining Large Language Models for Low-Resource Language Data: Comparing Elicitation Strategies for Hausa and Fongbe · cs.CL · 2026-04-14 · unverdicted · none · ref 17
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate · cs.CL · 2023-05-30 · conditional · none · ref 60
Benchmark Data Contamination of Large Language Models: A Survey · cs.CL · 2024-06-06 · unverdicted · none · ref 59
