From Fragments to Facts: A Curriculum-Driven DPO Approach for Generating Hindi News Veracity Explanations
Pith reviewed 2026-05-19 06:04 UTC · model grok-4.3
The pith
A curriculum-driven DPO framework generates reliable Hindi news veracity explanations by preferring fact-checked sources.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework aligns machine-generated explanations with human reasoning by treating fact-checked explanations from credible sources as preferred responses and LLM outputs as non-preferred responses within a Direct Preference Optimization setup enhanced by curriculum learning. Actuality and Finesse parameters are introduced into the DPO loss function to refine task-specific alignment, resulting in higher quality and more consistent veracity explanations for Hindi news.
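The data setup described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the `PreferencePair` type, the `difficulty` field, and the equal-size staging rule are all assumptions introduced here, since the summary does not say how the curriculum orders examples.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PreferencePair:
    """One DPO training example (hypothetical structure)."""
    prompt: str        # the Hindi news claim to explain
    chosen: str        # fact-checked explanation (preferred)
    rejected: str      # raw LLM explanation (non-preferred)
    difficulty: float  # ASSUMPTION: some easy-to-hard score, not defined in the source


def curriculum_stages(pairs: List[PreferencePair], n_stages: int = 3) -> List[List[PreferencePair]]:
    """Sort pairs from easy to hard and split them into near-equal stages."""
    ordered = sorted(pairs, key=lambda p: p.difficulty)
    n_stages = max(1, min(n_stages, len(ordered)))
    base, extra = divmod(len(ordered), n_stages)
    stages, start = [], 0
    for i in range(n_stages):
        size = base + (1 if i < extra else 0)  # earlier stages absorb the remainder
        stages.append(ordered[start:start + size])
        start += size
    return stages
```

Training would then visit the stages in order, so the model sees the easiest pairs first. Any real difficulty signal would have to come from the paper itself.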
What carries the argument
Curriculum-driven Direct Preference Optimization with Actuality and Finesse parameters in the loss function, which prioritizes fact-checked explanations over standard LLM outputs to improve explanation quality.
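For orientation, the standard DPO objective that the framework builds on can be written for a single preference pair as below. This sketch shows only the unmodified baseline loss; it does not include the Actuality and Finesse terms, and the function and argument names are chosen here for illustration.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dpo_loss(policy_logp_chosen: float, policy_logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (prompt, chosen, rejected) triple.

    Inputs are sequence log-probabilities under the trained policy and
    under the frozen reference model.
    """
    chosen_logratio = policy_logp_chosen - ref_logp_chosen
    rejected_logratio = policy_logp_rejected - ref_logp_rejected
    margin = chosen_logratio - rejected_logratio
    return -log_sigmoid(beta * margin)
```

When the policy equals the reference model the margin is zero and the loss is log 2. The loss falls as the policy raises the preferred response's probability relative to the non-preferred one.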
If this is right
- Explanations become more coherent and contextually relevant for Hindi news veracity assessment.
- The approach extends automated explanation generation effectively to low-resource languages.
- It supports scalable tools for combating misinformation through better alignment with fact-checked reasoning.
- Performance gains appear across tested LLMs such as Mistral, Llama, and Gemma as well as PLMs like mBART and mT5.
Where Pith is reading between the lines
- Similar preference-based training could be adapted for explanation tasks in other languages with limited fact-checking resources.
- The method might reduce inconsistencies in generated content for related verification problems beyond news.
- Evaluating the framework on streaming or real-time Hindi news could test its practical impact on veracity detection rates.
Load-bearing premise
The approach assumes that fact-checked explanations from credible sources reliably serve as preferred responses and LLM outputs as non-preferred ones, and that the Actuality and Finesse parameters enhance quality without introducing new biases or inconsistencies.
What would settle it
An ablation that compares explanation-quality metrics and human judgments with and without the Actuality and Finesse parameters would settle it: if it finds no measurable improvement, the central claim does not hold.
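One way to judge whether such an ablation shows a measurable improvement is a paired bootstrap over per-example scores. The sketch below is a generic percentile bootstrap, not a procedure taken from the paper. The function name and its inputs (one metric score per test example for each system, in the same order) are assumptions.

```python
import random
from typing import List, Tuple


def paired_bootstrap_ci(with_params: List[float], without_params: List[float],
                        n_resamples: int = 2000, alpha: float = 0.05,
                        seed: int = 0) -> Tuple[float, float, float]:
    """Mean paired difference and its percentile bootstrap confidence interval.

    Returns (mean_diff, ci_low, ci_high). An interval that contains 0 means
    the data do not show a measurable improvement at level alpha.
    """
    if len(with_params) != len(without_params) or not with_params:
        raise ValueError("score lists must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(with_params, without_params)]
    n = len(diffs)
    rng = random.Random(seed)  # seeded so the interval is reproducible
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    low = means[int((alpha / 2) * n_resamples)]
    high = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(diffs) / n, low, high
```

The same function applies to automatic metrics and to averaged human ratings, provided both systems are scored on the same examples.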
Original abstract
In an era of rampant misinformation, generating reliable news explanations is vital, especially for under-represented languages like Hindi. Lacking robust automated tools, Hindi faces challenges in scaling misinformation detection. To bridge this gap, we propose a novel framework integrating Direct Preference Optimization (DPO) with curriculum learning to align machine-generated explanations with human reasoning. Fact-checked explanations from credible sources serve as preferred responses, while LLM outputs highlight system limitations and serve as non-preferred responses. To refine task-specific alignment, we introduce two key parameters -- Actuality and Finesse -- into the DPO loss function, enhancing explanation quality and consistency. Experiments with LLMs (Mistral, Llama, Gemma) and PLMs (mBART, mT5) confirm the framework's effectiveness in generating coherent, contextually relevant explanations. This scalable approach combats misinformation and extends automated explanation generation to low-resource languages.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a curriculum-driven Direct Preference Optimization (DPO) framework for generating veracity explanations of Hindi news articles. Fact-checked explanations from credible sources are treated as preferred responses and raw LLM outputs as non-preferred responses; two new scalars (Actuality and Finesse) are added to the DPO loss to improve alignment. Experiments on LLMs (Mistral, Llama, Gemma) and PLMs (mBART, mT5) are claimed to confirm that the approach produces coherent, contextually relevant explanations for low-resource languages.
Significance. If the claimed improvements from the curriculum ordering and the two new loss parameters can be shown with quantitative metrics, ablations, and proper baselines, the work would offer a practical method for scaling explanation generation to Hindi and other low-resource languages. The absence of any reported numbers, statistical tests, or loss equations in the current manuscript prevents assessment of whether the central claim holds.
major comments (3)
- [Abstract] Abstract: the assertion that 'experiments ... confirm the framework's effectiveness' is unsupported by any quantitative metrics, baseline comparisons, human or automatic evaluation scores, or statistical significance tests. Without these, the central empirical claim cannot be evaluated.
- [Method] Method section (DPO loss): the paper introduces Actuality and Finesse as additional parameters into the DPO loss but provides neither the explicit modified loss equation nor any ablation study isolating their contribution. It is therefore impossible to determine whether these scalars produce measurable gains over standard DPO or merely reparameterize the same objective.
- [Experiments] Experimental setup: the construction of preferred (fact-checked) versus non-preferred (raw LLM) pairs is described at a high level, yet no evidence is given that the rejected responses are systematically inferior on the target dimensions of actuality or finesse rather than simply different. This leaves open the possibility that observed differences arise from data curation rather than the proposed curriculum or loss modifications.
minor comments (1)
- [Abstract] The abstract and title use 'veracity explanations' without a concise definition of the term in the opening paragraphs.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. The comments highlight important areas where additional rigor is needed to fully support our claims. We address each major comment point by point below and indicate the revisions we will make in the next version of the paper.
- Referee: [Abstract] Abstract: the assertion that 'experiments ... confirm the framework's effectiveness' is unsupported by any quantitative metrics, baseline comparisons, human or automatic evaluation scores, or statistical significance tests. Without these, the central empirical claim cannot be evaluated.
  Authors: We agree that the abstract's claim of effectiveness requires explicit quantitative backing to be evaluable. The current version does not report specific scores or tests in the abstract or elsewhere in the provided text. In the revised manuscript, we will update the abstract to reference key results including automatic metrics (such as BLEU and ROUGE), human evaluation scores, baseline comparisons, and any applicable statistical significance tests.
  Revision: yes
- Referee: [Method] Method section (DPO loss): the paper introduces Actuality and Finesse as additional parameters into the DPO loss but provides neither the explicit modified loss equation nor any ablation study isolating their contribution. It is therefore impossible to determine whether these scalars produce measurable gains over standard DPO or merely reparameterize the same objective.
  Authors: We acknowledge that the explicit modified loss equation and supporting ablations are missing from the current manuscript. We will add the full mathematical formulation of the DPO loss incorporating the Actuality and Finesse scalars. We will also include a dedicated ablation study comparing the full model against standard DPO and variants with individual scalars removed to demonstrate their specific contributions.
  Revision: yes
- Referee: [Experiments] Experimental setup: the construction of preferred (fact-checked) versus non-preferred (raw LLM) pairs is described at a high level, yet no evidence is given that the rejected responses are systematically inferior on the target dimensions of actuality or finesse rather than simply different. This leaves open the possibility that observed differences arise from data curation rather than the proposed curriculum or loss modifications.
  Authors: The preferred responses are drawn from fact-checked sources, which are expected to be superior in actuality and finesse by construction. However, we agree that the manuscript does not currently provide direct evidence or metrics demonstrating this systematic inferiority of the raw LLM outputs on those specific dimensions. In revision, we will add preliminary quantitative comparisons or annotations showing differences on actuality and finesse, and we will discuss how the curriculum ordering and loss modifications contribute beyond the initial data selection.
  Revision: partial
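The comparison promised in the last response could take the form of a sign test on per-pair annotations. This is a hedged sketch only. It assumes each pair has been scored on one dimension (for example actuality) for both the chosen and the rejected response; the scoring scheme and the function name are hypothetical.

```python
from math import comb
from typing import Sequence, Tuple


def sign_test(chosen_scores: Sequence[float],
              rejected_scores: Sequence[float]) -> Tuple[float, float]:
    """Win rate of chosen over rejected and a two-sided exact sign-test p-value.

    Tied pairs are dropped, as is usual for the sign test.
    """
    wins = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    losses = sum(c < r for c, r in zip(chosen_scores, rejected_scores))
    n = wins + losses
    if n == 0:
        raise ValueError("no untied pairs to compare")
    k = min(wins, losses)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins / n, min(1.0, 2 * tail)
```

A win rate near 0.5 with a large p-value would support the referee's concern that the rejected responses are merely different rather than systematically worse on that dimension.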
Circularity Check
No significant circularity; external grounding and parameter augmentation remain independent of evaluation data
The paper grounds preferred responses in external fact-checked sources from credible outlets and treats raw LLM outputs as non-preferred responses, then augments the standard DPO loss with two new scalars (Actuality and Finesse). No equation or derivation step reduces the claimed quality gains to a quantity defined by the same fitted data or by a self-citation chain whose content is itself unverified. Curriculum ordering and model choices are presented as experimental controls rather than as outputs derived from the target metrics. The central claim therefore retains independent content from the experimental results on Mistral, Llama, Gemma, mBART and mT5.
Axiom & Free-Parameter Ledger
free parameters (2)
- Actuality
- Finesse
axioms (1)
- Domain assumption: Fact-checked explanations from credible sources serve as preferred responses while LLM outputs serve as non-preferred responses.
Lean theorems connected to this paper
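- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: $\mathcal{L}_{\text{Hin-DPO}}(\pi_\theta; \pi_{\text{ref}}) = -\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}\left[\log \sigma\left(\beta \cdot S(x, y_w, y_l)\right)\right]$, with $S$ incorporating $(1 + s_w)$ and $\max(0.01, s_l)$, scaled by the Finesse variance $v + \epsilon$.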
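The passage above names the ingredients of $S$ but not how they combine, so the following is only one hypothetical reading, written to make the roles of the terms concrete. The specific composition is an assumption made here and should not be taken as the paper's equation: `(1 + s_w)` weights the preferred log-ratio, `max(0.01, s_l)` weights the non-preferred log-ratio, and the result is divided by `v + eps`.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def hin_dpo_loss_sketch(chosen_logratio: float, rejected_logratio: float,
                        s_w: float, s_l: float, v: float,
                        beta: float = 0.1, eps: float = 1e-8) -> float:
    """Per-example loss -log(sigmoid(beta * S)) under an ASSUMED form of S.

    chosen_logratio / rejected_logratio: log(pi_theta / pi_ref) for y_w and y_l.
    s_w, s_l: scores for the preferred and non-preferred response.
    v: the Finesse variance term.
    """
    # ASSUMPTION: how the named terms combine is not given in the source.
    weighted_margin = (1.0 + s_w) * chosen_logratio - max(0.01, s_l) * rejected_logratio
    s = weighted_margin / (v + eps)
    return -log_sigmoid(beta * s)
```

Under this reading, setting `s_w = 0`, `s_l = 1`, `v = 1`, and `eps = 0` recovers the standard DPO loss, which gives a convenient sanity check for any ablation against the unmodified objective.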
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- BhashaSutra: A Task-Centric Unified Survey of Indian NLP Datasets, Corpora, and Resources
  A unified survey that consolidates Indian NLP resources by task, language, domain, and modality while identifying gaps in coverage and generalization.