Recoverable Identifier
advisory
doi_compliance
recoverable_identifier
The DOI in the printed bibliography is fragmented by whitespace or line breaks. A longer candidate (10.18653/v1/2024.acl-long.511.URLhttps://aclanthology.org/2024.acl-long.511/.Yidong) was visible in the surrounding text but could not be confirmed against doi.org as printed.
Evidence text
URL https://openreview.net/forum?id=G0dksFayVq. Peiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Lingpeng Kong, Qi Liu, Tianyu Liu, and Zhifang Sui. Large language models are not fair evaluators. In Lun-Wei Ku, Andre Martins, and Vivek Srikumar, editors,Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (V olume 1: Long Papers), pages 9440–9450, Bangkok, Thailand, August 2024a. Association for Computational Linguistics. doi: 10.18653/v1/2024. acl-long.511. URLhttps://aclanthology.org/2024.acl-long.511/. Yidong Wang, Zhuohao Yu, Wenjin Yao, Zhengran Zeng, Linyi Yang, Cunxiang Wang, Hao Chen, Chaoya Jiang, Rui Xie, Jindong Wang, Xing Xie, Wei Ye, Shikun Zhang, and Yue Zhang. PandaLM: An automatic evaluation benchmark for LLM instruction tuning optimization. InThe Twelfth International Conference on Learning Representations, 2024b. URL https://openreview. net/forum?id=5Nn2BLV7SB. Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, and Wenhu Chen. Mmlu-pro: A more robust and challenging multi-task language understanding benchmark. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Information Processing Systems, volume 37, pages 95266–95290. Curran Associates, Inc., 2024c. doi: 10.52202/079017-3018. URL https://p
Evidence payload
{
"printed_excerpt": "URL https://openreview.net/forum?id=G0dksFayVq. Peiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Lingpeng Kong, Qi Liu, Tianyu Liu, and Zhifang Sui. Large language models are not fair evaluators. In Lun-Wei Ku,",
"reconstructed_doi": "10.18653/v1/2024.acl-long.511.URLhttps://aclanthology.org/2024.acl-long.511/.Yidong",
"ref_index": 15,
"resolved_title": null,
"verdict_class": "incontrovertible"
}
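The reconstructed candidate above runs past the end of the DOI because the printed text has a stray space inside the identifier and no space between "URL" and the link that follows. The sketch below shows one way a cleaner candidate might be recovered from the evidence text. It is an illustration under stated assumptions, not the checker's actual logic: the function name recover_doi, the two regular expressions, and the rule of cutting at the "URL" label are all hypothetical. Any candidate it returns would still have to be confirmed against doi.org.

```python
import re
from typing import Optional

# Assumption: a DOI starts with "10.", a 4-9 digit registrant code, then "/".
DOI_START = re.compile(r"10\.\d{4,9}/")
# Assumption: the bibliography style ends the DOI field with a "URL" label,
# which may be glued to the link that follows (e.g. "URLhttps://...").
URL_LABEL = re.compile(r"\s*URL\s*https?://")


def recover_doi(text: str) -> Optional[str]:
    """Return a candidate DOI from text where whitespace split the identifier."""
    match = DOI_START.search(text)
    if match is None:
        return None
    tail = text[match.start():]
    parts = URL_LABEL.split(tail, maxsplit=1)
    if len(parts) == 2:
        # Everything before the "URL" label belongs to the DOI field.
        field = parts[0]
    else:
        # No "URL" label: keep only the first unbroken token.
        field = tail.split()[0]
    # Rejoin fragments split by spaces or line breaks, then drop the
    # sentence punctuation that closes the bibliography field.
    candidate = re.sub(r"\s+", "", field)
    return candidate.rstrip(".,;")


printed = (
    "Association for Computational Linguistics. doi: 10.18653/v1/2024. "
    "acl-long.511. URLhttps://aclanthology.org/2024.acl-long.511/. Yidong Wang"
)
print(recover_doi(printed))  # 10.18653/v1/2024.acl-long.511
```

Cutting at the "URL" label rather than at the first whitespace is what lets the two fragments be rejoined. The trade-off is that a bibliography style without that label falls back to the first unbroken token, which would leave a fragmented DOI truncated.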