Cost-Effective Model Evaluation with Meta-Learning
Pith reviewed 2026-05-25 04:42 UTC · model grok-4.3
The pith
Meta-learning over a pool of reference models enables accurate evaluation of new models on completely unlabeled data, with no per-model adaptation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MetaEvaluator leverages meta-learning over a pool of reference models to obtain a transferable initialization, enabling accurate evaluation of new models on entirely unlabeled datasets while amortizing cost across the pool and removing the need for per-model retraining; it is presented as the first model-agnostic framework capable of this.
What carries the argument
Meta-learning over a pool of reference models to obtain a transferable initialization for label-free evaluation of new models on unlabeled target data.
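To make "a transferable initialization learned over a pool" concrete, here is a minimal toy sketch. The abstract does not say which meta-learning algorithm MetaEvaluator uses, so this swaps in a Reptile-style first-order update purely for illustration. Each reference model is reduced to a hypothetical scalar optimum, and the names `inner_adapt`, `meta_learn_init`, and `reference_pool` are assumptions, not the paper's API.

```python
def inner_adapt(theta, center, lr=0.1, steps=5):
    """Take a few gradient steps on the toy loss 0.5 * (theta - center) ** 2."""
    for _ in range(steps):
        grad = theta - center          # derivative of the toy loss
        theta = theta - lr * grad
    return theta


def meta_learn_init(task_centers, meta_lr=0.1, epochs=200, theta0=0.0):
    """Reptile-style outer loop: move the shared init toward each adapted solution."""
    theta = theta0
    for _ in range(epochs):
        for center in task_centers:
            adapted = inner_adapt(theta, center)
            theta = theta + meta_lr * (adapted - theta)
    return theta


# Hypothetical pool: one scalar "optimum" per reference model (illustrative values).
reference_pool = [1.0, 2.0, 3.0]
init = meta_learn_init(reference_pool)
print(f"meta-learned init: {init:.3f}")
```

In this toy the learned initialization settles near the centre of the pool, which is the sense in which the meta-training cost is amortized: it is paid once and reused for every later model. Whether a real initialization transfers to models far from the pool is exactly the premise the review flags as load-bearing.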
If this is right
- Performance estimates for new models remain stable and accurate even when the target dataset has no labels at all.
- The computational and annotation cost of evaluation is shared across many models rather than repeated individually.
- The same initialization works across diverse model architectures and data modalities without modification.
- No additional fine-tuning or adaptation step is required when a new model is presented for evaluation.
Where Pith is reading between the lines
- If the initialization transfers reliably, the framework could support ongoing monitoring of deployed models on private unlabeled streams where labeling is prohibited.
- Similar amortization might apply to other post-training tasks such as model selection or drift detection on unlabeled data.
- The approach could reduce dependence on fixed labeled benchmarks by enabling evaluation on fresh, domain-specific unlabeled collections.
Load-bearing premise
Meta-learning over reference models yields a transferable initialization that generalizes to new model families, architectures, and modalities on completely unlabeled target data, with no per-model adaptation.
What would settle it
Apply the method to a model from a new architecture family and modality absent from the reference pool and compare its performance estimates against ground-truth accuracy obtained with labels; large systematic errors would falsify the claim.
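The check described above can be sketched as a comparison between label-free estimates and accuracies measured with labels. The numbers and the tolerance below are hypothetical placeholders, not results from the paper. Mean absolute error captures the overall size of the error, while the mean signed error shows whether it is systematic, meaning consistently too high or too low.

```python
def mean_absolute_error(estimated, actual):
    """Average magnitude of the estimation error."""
    return sum(abs(e - a) for e, a in zip(estimated, actual)) / len(actual)


def mean_signed_error(estimated, actual):
    """Average signed error; far from zero means a systematic bias."""
    return sum(e - a for e, a in zip(estimated, actual)) / len(actual)


# Hypothetical values for held-out models from an unseen family (illustrative only).
estimated_accuracy = [0.80, 0.70, 0.90]   # label-free estimates
true_accuracy = [0.75, 0.72, 0.85]        # measured with ground-truth labels

mae = mean_absolute_error(estimated_accuracy, true_accuracy)
bias = mean_signed_error(estimated_accuracy, true_accuracy)

TOLERANCE = 0.05  # assumed acceptance threshold, not taken from the paper
print(f"MAE={mae:.3f} bias={bias:+.3f} within_tolerance={mae <= TOLERANCE}")
```

A low MAE with a bias near zero on an unseen family would support the transfer claim. A large bias of consistent sign would be the kind of systematic error that falsifies it.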
Original abstract
The rapid growth of machine learning has produced an ever-expanding ecosystem of models, making it increasingly challenging to verify the reliability of newly released models on unseen, unlabeled data. Conventional evaluation pipelines depend on expensive annotation, repeated fine-tuning, or narrow assumptions that fail to transfer across model families. We present MetaEvaluator, a cost-effective, model-agnostic framework for rapid, label-free assessment of unseen models spanning diverse architectures and modalities. MetaEvaluator leverages meta-learning over a pool of reference models to obtain a transferable initialization, enabling accurate evaluation of new models while amortizing cost across the pool and removing the need for per-model retraining. To the best of our knowledge, this is the first model-agnostic framework capable of evaluating new models on entirely unlabeled datasets. Extensive experiments show that MetaEvaluator produces stable and accurate performance estimates at substantially reduced cost compared to conventional approaches, making scalable benchmarking of emerging models on unlabeled data practical.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MetaEvaluator, a meta-learning framework that trains a transferable initialization over a pool of reference models to enable label-free performance estimation for entirely new, unseen models on unlabeled target data. It claims to be the first model-agnostic method for this task, amortizing evaluation cost across the reference pool without requiring per-model retraining or labels, and asserts that extensive experiments demonstrate stable, accurate estimates at substantially lower cost than conventional approaches.
Significance. If the transferability claim holds, the work could meaningfully lower the barrier to benchmarking new models on unlabeled data across architectures and modalities. The amortization of meta-learning cost and removal of annotation requirements address a practical pain point in ML deployment and evaluation pipelines.
major comments (2)
- [Abstract] The central claim that a single meta-learned initialization generalizes to 'entirely new model families, architectures, and modalities' on completely unlabeled target data without any per-model adaptation or labels is presented without any supporting cross-family, cross-architecture, or cross-modal results, quantitative metrics, or description of the reference-pool diversity; this assumption is load-bearing for the model-agnostic and label-free assertions.
- [Abstract] The statement that 'extensive experiments show that MetaEvaluator produces stable and accurate performance estimates' is made without reference to any datasets, baselines, evaluation metrics, number of trials, or numerical results, so it is impossible to determine whether the data actually support the accuracy and cost-reduction claims.
minor comments (1)
- [Abstract] The abstract uses the phrase 'to the best of our knowledge' for the 'first model-agnostic framework' claim but provides no comparison table or citation list to prior meta-learning or label-free evaluation methods.
Simulated Author's Rebuttal
We thank the referee for these comments on the abstract. We agree that the abstract would be strengthened by incorporating more specific details from the experiments and will revise it accordingly. We address each point below.
Point-by-point responses
- Referee: [Abstract] The central claim that a single meta-learned initialization generalizes to 'entirely new model families, architectures, and modalities' on completely unlabeled target data without any per-model adaptation or labels is presented without any supporting cross-family, cross-architecture, or cross-modal results, quantitative metrics, or description of the reference-pool diversity; this assumption is load-bearing for the model-agnostic and label-free assertions.
  Authors: The manuscript body (Sections 4–5) contains the supporting cross-family, cross-architecture, and cross-modal results, including quantitative metrics and a description of the reference-pool composition and diversity. The abstract is a concise summary of these findings. We will revise the abstract to briefly note the reference-pool diversity and key generalization metrics so that the model-agnostic and label-free claims are better grounded within the abstract itself.
  Revision: yes
- Referee: [Abstract] The statement that 'extensive experiments show that MetaEvaluator produces stable and accurate performance estimates' is made without reference to any datasets, baselines, evaluation metrics, number of trials, or numerical results, so it is impossible to determine whether the data actually support the accuracy and cost-reduction claims.
  Authors: The full manuscript details the datasets, baselines, metrics (e.g., MAE, correlation), number of trials, and numerical results supporting stability and accuracy. We will revise the abstract to include concise references to the experimental scope (e.g., number of datasets and main performance metrics) to make these claims more verifiable from the abstract alone.
  Revision: yes
Circularity Check
No significant circularity; framework proposal is empirically grounded rather than self-referential by construction
The paper presents MetaEvaluator as a meta-learning framework trained on a reference pool of models and then applied to held-out new models on unlabeled data. This follows the standard meta-learning train/test split on distinct model sets and does not reduce any claimed performance estimate to a fitted parameter or self-citation by definition. No equations, uniqueness theorems, or ansatzes are shown that would make the output equivalent to the input by construction. The transfer claim to new families/modalities is an empirical assertion whose validity is independent of the meta-training procedure itself.
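The split on distinct model sets described above can be sketched as a leave-one-family-out protocol: every model from one family is held out, and the rest form the reference pool. The model names and family labels below are hypothetical, and the abstract does not state that the paper uses exactly this protocol.

```python
from collections import defaultdict


def leave_one_family_out(models):
    """Yield (held_out_family, reference_pool, held_out_models) for each family."""
    by_family = defaultdict(list)
    for name, family in models:
        by_family[family].append(name)
    for family in sorted(by_family):
        # The reference pool contains every model outside the held-out family.
        pool = [n for f, names in by_family.items() if f != family for n in names]
        yield family, sorted(pool), sorted(by_family[family])


# Hypothetical model zoo: (model name, architecture family).
model_zoo = [
    ("cnn_small", "cnn"),
    ("cnn_large", "cnn"),
    ("vit_base", "vision_transformer"),
    ("lm_small", "language_model"),
    ("lm_large", "language_model"),
]

splits = list(leave_one_family_out(model_zoo))
for family, pool, held_out in splits:
    print(f"held out: {family:20s} pool={pool} test={held_out}")
```

Because no held-out model shares a family with the pool, any estimate produced for it cannot be a fitted value recovered from meta-training, which is the sense in which the review finds no circularity by construction.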