stratum: A System Infrastructure for Massive Agent-Centric ML Workloads. arXiv:2603.03589
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Be Fair! Can Machine Learning Engineering Agents Adhere to Fairness Constraints?
Exploratory study finds MLE agents produce high-variance pipelines that underperform manual baselines on predictive quality and skin-tone fairness for melanoma classification despite targeted prompts.