Human-AI hybrids gain only 0.4 percentage points over AI alone on diverse tasks, because confidence routing fails to identify the small set of cases where humans can correct AI errors.
Selective Classification for Deep Neural Networks
2 Pith papers cite this work. Polarity classification is still indexing.