
arxiv: 1705.07115 · v3 · pith:5F5CARE3new · submitted 2017-05-19 · 💻 cs.CV

Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics

classification 💻 cs.CV
keywords learning, multi-task, regression, task, classification, deep, learn, loss

Numerous deep learning applications benefit from multi-task learning with multiple regression and classification objectives. In this paper we make the observation that the performance of such systems is strongly dependent on the relative weighting between each task's loss. Tuning these weights by hand is a difficult and expensive process, making multi-task learning prohibitive in practice. We propose a principled approach to multi-task deep learning which weighs multiple loss functions by considering the homoscedastic uncertainty of each task. This allows us to simultaneously learn various quantities with different units or scales in both classification and regression settings. We demonstrate our model learning per-pixel depth regression, semantic and instance segmentation from a monocular input image. Perhaps surprisingly, we show our model can learn multi-task weightings and outperform separate models trained individually on each task.
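
The abstract does not state the weighting rule itself. As a minimal sketch, assuming the commonly used log-variance form of homoscedastic uncertainty weighting, each task loss L_i is scaled by a learned precision exp(-s_i) and a regularizing term s_i is added, where s_i = log(sigma_i^2). The function name, the 0.5 factors (chosen to match the Gaussian regression case), and the plain-Python setting are illustrative assumptions, not the authors' code; in a real model the s_i would be trainable parameters optimized jointly with the network weights.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with per-task log-variances (hypothetical helper).

    Assumed form: total = sum_i( 0.5 * exp(-s_i) * L_i + 0.5 * s_i ),
    where s_i = log(sigma_i^2) is the log-variance for task i.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task loss")

    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        precision = math.exp(-s)        # 1 / sigma_i^2: the task's weight
        total += 0.5 * precision * loss  # down-weight uncertain tasks
        total += 0.5 * s                 # penalty that stops sigma growing unboundedly
    return total


# With all log-variances at zero, every task has weight 0.5.
print(uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0]))  # 3.0
```

Under this assumed form, for a fixed task loss L the objective is minimized at s = log L, so a task with a larger residual loss receives a smaller weight without any hand tuning.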



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. CITYMPC: A Large-Scale Physics-Informed Benchmark and Tool for Generative Complete Multipath Wireless Channel Modeling

    eess.SP 2026-05 unverdicted novelty 6.0

    CITYMPC, a cVAE model, predicts full per-path multipath component parameters from POV images and height maps alone, matching ray-tracing accuracy with 1.29 dB power MAE and 7.25 ns delay MAE across 427k links in five ...

  2. Real-Time Hand Gesture Recognition: Integrating Skeleton-Based Data Fusion and Multi-Stream CNN

    cs.CV 2024-06 unverdicted novelty 5.0

    Skeleton data from hand gestures is fused into RGB images and classified by an e2eET multi-stream CNN, yielding competitive accuracy on five datasets and real-time operation on consumer hardware.