COMODO is a cross-modal self-supervised distillation framework that uses a frozen video encoder and dynamic instance queue to align video and IMU embeddings, improving IMU-based egocentric HAR to match supervised performance.
Spatial-temporal masked autoencoder for multi-device wearable human activity recognition
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CV (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition