arxiv: 1906.00634 · v1 · pith:ICRH5MRRnew · submitted 2019-06-03 · 💻 cs.CV · cs.AI

How Much Does Audio Matter to Recognize Egocentric Object Interactions?

classification 💻 cs.CV cs.AI
keywords audio, classification, interactions, action, egocentric, model, object, verb

Sounds are an important source of information about our daily interactions with objects. For instance, a significant number of people can discern the temperature of water that is being poured just by using their sense of hearing. However, only a few works have explored the use of audio for the classification of object interactions, either in conjunction with vision or as a single modality. In this preliminary work, we propose an audio model for egocentric action recognition and explore its usefulness on each part of the problem (noun, verb, and action classification). Our model achieves a competitive result in verb classification (34.26% accuracy) on a standard benchmark with respect to vision-based state-of-the-art systems, using a comparatively lighter architecture.
