Pixels to Graphs by Associative Embedding
abstract
Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Contrastive Multi-Modal Hypergraph Reasoning for 3D Crowd Mesh Recovery
Contrastive multi-modal hypergraph reasoning fuses semantic, geometric, and pose cues to achieve state-of-the-art 3D crowd mesh recovery under severe occlusions.