pith. machine review for the scientific record. sign in

arxiv: 2411.13311 · v1 · submitted 2024-11-20 · 💻 cs.CV · cs.AI

Recognition: unknown

A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data

Authors on Pith no claims yet
classification 💻 cs.CV cs.AI
keywords radarcameradetectionbirdcamerasdatafeaturesfusion
0
0 comments X
read the original abstract

Cameras can be used to perceive the environment around the vehicle, while affordable radar sensors are popular in autonomous driving systems as they can withstand adverse weather conditions unlike cameras. However, radar point clouds are sparser with low azimuth and elevation resolution that lack semantic and structural information of the scenes, resulting in generally lower radar detection performance. In this work, we directly use the raw range-Doppler (RD) spectrum of radar data, thus avoiding radar signal processing. We independently process camera images within the proposed comprehensive image processing pipeline. Specifically, first, we transform the camera images to Bird's-Eye View (BEV) Polar domain and extract the corresponding features with our camera encoder-decoder architecture. The resultant feature maps are fused with Range-Azimuth (RA) features, recovered from the RD spectrum input from the radar decoder to perform object detection. We evaluate our fusion strategy with other existing methods not only in terms of accuracy but also on computational complexity metrics on RADIal dataset.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. REFNet++: Multi-Task Efficient Fusion of Camera and Radar Sensor Data in Bird's-Eye Polar View

    cs.CV 2026-05 unverdicted novelty 4.0

    REFNet++ aligns raw camera images and radar range-Doppler data into a shared bird's-eye polar view using variational encoders for multi-task vehicle detection and free space segmentation on the RADIal dataset.