CAD2RL: Real Single-Image Flight without a Single Real Image · 2016 · cs.LG · arXiv 1611.04201

10 Pith papers cite this work. Polarity classification is still indexing.

abstract

Deep reinforcement learning has emerged as a promising and powerful technique for automatically acquiring control policies that can process raw sensory inputs, such as images, and perform complex behaviors. However, extending deep RL to real-world robotic tasks has proven challenging, particularly in safety-critical domains such as autonomous flight, where a trial-and-error learning process is often impractical. In this paper, we explore the following question: can we train vision-based navigation policies entirely in simulation, and then transfer them into the real world to achieve real-world flight without a single real training image? We propose a learning method that we call CAD$^2$RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models. Our method uses single RGB images from a monocular camera, without needing to explicitly reconstruct the 3D geometry of the environment or perform explicit motion planning. Our learned collision avoidance policy is represented by a deep convolutional neural network that directly processes raw monocular images and outputs velocity commands. This policy is trained entirely on simulated images, with a Monte Carlo policy evaluation algorithm that directly optimizes the network's ability to produce collision-free flight. By highly randomizing the rendering settings for our simulated training set, we show that we can train a policy that generalizes to the real world, without requiring the simulator to be particularly realistic or high-fidelity. We evaluate our method by flying a real quadrotor through indoor environments, and further evaluate the design choices in our simulator through a series of ablation studies on depth prediction. For supplementary video see: https://youtu.be/nXBWmzFrj5s
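
The two ideas the abstract combines, randomized rendering settings and Monte Carlo policy evaluation of collision-free flight, can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the rendering fields (`wall_texture_id`, `light_intensity`, and so on), the three-action heading set, and the collision probabilities in `toy_rollout` are all hypothetical stand-ins chosen only to make the example runnable. The sketch estimates, by repeated simulated rollouts, how likely each first action is to lead to collision-free flight.

```python
import random


def sample_render_settings(rng):
    """Draw one randomized rendering configuration (all field names are hypothetical)."""
    return {
        "wall_texture_id": rng.randrange(200),        # which texture to apply to the walls
        "light_intensity": rng.uniform(0.2, 2.0),     # relative brightness
        "light_azimuth_deg": rng.uniform(0.0, 360.0), # light direction
        "camera_fov_deg": rng.uniform(60.0, 110.0),   # camera field of view
    }


def toy_rollout(action, rng, horizon=5):
    """Toy stand-in for a simulator rollout; returns 1.0 if no collision occurred.

    `action` is a heading in {-1, 0, +1}. An obstacle is assumed to sit straight
    ahead, so flying straight (0) collides more often than turning.
    """
    p_collide_per_step = 0.30 if action == 0 else 0.05
    for _ in range(horizon):
        if rng.random() < p_collide_per_step:
            return 0.0
    return 1.0


def mc_collision_free_estimate(action, rng, n_rollouts=2000):
    """Monte Carlo estimate of the probability that `action` leads to collision-free flight."""
    total = 0.0
    for _ in range(n_rollouts):
        # A real renderer would consume these settings to produce the training image;
        # the toy rollout ignores them.
        _settings = sample_render_settings(rng)
        total += toy_rollout(action, rng)
    return total / n_rollouts


rng = random.Random(0)
estimates = {a: mc_collision_free_estimate(a, rng) for a in (-1, 0, 1)}
best_action = max(estimates, key=estimates.get)
print(estimates, best_action)
```

In a learning setup of the kind the abstract describes, estimates like these would serve as training targets for a network that sees only the rendered image; here they are computed directly so that the evaluation step can be read in isolation.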

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.

Learning When to Stop: Selective Imitation Learning Under Arbitrary Dynamics Shift

cs.LG · 2026-05-09 · unverdicted · novelty 7.0 · 2 refs

SeqRejectron constructs a stopping rule with a small set of validator policies to achieve horizon-free sample complexity for selective imitation learning under arbitrary dynamics shifts.

Dream to Fly: Model-Based Reinforcement Learning for Vision-Based Drone Flight

cs.RO · 2025-01-24 · unverdicted · novelty 6.0

DreamerV3 enables pixel-to-control policies for drone racing that reach 9 m/s in both simulation and real hardware-in-the-loop tests.

RoboNet: Large-Scale Multi-Robot Learning

cs.RO · 2019-10-24 · conditional · novelty 6.0

RoboNet is a multi-robot video dataset that enables pre-training of vision-based manipulation models which, after fine-tuning on a new robot, outperform robot-specific training that uses 4-20 times more data.

Environment Probing Interaction Policies

cs.RO · 2019-07-26 · unverdicted · novelty 6.0

EPI policies use a transition-predictability reward to probe environments and condition task policies, outperforming standard generalization methods on novel test environments.

Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation

cs.CV · 2026-04-23 · unverdicted · novelty 6.0

Synthetic data complements real data in diffusion-based controllable human video generation, with effective sample selection improving motion realism, temporal consistency, and identity preservation.

NavRL++: A System-Level Framework for Improving Sim-to-Real Transfer in Reinforcement Learning-Based Robot Navigation

cs.RO · 2026-05-15 · unverdicted · novelty 5.0

NavRL++ improves sim-to-real transfer for RL navigation via empirical analysis of perturbations, perturbation-aware fine-tuning, and a Transformer temporal policy, with real-world validation showing outperformance over learning baselines and parity with optimization planners in static cases.

Agent AI: Surveying the Horizons of Multimodal Interaction

cs.AI · 2024-01-07 · unverdicted · novelty 4.0

The paper defines Agent AI as interactive multimodal systems that perceive grounded data and generate embodied actions, arguing this approach can mitigate hallucinations in foundation models.

Multi-Task Regression-based Learning for Autonomous Unmanned Aerial Vehicle Flight Control within Unstructured Outdoor Environments

cs.RO · 2019-07-18 · unverdicted · novelty 4.0

End-to-end multi-task regression learns flight commands for UAVs to explore unstructured forest environments from vision alone, outperforming pose-estimation baselines in simulation.

Improved Reinforcement Learning through Imitation Learning Pretraining Towards Image-based Autonomous Driving

cs.LG · 2019-07-16 · unverdicted · novelty 3.0

Imitation learning pretraining of a ResNet-34 DDPG agent improves performance on image-based autonomous driving in simulation over pure IL or pure RL.

citing papers explorer

Showing 10 of 10 citing papers.

Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling cs.LG · 2026-05-14 · unverdicted · none · ref 128 · internal anchor
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
Learning When to Stop: Selective Imitation Learning Under Arbitrary Dynamics Shift cs.LG · 2026-05-09 · unverdicted · none · ref 31 · 2 links · internal anchor
SeqRejectron constructs a stopping rule with a small set of validator policies to achieve horizon-free sample complexity for selective imitation learning under arbitrary dynamics shifts.
Dream to Fly: Model-Based Reinforcement Learning for Vision-Based Drone Flight cs.RO · 2025-01-24 · unverdicted · none · ref 39 · internal anchor
DreamerV3 enables pixel-to-control policies for drone racing that reach 9 m/s in both simulation and real hardware-in-the-loop tests.
RoboNet: Large-Scale Multi-Robot Learning cs.RO · 2019-10-24 · conditional · none · ref 1 · internal anchor
RoboNet is a multi-robot video dataset that enables pre-training of vision-based manipulation models which, after fine-tuning on a new robot, outperform robot-specific training that uses 4-20 times more data.
Environment Probing Interaction Policies cs.RO · 2019-07-26 · unverdicted · none · ref 21 · internal anchor
EPI policies use a transition-predictability reward to probe environments and condition task policies, outperforming standard generalization methods on novel test environments.
Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation cs.CV · 2026-04-23 · unverdicted · none · ref 29
Synthetic data complements real data in diffusion-based controllable human video generation, with effective sample selection improving motion realism, temporal consistency, and identity preservation.
NavRL++: A System-Level Framework for Improving Sim-to-Real Transfer in Reinforcement Learning-Based Robot Navigation cs.RO · 2026-05-15 · unverdicted · none · ref 34 · internal anchor
NavRL++ improves sim-to-real transfer for RL navigation via empirical analysis of perturbations, perturbation-aware fine-tuning, and a Transformer temporal policy, with real-world validation showing outperformance over learning baselines and parity with optimization planners in static cases.
Agent AI: Surveying the Horizons of Multimodal Interaction cs.AI · 2024-01-07 · unverdicted · none · ref 263 · internal anchor
The paper defines Agent AI as interactive multimodal systems that perceive grounded data and generate embodied actions, arguing this approach can mitigate hallucinations in foundation models.
Multi-Task Regression-based Learning for Autonomous Unmanned Aerial Vehicle Flight Control within Unstructured Outdoor Environments cs.RO · 2019-07-18 · unverdicted · none · ref 22 · internal anchor
End-to-end multi-task regression learns flight commands for UAVs to explore unstructured forest environments from vision alone, outperforming pose-estimation baselines in simulation.
Improved Reinforcement Learning through Imitation Learning Pretraining Towards Image-based Autonomous Driving cs.LG · 2019-07-16 · unverdicted · none · ref 4 · internal anchor
Imitation learning pretraining of a ResNet-34 DDPG agent improves performance on image-based autonomous driving in simulation over pure IL or pure RL.
