The paper proves W[1]-hardness parameterized by dimension d for positivity, zonotope containment, max approximation, and L_p-Lipschitz constants in 2- and 3-layer ReLU networks, showing enumeration methods are optimal under ETH.
End to End Learning for Self-Driving Cars
Mixed citation behavior. Most common role is background (50%).
abstract
We trained a convolutional neural network (CNN) to map raw pixels from a single front-facing camera directly to steering commands. This end-to-end approach proved surprisingly powerful. With minimum training data from humans the system learns to drive in traffic on local roads with or without lane markings and on highways. It also operates in areas with unclear visual guidance such as in parking lots and on unpaved roads. The system automatically learns internal representations of the necessary processing steps such as detecting useful road features with only the human steering angle as the training signal. We never explicitly trained it to detect, for example, the outline of roads. Compared to explicit decomposition of the problem, such as lane marking detection, path planning, and control, our end-to-end system optimizes all processing steps simultaneously. We argue that this will eventually lead to better performance and smaller systems. Better performance will result because the internal components self-optimize to maximize overall system performance, instead of optimizing human-selected intermediate criteria, e.g., lane detection. Such criteria understandably are selected for ease of human interpretation which doesn't automatically guarantee maximum system performance. Smaller networks are possible because the system learns to solve the problem with the minimal number of processing steps. We used an NVIDIA DevBox and Torch 7 for training and an NVIDIA DRIVE(TM) PX self-driving car computer also running Torch 7 for determining where to drive. The system operates at 30 frames per second (FPS).
representative citing papers
Diffusion Policy models robot actions as a conditional diffusion process, outperforming prior state-of-the-art methods by 46.9% on average across 12 manipulation tasks from four benchmarks.
LOSCAR-SGD combines local updates, sparse model averaging, and communication-computation overlap with a delay-corrected merge rule, providing convergence rates for smooth non-convex objectives under worker heterogeneity.
Ringmaster LMO extends delay-thresholding from ASGD to LMO-based momentum updates, providing convergence guarantees under (L0, L1)-smoothness and time-complexity bounds that recover optimal rates in the Euclidean case.
4DLidarOpen is a new open dataset providing synchronized 4D FMCW Lidar velocity measurements, multi-Lidar and camera data, and 3D bounding-box annotations with track IDs to support benchmarks on 3D detection, BEV segmentation, flow prediction, and motion forecasting.
Bench2Drive-Robust is a new closed-loop benchmark that evaluates end-to-end autonomous driving models under deployment perturbations from camera failures, ego-state errors, and compute delays, showing substantial performance degradation beyond image-level tests.
DRATS derives a minimax objective from a feasibility formulation of MTRL to adaptively sample tasks with the largest return gaps, leading to better worst-task performance on MetaWorld benchmarks.
Sub-network Laplace approximations always underestimate full-model predictive variance, and two new gradient-based and greedy selection rules provide theoretically grounded improvements.
TCD-Arena is a new customizable testing framework that runs millions of experiments to map how 33 different assumption violations affect time series causal discovery methods and shows ensembles can boost overall robustness.
ST-BCP tightens the coverage bound in Backward Conformal Prediction by applying a computable data-dependent transformation to nonconformity scores, reducing the average gap from 4.20% to 1.12% on benchmarks while proving superiority over the identity baseline.
A training-free method using Fourier-parameterized star-convex contours optimized via gradients to generate compact, faithful visual attributions for image classifiers on benchmarks like ImageNet.
DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.
Dywave uses wavelet hierarchical decomposition to create event-aligned compact token sequences for heterogeneous IoT signals, yielding up to 12% accuracy gains and 75% shorter inputs on mainstream sequence models across five datasets.
Smaller end-to-end autonomous driving models achieve optimal 3-second trajectory prediction accuracy at lower or intermediate temporal sampling frequencies, whereas larger VLA-style models perform best at the highest frequencies across Waymo, nuScenes, and PAVE datasets.
ReflectDrive-2 combines masked discrete diffusion with RL-aligned self-editing to generate and refine driving trajectories, reaching 91.0 PDMS on NAVSIM camera-only and 94.8 in best-of-6.
A broad empirical benchmark shows how 15 existing test selection metrics perform for fault detection, performance estimation, and retraining under corrupted, adversarial, temporal, natural, and label shifts across image, text, and Android data.
FingerViP equips each finger with a miniature camera and trains a multi-view diffusion policy that achieves 80.8% success on real-world dexterous tasks previously limited by wrist-camera occlusion.
MVAdapt conditions end-to-end autonomous driving policies on explicit vehicle physics to achieve better zero-shot transfer and few-shot calibration across different vehicles in CARLA simulation.
MOSAIC is a scaling-aware data selection framework that outperforms baselines in training end-to-end autonomous driving planners, achieving comparable or better EPDMS scores with up to 80% less data.
Safety-aware metrics and losses for 3D detection improve critical error handling in autonomous vehicle perception across single-vehicle, cooperative, and end-to-end settings.
SutureFormer models needle tip movement in video as sequential pixel-space actions via goal-conditioned offline RL with spline-based reward densification, cutting average displacement error by 58.6% on a new 1,158-trajectory kidney suturing dataset.
The paper introduces Hyper Diffusion Planner (HDP), a diffusion-based E2E AD framework that identifies insights on loss space, trajectory representation and data scaling, adds RL post-training, and reports 10x performance gains over 200 km of real-world testing across 6 scenarios.
Reducing expert-student asymmetries in visibility, uncertainty, and route specification enables a new TransFuser v6 policy that reaches 95 DS on Bench2Drive and more than doubles prior scores on Longest6 v2 and Town13.
Alpamayo-R1 introduces a VLA model with a Chain of Causation dataset and multi-stage SFT-plus-RL training that reports 12% better planning accuracy and 35% fewer close encounters versus trajectory-only baselines in driving tasks.
citing papers explorer
- Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Diffusion Policy models robot actions as a conditional diffusion process, outperforming prior state-of-the-art methods by 46.9% on average across 12 manipulation tasks from four benchmarks.
- 4DLidarOpen: An Open 4D FMCW Lidar Dataset for Motion-Aware Autonomous Driving
4DLidarOpen is a new open dataset providing synchronized 4D FMCW Lidar velocity measurements, multi-Lidar and camera data, and 3D bounding-box annotations with track IDs to support benchmarks on 3D detection, BEV segmentation, flow prediction, and motion forecasting.
- Bench2Drive-Robust: Benchmarking Closed-Loop Autonomous Driving under Deployment Perturbations
Bench2Drive-Robust is a new closed-loop benchmark that evaluates end-to-end autonomous driving models under deployment perturbations from camera failures, ego-state errors, and compute delays, showing substantial performance degradation beyond image-level tests.
- Steering Your Diffusion Policy with Latent Space Reinforcement Learning
DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.
- ReflectDrive-2: Reinforcement-Learning-Aligned Self-Editing for Discrete Diffusion Driving
ReflectDrive-2 combines masked discrete diffusion with RL-aligned self-editing to generate and refine driving trajectories, reaching 91.0 PDMS on NAVSIM camera-only and 94.8 in best-of-6.
- FingerViP: Learning Real-World Dexterous Manipulation with Fingertip Visual Perception
FingerViP equips each finger with a miniature camera and trains a multi-view diffusion policy that achieves 80.8% success on real-world dexterous tasks previously limited by wrist-camera occlusion.
- MVAdapt: Zero-Shot Multi-Vehicle Adaptation for End-to-End Autonomous Driving
MVAdapt conditions end-to-end autonomous driving policies on explicit vehicle physics to achieve better zero-shot transfer and few-shot calibration across different vehicles in CARLA simulation.
- SutureFormer: Learning Surgical Trajectories via Goal-conditioned Offline RL in Pixel Space
SutureFormer models needle tip movement in video as sequential pixel-space actions via goal-conditioned offline RL with spline-based reward densification, cutting average displacement error by 58.6% on a new 1,158-trajectory kidney suturing dataset.
- Unleashing the Potential of Diffusion Models for End-to-End Autonomous Driving
The paper introduces Hyper Diffusion Planner (HDP), a diffusion-based E2E AD framework that identifies insights on loss space, trajectory representation and data scaling, adds RL post-training, and reports 10x performance gains over 200 km of real-world testing across 6 scenarios.
- Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail
Alpamayo-R1 introduces a VLA model with a Chain of Causation dataset and multi-stage SFT-plus-RL training that reports 12% better planning accuracy and 35% fewer close encounters versus trajectory-only baselines in driving tasks.
- Octo: An Open-Source Generalist Robot Policy
Octo is an open-source transformer-based generalist robot policy pretrained on 800k trajectories that serves as an effective initialization for finetuning across diverse robotic platforms.
- 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations
3D Diffuser Actor unifies diffusion policies with 3D scene features to set new state-of-the-art results on RLBench and CALVIN robot benchmarks.
- Anomaly-Informed Confidence Calibration for Vision-Based Safety Prediction
Fusing perceptual and dynamics anomaly scores enables online temperature scaling that cuts expected calibration error by 37% on physical DonkeyCar tests with four unseen anomaly types.
- Reliable and Real-Time Highway Trajectory Planning via Hybrid Learning-Optimization Frameworks
Hybrid learning-optimization framework for highway trajectory planning that reports over 97% scenario success rate and 54 ms average cycle time on the HighD dataset while enforcing formal safety via MIQP.
- NeuroTrajectory: A Neuroevolutionary Approach to Local State Trajectory Learning for Autonomous Vehicles
NeuroTrajectory is a neuroevolutionary method that trains deep neural networks via genetic algorithms to estimate multi-objective optimal state trajectories over a finite horizon for autonomous vehicle motion planning.
- Multimodal embodiment-aware navigation transformer
ViLiNT improves goal-conditioned navigation success rates by 166% on average over vision-only baselines across simulations and real rover tests by combining multimodal sensing with embodiment-conditioned diffusion trajectories and clearance scoring.
- ADAPS: Autonomous Driving Via Principled Simulations
ADAPS generates accident data via simulations and employs a memory-enabled hierarchical policy with efficient online learning to produce robust autonomous driving controllers tested in simulation.
- Multi-Task Regression-based Learning for Autonomous Unmanned Aerial Vehicle Flight Control within Unstructured Outdoor Environments
End-to-end multi-task regression learns flight commands for UAVs to explore unstructured forest environments from vision alone, outperforming pose-estimation baselines in simulation.
- A Hierarchical Architecture for Sequential Decision-Making in Autonomous Driving using Deep Reinforcement Learning
A hierarchical DRL architecture generates lane-change commands from occupancy grids for stochastic highway driving and claims improved reliability over end-to-end methods.
- An Introduction to Deep Reinforcement and Imitation Learning
The paper delivers a concise, self-contained tutorial on foundational DRL algorithms including REINFORCE and PPO and DIL methods including behavioral cloning, DAgger, and GAIL for embodied agents.
- Lost in Fog: Sensor Perturbations Expose Reasoning Fragility in Driving VLAs
- State-Conditional Adversarial Learning: An Off-Policy Visual Domain Transfer Method for End-to-End Imitation Learning