Foundation models in robotics: Applications, challenges, and the future

Roya Firoozi et al. · 2023 · arXiv 2312.07843

9 Pith papers cite this work. Polarity classification is still in progress.

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Runtime Monitoring of Perception-Based Autonomous Systems via Embedding Temporal Logic

cs.LG · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

Embedding Temporal Logic (ETL) performs runtime monitoring directly in learned embedding spaces using distance-based predicates composed with temporal operators, supported by conformal calibration for reliable predicate evaluation.

SemiFA: An Agentic Multi-Modal Framework for Autonomous Semiconductor Failure Analysis Report Generation

cs.CV · 2026-04-14 · unverdicted · novelty 7.0

SemiFA is a four-agent LangGraph pipeline that combines DINOv2 and LLaVA image analysis with SECS/GEM telemetry and vector retrieval to produce complete failure analysis (FA) reports in 48 seconds.

KITE: Keyframe-Indexed Tokenized Evidence for VLM-Based Robot Failure Analysis

cs.RO · 2026-04-08 · unverdicted · novelty 7.0

KITE is a training-free method that uses keyframe-indexed tokenized evidence, including bird's-eye-view (BEV) schematics, to enhance VLM performance on robot failure detection, identification, localization, explanation, and correction.

ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation

cs.RO · 2024-09-03 · conditional · novelty 7.0

ReKep encodes robotic tasks as optimizable Python functions over 3D keypoints that are generated automatically from language and RGB-D input, enabling real-time hierarchical planning on single- and dual-arm platforms without task-specific data.

Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies

cs.RO · 2026-03-12 · unverdicted · novelty 6.0

Q-DIG applies quality diversity optimization with vision-language models to generate diverse adversarial instructions that reveal VLA robot failures and enable robustness improvements via fine-tuning.

A Survey on Vision-Language-Action Models for Embodied AI

cs.RO · 2024-05-23 · unverdicted · novelty 6.0

This is the first survey on vision-language-action models, providing a taxonomy organized into three lines of research, plus summaries of datasets, simulators, benchmarks, challenges, and future directions in embodied AI.

Large Language Models for Multi-Robot Systems: A Survey

cs.RO · 2025-02-06 · unverdicted · novelty 4.0

A survey that categorizes LLM uses in multi-robot systems across task allocation, motion planning, action generation, and human interaction, while noting challenges and future research opportunities.

A Tutorial on World Models and Physical AI

cs.AI · 2026-06-11 · unverdicted · novelty 2.0

A tutorial that unifies explicit and implicit world models through shared predictive structure for applications in physical AI such as robotics.

Vision-Language-Action Jump-Starting for Reinforcement Learning Robotic Agents

cs.LG · 2026-04-15

citing papers explorer

Vision-Language-Action Jump-Starting for Reinforcement Learning Robotic Agents

cs.LG · 2026-04-15 · unreviewed · ref 9
