arxiv: 2605.03909 · v1 · submitted 2026-05-05 · 💻 cs.RO · cs.CV

Task-Aware Scanning Parameter Configuration for Robotic Inspection Using Vision Language Embeddings and Hyperdimensional Computing

Zhiling Chen, David Gorsich, Matthew P. Castanier, Yang Zhang, Jiong Tang, Farhad Imani

Pith reviewed 2026-05-07 15:26 UTC · model grok-4.3

keywords robotic inspection · laser profiling · scanning parameter configuration · hyperdimensional computing · vision language embeddings · task-aware sensing · multimodal dataset · Instruct-Obs2Param

The pith

A hyperdimensional computing system recommends optimal laser scanner settings from a natural-language inspection task and an initial image.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the problem of tuning coupled parameters in robotic laser profilers, which currently depends on manual trial-and-error and often produces saturated or missing measurements. It defines the task of predicting discrete configurations for sampling frequency, measurement range, exposure time, receiver dynamic range, and illumination, given only a natural-language inspection instruction and a pre-scan RGB observation. The authors release the Instruct-Obs2Param dataset that pairs such instructions and multi-view observations across 16 objects with canonical parameter regimes. They introduce ScanHD, which encodes the inputs with vision-language embeddings, binds them into a task-aware hyperdimensional code, and performs parameter-wise associative lookup in compact memories. The resulting decisions reach 92.7 percent average exact accuracy and 98.1 percent Win@1 accuracy while running at low latency, outperforming rule-based heuristics and larger multimodal models.

Core claim

ScanHD binds instruction and observation into a task-aware code using hyperdimensional computing and performs parameter-wise associative reasoning with compact memories to match discrete scanner regimes, achieving 92.7 percent average exact accuracy and 98.1 percent average Win@1 accuracy across the five parameters with strong cross-split generalization on Instruct-Obs2Param.

What carries the argument

ScanHD, a hyperdimensional computing framework that encodes instruction and observation embeddings, binds them into task-aware vectors, and retrieves each parameter setting through associative memory lookup.
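The encode, bind, and retrieve steps named here can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the dimensionality `D`, the random-projection encoder, binding by elementwise multiplication, and the names `encode`, `bind`, and `lookup` are all assumptions, and the stand-in embeddings are random numbers rather than vision-language model outputs.

```python
import math
import random

D = 2048  # hypervector dimensionality (assumed; not stated in this summary)

def random_projection(in_dim, out_dim, rng):
    """Random bipolar projection matrix, one row per output dimension."""
    return [[rng.choice((-1, 1)) for _ in range(in_dim)] for _ in range(out_dim)]

def encode(embedding, projection):
    """Project a real-valued embedding and take the sign (bipolar hypervector)."""
    return [1 if sum(w * x for w, x in zip(row, embedding)) >= 0 else -1
            for row in projection]

def bind(a, b):
    """Bind two bipolar hypervectors by elementwise multiplication."""
    return [x * y for x, y in zip(a, b)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def lookup(query, memory):
    """Return the label whose stored hypervector is most similar to the query."""
    return max(memory, key=lambda label: cosine(query, memory[label]))

rng = random.Random(0)
proj_text = random_projection(8, D, rng)
proj_image = random_projection(8, D, rng)

# Stand-in embeddings (hypothetical); a real system would use model outputs.
instruction = [rng.gauss(0, 1) for _ in range(8)]
observation = [rng.gauss(0, 1) for _ in range(8)]
code = bind(encode(instruction, proj_text), encode(observation, proj_image))

# One memory per scanner parameter; here a single toy memory with made-up labels.
other = [rng.choice((-1, 1)) for _ in range(D)]
exposure_memory = {"short": other, "long": code}
print(lookup(code, exposure_memory))
```

Keeping one small memory per parameter means each decision is a nearest-neighbour search over a handful of stored vectors, which is consistent with the low-latency claim but says nothing about how the paper actually builds those memories.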

If this is right

Robotic systems can configure laser profilers autonomously from task intent and scene context without manual tuning.
Sensor configuration becomes an adaptive decision variable that improves measurement fidelity for each inspection instruction.
Low-latency inference supports real-time deployment on robot-mounted hardware.
The method generalizes across object and illumination splits within the collected data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same binding mechanism could be applied to configure other robot sensors such as cameras or depth cameras for different tasks.
Replacing the discrete associative memories with continuous regression heads would allow the approach to handle non-discrete parameter spaces.
Online updates to the compact memories could let the system adapt when the robot encounters previously unseen objects.
The compact size of the memories makes the method suitable for edge devices where large multimodal models cannot run.

Load-bearing premise

The five discrete parameter regimes captured in the dataset are sufficient to represent optimal configurations for the stated inspection intents.
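To make the size of this premise concrete: the Figure 4 caption states that each of the five parameters is discretized into three options. A short sketch of the resulting action space follows; the parameter identifiers are paraphrased from the abstract, and the option indices 0 to 2 are placeholders because the actual option values are not given in this summary.

```python
from itertools import product

# Parameter names paraphrased from the abstract (identifiers are hypothetical).
PARAMETERS = (
    "sampling_frequency",
    "measurement_range",
    "exposure_time",
    "receiver_dynamic_range",
    "illumination",
)
OPTIONS_PER_PARAMETER = 3  # per the Figure 4 caption

# Placeholder option indices 0..2 stand in for the real, unstated option values.
configurations = list(product(range(OPTIONS_PER_PARAMETER), repeat=len(PARAMETERS)))
print(len(configurations))  # 243
```

Parameter-wise lookup makes five three-way decisions instead of one 243-way decision, so the premise amounts to assuming that one of these 243 joint settings is adequate for every inspection intent.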

What would settle it

Running the system on objects outside the original 16-object collection or under lighting conditions absent from the dataset and measuring whether exact accuracy falls below 80 percent.
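Short of collecting new objects, one way to approximate this test with existing data is a leave-one-object-out split, where every sample of one object is withheld from training. The sketch below is not the paper's protocol; the record layout and the `object_id` field are hypothetical.

```python
from collections import defaultdict

def leave_one_object_out(samples):
    """Yield (held_out_object, train, test), withholding all samples of one object."""
    by_object = defaultdict(list)
    for sample in samples:
        by_object[sample["object_id"]].append(sample)
    for held_out in sorted(by_object):
        test = by_object[held_out]
        train = [s for obj, group in by_object.items() if obj != held_out for s in group]
        yield held_out, train, test

# Toy records with hypothetical field names: 4 objects, 3 views each.
samples = [{"object_id": f"obj{i % 4}", "view": i} for i in range(12)]
splits = list(leave_one_object_out(samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Exact accuracy per held-out object could then be compared against the 80 percent threshold stated above.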

Figures

Figures reproduced from arXiv: 2605.03909 by David Gorsich, Farhad Imani, Jiong Tang, Matthew P. Castanier, Yang Zhang, Zhiling Chen.

**Figure 1.** Instruction- and observation-dependent sensing parameter configuration in embodied inspection. (a) A detail-oriented inspection instruction combined with insufficient exposure time leads to missing surface geometry. (b) A global inspection instruction requires full measurement range, but an incorrect range setting results in clipped geometry. (c) When sensing parameters are selected in accordance with bot…

**Figure 2.** Hardware setup of the ScanBot system. A UR3 robotic arm is equipped with a Keyence LJ-X8200 laser profiler and an Intel RealSense D435i RGB-D camera mounted on the end-effector. A GoPro HERO8 captures third-person views from a fixed tripod. The entire setup operates within a black-curtained environment to ensure consistent and interference-free measurements. The experimental workspace is enclosed by black …

**Figure 3.** The dataset comprises 16 representative objects commonly encountered in robotic inspection, including consumer electronics (e.g., smartphones), printed circuit boards, GPU modules, mechanical tools, and calibration blocks. These objects exhibit diverse geometric scales, surface reflectivity, and structural complexity, posing varying challenges for scanning. Densely populated PCBs and GPU modules contain f…

**Figure 4.** Five key scanning parameters are discretized into three representative options each, forming a compact and interpretable action space. For each parameter, we explicitly indicate its primary and secondary driving factors, distinguishing whether it is mainly determined by inspection intent (instruction) or by observation-level cues such as surface reflectivity and brightness. Within appearance inspection, we…

**Figure 5.** The Data Evolution Flywheel framework. Left: intent-conditioned instruction instantiation based on structured inspection intent, multi-view observations, and scanning prior knowledge. Middle: consistency-driven checking, expert-in-the-loop calibration, and iterative instruction–parameter refinement. Right: representative instruction–observation–parameter instances distilled through the flywheel. triplets (…

**Figure 6.** Dataset statistics of Instruct-Obs2Param. (a) Distribution of synthesized instructions across different inspection task types. (b) Distribution of instructions across the 16 inspected objects.

**Figure 7.** Overview of the HDC learning procedure. (1) Encode raw data into hypervectors. (2) Hypervectors from the same class are aggregated to create class hypervectors. (3) Update class hypervectors in response to misclassifications. (4) Compare query hypervectors to class hypervectors via similarity during inference. defining the cosine similarity between two hypervectors as $\delta(\mathbf{h}_1, \mathbf{h}_2) = \dfrac{\mathbf{h}_1^{\top}\mathbf{h}_2}{\|\mathbf{h}_1\|_2\,\|\mathbf{h}_2\|_2}$ (15…

**Figure 8.** Overview of the proposed ScanHD framework. ScanHD consists of four stages. (1) Data Evolution Flywheel constructs high-quality instruction–observation–parameter training instances (with associated canonical intent labels) through knowledge distillation, sample calibration, and instance generation. (2) Encoding Phase maps a visual observation and a natural-language instruction into a unified symbolic hyperv…

**Figure 9.** System-level evaluation of all-parameter correctness across different methods. A prediction is considered correct only if all five scanning parameters are inferred correctly for a given instruction–observation pair. Radar plots report (a) Exact Accuracy and (b) Win@1 Accuracy under this all-parameter criterion. instruction; ResNet and ViT models trained on RGB observations are used to assess the contribut…

**Figure 10.** Prompt template used to evaluate the multimodal large language models for laser scanning parameter prediction. driven by inspection intent. For example, Logistic Regression and KNN achieve over 92% Exact Accuracy on sampling frequency and over 83% on measurement range, reflecting strong instruction-level regularities. However, their performance degrades on appearance-sensitive parameters. On exposure t…

**Figure 11.** Category-wise analysis of ScanHD across scanning parameters. Heatmaps show Exact Accuracy, Win@1 Accuracy, and F1 score, respectively. parameter-dependent generalization patterns. For object categories with complex geometry and heterogeneous materials, such as IC modules, PCBs, and GPUs, ScanHD achieves strong Exact and Win@1 Accuracy across all parameters, indicating effective transfer of instruction-c…

**Figure 12.** Data efficiency of ScanHD under limited supervision. Performance is evaluated by varying the fraction of training data from 20% to 100%. Curves report Exact Accuracy, Win@1 Accuracy, and F1 score for each scanning parameter.

**Figure 13.** Comparison of inference latency across different methods. marginal overhead relative to ScanHD. This observation confirms that the instruction-conditioned hyperdimensional inference in ScanHD introduces minimal computational burden beyond basic feature fusion. In contrast, multimodal large language models exhibit substantially higher inference latency. Qwen3-VL-4B-Instruct requires over an order of magn…
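The four-step HDC learning procedure in the Figure 7 caption (encode, aggregate, update on misclassification, compare by cosine similarity) can be sketched as below. This is a generic illustration under stated assumptions, not the authors' code: the perceptron-style update rule, the learning rate, the epoch count, and the toy data with labels `low` and `high` are all hypothetical, and step 1 is replaced by pre-made bipolar hypervectors.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two hypervectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def predict(memory, h):
    """Step 4: pick the class hypervector most similar to the query."""
    return max(memory, key=lambda c: cosine(h, memory[c]))

def train(data, dim, epochs=5, lr=1.0):
    """data: list of (hypervector, label) pairs, assumed already encoded (step 1)."""
    memory = {}
    # Step 2: aggregate hypervectors of the same class by elementwise addition.
    for h, label in data:
        acc = memory.setdefault(label, [0.0] * dim)
        for i, v in enumerate(h):
            acc[i] += v
    # Step 3 (assumed rule): on a misclassification, move the true class toward
    # the sample and the wrongly predicted class away from it.
    for _ in range(epochs):
        for h, label in data:
            guess = predict(memory, h)
            if guess != label:
                for i, v in enumerate(h):
                    memory[label][i] += lr * v
                    memory[guess][i] -= lr * v
    return memory

# Toy data: noisy copies of two random prototypes (labels are made up).
rng = random.Random(0)
DIM = 1024
prototypes = {c: [rng.choice((-1, 1)) for _ in range(DIM)] for c in ("low", "high")}

def noisy(h, flip=0.2):
    """Flip each component with probability `flip`."""
    return [-v if rng.random() < flip else v for v in h]

data = [(noisy(prototypes[c]), c) for c in ("low", "high") for _ in range(10)]
memory = train(data, DIM)
query = noisy(prototypes["high"])
print(predict(memory, query))
```

In ScanHD, one such memory would presumably exist per scanner parameter, with the bound instruction-observation code as the query.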

read the original abstract

Robotic laser profiling is widely used for dimensional verification and surface inspection, yet measurement fidelity is often dominated by sensor configuration rather than robot motion. Industrial profilers expose multiple coupled parameters, including sampling frequency, measurement range, exposure time, receiver dynamic range, and illumination, that are still tuned by trial-and-error; mismatches can cause saturation, clipping, or missing returns that cannot be recovered downstream. We formulate instruction-conditioned sensing parameter recommendation; given a pre-scan RGB observation and a natural-language inspection instruction, infer a discrete configuration over key parameters of a robot-mounted profiler. To benchmark this problem, we develop Instruct-Obs2Param, a real-world multimodal dataset linking inspection intents and multi-view pose and illumination variation across 16 objects to canonical parameter regimes. We then propose ScanHD, a hyperdimensional computing framework that binds instruction and observation into a task-aware code and performs parameter-wise associative reasoning with compact memories, matching discrete scanner regimes while yielding stable, interpretable, low-latency decisions. On Instruct-Obs2Param, ScanHD achieves 92.7% average exact accuracy and 98.1% average Win@1 accuracy across the five parameters, with strong cross-split generalization and low-latency inference suitable for deployment, outperforming rule-based heuristics, conventional multimodal models, and multimodal large language models. This work enables autonomous, instruction-conditioned sensing configuration from task intent and scene context, eliminating manual tuning and elevating sensor configuration from a static setting to an adaptive decision variable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper adds a new small-scale dataset and an HDC-based method for picking discrete scanner parameters from instructions plus RGB views, with solid numbers inside its 16-object collection but no tests on new objects or lighting shifts.

The main takeaway is a concrete engineering step for robotic inspection: pair a natural-language task description with a quick RGB observation and output a discrete setting for each of five profiler parameters. They collected Instruct-Obs2Param across 16 objects with pose and illumination changes, then built ScanHD to bind the instruction and image into a hypervector and retrieve the matching regime for each parameter via associative memory. On held-out splits the method reaches 92.7% average exact accuracy and 98.1% average Win@1 while staying fast and compact, which beats the rule-based and multimodal baselines they ran. That combination of a new benchmark and a working implementation is the useful part. The HDC route also keeps the decisions interpretable, which matters when you have to explain why the sensor was set a certain way.

The evaluation stays inside the same 16 objects. Cross-split results therefore measure how well the bindings interpolate within this fixed collection rather than how the system behaves with unseen parts or lighting regimes that were never in the training data. There is no out-of-distribution detection or online update step, so the deployment claim rests on the untested assumption that the discrete regimes plus the encoding will remain reliable outside the lab collection.

This work is aimed at people who build or maintain robotic inspection cells in manufacturing and want to reduce manual tuning. A reader already working on multimodal parameter selection or hyperdimensional methods could pull the dataset and the code pattern for their own experiments.

It has enough concrete results and a clear problem definition to go to peer review. The main request in review should be additional tests on novel objects and lighting to check whether the reported accuracy holds up under distribution shift.

Referee Report

2 major / 2 minor

Summary. The paper formulates the task of instruction-conditioned configuration of coupled parameters (sampling frequency, measurement range, exposure time, receiver dynamic range, illumination) for a robot-mounted laser profiler. It introduces the Instruct-Obs2Param dataset linking natural-language inspection intents, multi-view RGB observations across 16 objects with pose/illumination variation, and canonical discrete parameter regimes. It proposes ScanHD, a hyperdimensional computing pipeline that encodes vision-language embeddings into task-aware codes, performs parameter-wise associative lookup in compact memories, and reports 92.7% average exact accuracy and 98.1% average Win@1 accuracy on cross-splits, outperforming rule-based heuristics, conventional multimodal models, and MLLMs while providing low-latency inference.

Significance. If the reported accuracies and latency hold under the stated protocol, the work supplies a concrete, interpretable, and deployable alternative to manual tuning for industrial robotic inspection. The Instruct-Obs2Param dataset is a useful benchmark contribution, and the HDC binding approach offers compactness and stability advantages over heavier multimodal models. These strengths support the claim that sensor configuration can be treated as an adaptive, task-aware decision variable.

major comments (2)

[Evaluation / Experiments] Evaluation section (cross-split protocol): the 92.7% exact / 98.1% Win@1 figures and the 'strong cross-split generalization' and 'suitable for deployment' claims rest on interpolation within the closed 16-object collection under controlled conditions. No experiments on novel objects or lighting regimes outside this set are reported, leaving the central assumption that the five discrete regimes plus HDC associative lookup will remain reliable under distribution shift unverified and load-bearing for the deployment narrative.
[§3] §3 (ScanHD architecture): the binding of instruction and observation embeddings into hyperdimensional codes and the subsequent parameter-wise memory lookup are described at a high level, but the precise encoding functions, bundling operations, and memory construction details are not given with sufficient equations or pseudocode to permit independent reproduction or verification of the claimed parameter-free character of the associative reasoning.

minor comments (2)

[Evaluation] Clarify the exact definition and computation of 'Win@1 accuracy' (is it top-1 among the five parameters or per-parameter?) and report per-parameter breakdowns in addition to the averages.
[Baselines] Provide implementation details or references for the MLLM baselines (model names, prompting templates, fine-tuning status) to allow assessment of the fairness of the comparison.
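The first minor comment turns on how per-parameter figures relate to the all-parameter criterion described in the Figure 9 caption. The sketch below contrasts the two for exact accuracy only; Win@1 is left out because its definition is not given in this summary. The function names and the toy labels are hypothetical.

```python
def per_parameter_exact_accuracy(predictions, targets, parameters):
    """Fraction of samples predicted correctly, computed separately per parameter."""
    n = len(targets)
    return {p: sum(pred[p] == tgt[p] for pred, tgt in zip(predictions, targets)) / n
            for p in parameters}

def all_parameter_accuracy(predictions, targets, parameters):
    """Fraction of samples for which every parameter is predicted correctly."""
    n = len(targets)
    return sum(all(pred[p] == tgt[p] for p in parameters)
               for pred, tgt in zip(predictions, targets)) / n

# Toy example with two parameters and made-up option indices.
parameters = ("exposure_time", "measurement_range")
targets = [
    {"exposure_time": 0, "measurement_range": 1},
    {"exposure_time": 2, "measurement_range": 1},
    {"exposure_time": 1, "measurement_range": 0},
    {"exposure_time": 1, "measurement_range": 2},
]
predictions = [
    {"exposure_time": 0, "measurement_range": 1},
    {"exposure_time": 2, "measurement_range": 0},
    {"exposure_time": 0, "measurement_range": 0},
    {"exposure_time": 1, "measurement_range": 2},
]
per_param = per_parameter_exact_accuracy(predictions, targets, parameters)
mean_exact = sum(per_param.values()) / len(per_param)
joint = all_parameter_accuracy(predictions, targets, parameters)
print(mean_exact, joint)  # 0.75 0.5
```

The averaged per-parameter figure can sit well above the all-parameter figure when errors fall on different samples, which is why a per-parameter breakdown alongside the averages is informative.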

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below with clarifications and indicate where revisions will be made to strengthen the manuscript.

Referee: [Evaluation / Experiments] Evaluation section (cross-split protocol): the 92.7% exact / 98.1% Win@1 figures and the 'strong cross-split generalization' and 'suitable for deployment' claims rest on interpolation within the closed 16-object collection under controlled conditions. No experiments on novel objects or lighting regimes outside this set are reported, leaving the central assumption that the five discrete regimes plus HDC associative lookup will remain reliable under distribution shift unverified and load-bearing for the deployment narrative.

Authors: We agree that all reported results, including the 92.7% exact and 98.1% Win@1 accuracies, are obtained via cross-splits within the 16-object Instruct-Obs2Param collection under controlled pose and illumination variations. The protocol does ensure no overlap in object instances, views, or lighting between train and test, which supports our claims of strong cross-split generalization within the dataset's scope. However, we acknowledge that no experiments on entirely novel objects or unseen lighting regimes are included, leaving robustness under broader distribution shift untested. In the revision we will moderate the 'suitable for deployment' language to reflect this scope, add an explicit limitations paragraph discussing the assumption of similar industrial conditions, and clarify that the current results demonstrate utility for tasks matching the dataset's characteristics. revision: partial
Referee: [§3] §3 (ScanHD architecture): the binding of instruction and observation embeddings into hyperdimensional codes and the subsequent parameter-wise memory lookup are described at a high level, but the precise encoding functions, bundling operations, and memory construction details are not given with sufficient equations or pseudocode to permit independent reproduction or verification of the claimed parameter-free character of the associative reasoning.

Authors: We thank the referee for highlighting the need for greater technical precision. In the revised manuscript we will expand Section 3 with the exact encoding functions for mapping vision-language embeddings to hypervectors, the specific binding and bundling operations (including the mathematical definitions of the task-aware code construction), and the step-by-step procedure for building the parameter-wise associative memories. We will also include pseudocode for the complete ScanHD inference process to enable independent reproduction and to substantiate the parameter-free character of the associative lookup. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical accuracies measured on held-out cross-splits of a new dataset

The paper introduces a new multimodal dataset (Instruct-Obs2Param) covering 16 objects and defines ScanHD as an HDC-based binding and associative lookup procedure. Reported performance (92.7% exact accuracy, 98.1% Win@1) is obtained by direct evaluation on cross-validation splits of that dataset. No equations, parameter fits, or self-citations are shown to reduce these accuracy figures to quantities already present in the training data or prior author work. The derivation chain consists of dataset collection followed by external benchmarking against baselines; the central claims remain falsifiable by new objects or lighting conditions outside the 16-object collection.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated. The method presumably relies on standard hyperdimensional computing operations and learned or hand-chosen binding vectors, but these details are not provided.
