Neuro-Symbolic Control with Large Language Models for Language-Guided Spatial Tasks

Jose M. Merigo; Momina Liaqat Ali; Muhammad Abid; Muhammad Saqlain

arxiv: 2512.17321 · v2 · submitted 2025-12-19 · 💻 cs.RO

Pith reviewed 2026-05-16 21:16 UTC · model grok-4.3

keywords neuro-symbolic control · large language models · spatial manipulation · language-guided robotics · delta controller · embodied AI · continuous control · symbolic reasoning

The pith

A neuro-symbolic framework lets an LLM handle high-level language reasoning while a neural delta controller executes precise continuous motions, raising success rates and cutting steps by more than 70 percent in spatial manipulation tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether splitting control between a language model for symbolic task interpretation and a lightweight neural controller for incremental physical actions improves performance over using either component alone. In planar object-manipulation experiments defined by spatial language instructions, the combined system is compared against pure LLM baselines and pure neural baselines across several locally run models. Results show higher task completion rates together with average step reductions above 70 percent and speedups reaching 8.83 times, all without reinforcement learning or environment rollouts. The neural controller is trained once on synthetic geometric data and then used unchanged, which the authors credit for stability and robustness to changes in language-model quality. This decomposition is presented as a practical route to reliable language-guided embodied control.

Core claim

The central claim is that assigning symbolic task interpretation to a locally deployed LLM while routing uninterpreted low-level execution to a neural delta controller trained on artificial geometric data produces higher success rates, more than 70 percent fewer steps on average, and speedups up to 8.83 times compared with LLM-only or neural-only baselines in language-specified planar manipulation tasks.
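The two headline figures are different kinds of ratio, and it can help to see how they relate. The sketch below is only an illustration: the function names and the step counts in the usage note are hypothetical, and the source does not say whether the reported speedup is measured in steps or in wall-clock time.

```python
def step_reduction(baseline_steps: float, combined_steps: float) -> float:
    """Fraction of steps saved relative to the baseline (0.70 means 70 percent fewer)."""
    return 1.0 - combined_steps / baseline_steps


def speedup(baseline_cost: float, combined_cost: float) -> float:
    """How many times cheaper the combined system is (cost may be steps or time)."""
    return baseline_cost / combined_cost


def reduction_to_ratio(reduction: float) -> float:
    """Convert a fractional reduction into the equivalent speedup ratio."""
    return 1.0 / (1.0 - reduction)


def ratio_to_reduction(ratio: float) -> float:
    """Convert a speedup ratio into the equivalent fractional reduction."""
    return 1.0 - 1.0 / ratio


# Hypothetical numbers, not taken from the paper: 100 baseline steps vs 30 combined steps.
example_reduction = step_reduction(100, 30)
example_ratio = reduction_to_ratio(example_reduction)
```

Under these definitions, a 70 percent reduction corresponds to a ratio of roughly 3.3, while a ratio of 8.83 would correspond to a reduction of roughly 89 percent if both were measured in the same unit. This is consistent with the source describing 70 percent as an average and 8.83 times as a maximum.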

What carries the argument

The neuro-symbolic split that keeps the LLM on symbolic outputs and delegates bounded incremental actions in continuous space to a neural delta controller trained only on synthetic geometry.
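A minimal sketch of this split, assuming a 2D point workspace, may make the division of labour easier to picture. Everything named here is a stand-in introduced for illustration: `interpret_instruction` replaces the LLM with keyword matching, `delta_step` replaces the trained neural controller with a clipped geometric step, and the relation vocabulary, `max_step`, and `tol` values are assumptions rather than details from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class SymbolicGoal:
    relation: str   # symbolic output, e.g. "left_of"
    offset: float   # desired separation in workspace units (assumed)


# Hypothetical relation vocabulary mapped to unit directions in the plane.
RELATION_DIRECTIONS = {
    "left_of": (-1.0, 0.0),
    "right_of": (1.0, 0.0),
    "above": (0.0, 1.0),
    "below": (0.0, -1.0),
}
PHRASES = (("left of", "left_of"), ("right of", "right_of"),
           ("above", "above"), ("below", "below"))


def interpret_instruction(text: str) -> SymbolicGoal:
    """Stand-in for the LLM: return a symbolic goal, never a raw action."""
    lowered = text.lower()
    for phrase, relation in PHRASES:
        if phrase in lowered:
            return SymbolicGoal(relation, offset=1.0)
    raise ValueError("no known spatial relation in instruction")


def goal_position(goal: SymbolicGoal, reference):
    """Ground the symbolic relation as a target point next to the reference object."""
    dx, dy = RELATION_DIRECTIONS[goal.relation]
    return (reference[0] + dx * goal.offset, reference[1] + dy * goal.offset)


def delta_step(position, target, max_step=0.2):
    """Stand-in for the delta controller: an increment with norm at most max_step."""
    ex, ey = target[0] - position[0], target[1] - position[1]
    dist = math.hypot(ex, ey)
    if dist <= max_step:
        return ex, ey
    scale = max_step / dist
    return ex * scale, ey * scale


def run_episode(instruction, start, reference, tol=1e-3, max_steps=100):
    """One symbolic interpretation, then repeated bounded increments until within tol."""
    target = goal_position(interpret_instruction(instruction), reference)
    pos = start
    for step in range(max_steps):
        if math.hypot(target[0] - pos[0], target[1] - pos[1]) <= tol:
            return pos, step, True
        dx, dy = delta_step(pos, target)
        pos = (pos[0] + dx, pos[1] + dy)
    return pos, max_steps, False
```

In this sketch the language component is consulted once per episode and only emits a relation label, so every continuous quantity comes from the bounded stepper. Whether the paper queries the LLM once or repeatedly is not stated in the material above.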

If this is right

Success rates rise because the LLM is prevented from generating hallucinated low-level actions.
Step counts and execution time drop sharply once motion execution is off-loaded to the trained delta controller.
Performance stays consistent across different language models because the neural component absorbs the continuous-control burden.
No reinforcement learning or costly rollouts are required, lowering the barrier to deployment.
Interpretability increases because the LLM's output remains symbolic and inspectable.
Load-bearing premise

A lightweight neural delta controller trained solely on artificial geometric data can execute the required bounded incremental actions reliably in the target continuous space without further training or adaptation.
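To make the premise concrete, here is one way synthetic geometric training pairs for such a controller could be generated. This is a guess at the general shape of the data, not the authors' procedure: the feature choice (offset to target), the label (a clipped step toward the target), and the `workspace` and `max_step` values are all assumptions.

```python
import math
import random


def make_sample(rng, workspace=1.0, max_step=0.05):
    """Draw a random position and target; label is a bounded step toward the target."""
    pos = (rng.uniform(-workspace, workspace), rng.uniform(-workspace, workspace))
    target = (rng.uniform(-workspace, workspace), rng.uniform(-workspace, workspace))
    ex, ey = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(ex, ey)
    # Clip the step so its norm never exceeds max_step (assumed action bound).
    scale = min(1.0, max_step / dist) if dist > 0 else 0.0
    features = (ex, ey)
    label = (ex * scale, ey * scale)
    return features, label


def make_dataset(n, seed=0):
    """Generate n (features, label) pairs with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [make_sample(rng) for _ in range(n)]
```

Data of this kind contains no contact, friction, or sensor noise, which is exactly why the premise above carries so much weight.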

What would settle it

Running the same tasks in a new continuous-space geometry where the neural controller produces visibly incorrect incremental moves or the overall success rate falls below the LLM-only baseline would falsify the claim that the split reliably improves performance.

Figures

Figures reproduced from arXiv: 2512.17321 by Jose M. Merigo, Momina Liaqat Ali, Muhammad Abid, Muhammad Saqlain.

**Figure 1.** Motivation for neuro-symbolic control. End-to-end LLM-based control directly predicts continuous actions, ...

**Figure 2.** Overview of the proposed neuro-symbolic control framework. A local large language model performs ...

**Figure 3.** Success rate aggregated across all language models for each spatial task. The proposed LLM+DL framework ...

**Figure 4.** Total average number of control steps for all language models. Compared to LLM-only control, the LLM+DL ...

**Figure 5.** Normalized distance-to-goal over time for the ...

**Figure 6.** Speedup of LLM+DL relative to LLM-only control aggregated across language models. The proposed ...

**Figure 7.** Success rate by language model and task. The suggested LLM+DL framework consistently improves ...

**Figure 8.** Success rate by task and language model. The neuro-symbolic LLM+DL framework improves reliability ...

**Figure 9.** The average number of control steps for the ...

**Figure 10.** Normalized distance-to-goal over time for all spatial tasks. The proposed LLM+DL framework converges ...

**Figure 11.** Model-specific speedup (left) and success-rate improvement (right) of LLM+DL over LLM-only control. ...

Original abstract

Although large language models (LLMs) have recently become effective tools for language-conditioned control in embodied systems, instability, slow convergence, and hallucinated actions continue to limit their direct application to continuous control. A modular neuro-symbolic control framework that clearly distinguishes between low-level motion execution and high-level semantic reasoning is proposed in this work. While a lightweight neural delta controller performs bounded, incremental actions in continuous space, a locally deployed LLM interprets symbolic tasks. We assess the suggested method in a planar manipulation setting with spatial relations between objects specified by language. Numerous tasks and local language models, such as Mistral, Phi, and LLaMA-3.2, are used in extensive experiments to compare LLM-only control, neural-only control, and the suggested LLM+DL framework. In comparison to LLM-only baselines, the results show that the neuro-symbolic integration consistently increases both success rate and efficiency, achieving average step reductions exceeding 70% and speedups of up to 8.83x while remaining robust to language model quality. The suggested framework enhances interpretability, stability, and generalization without any need of reinforcement learning or costly rollouts by controlling the LLM to symbolic outputs and allocating uninterpreted execution to a neural controller trained on artificial geometric data. These outputs show empirically that neuro-symbolic decomposition offers a scalable and principled way to integrate language understanding with ongoing control, this approach promotes the creation of dependable and effective language-guided embodied systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The neuro-symbolic split delivers measurable efficiency gains over pure LLM control in the reported planar tasks, but the neural delta controller's generalization from synthetic data is the untested piece that could undermine the claims.

The paper's main finding is that routing high-level language tasks through an LLM for symbolic outputs while handing low-level bounded increments to a lightweight neural delta controller trained on artificial geometric data produces consistent improvements. In the planar manipulation experiments, the combined system cuts average steps by over 70 percent and delivers speedups of up to 8.83 times compared with LLM-only baselines, and the gains hold across Mistral, Phi, and LLaMA-3.2. The setup avoids reinforcement learning and rollouts entirely, which is a practical advantage for stability and interpretability in continuous-space control.

What the work does well is the clean empirical comparison across the three regimes (LLM-only, neural-only, and combined) on multiple spatial-relation tasks. The modular split is straightforward to understand, and the reported numbers show the combined version outperforming the separate pieces without extra training overhead.

The soft spot is the controller itself. It is trained only on synthetic geometry, yet the abstract supplies no error bounds after repeated steps, no ablation on contact, friction, sensor noise, or non-convex shapes, and no quantitative check on whether accumulated drift stays inside the tolerance the symbolic layer expects. If the delta actions drift too far, the efficiency numbers collapse regardless of how good the LLM is. The abstract also omits task definitions, statistical tests, and exact architecture details, so the central performance claims rest on summary statistics alone.

This is useful reading for anyone working on language-conditioned robotics who wants a simple modular baseline that does not require heavy simulation. A reader looking for concrete numbers on neuro-symbolic decomposition in planar settings will find something to build on.

It deserves peer review because the framework is clearly motivated, the gains are quantified, and the open question about controller reliability is straightforward for referees to examine once the full methods and ablations are available.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a modular neuro-symbolic framework for language-guided planar manipulation tasks. A locally deployed LLM handles high-level symbolic reasoning and task interpretation, while a lightweight neural delta controller (trained exclusively on synthetic geometric data) executes bounded incremental actions in continuous space. Experiments compare LLM-only, neural-only, and combined LLM+DL approaches across multiple spatial-relation tasks and models (Mistral, Phi, LLaMA-3.2), reporting consistent gains in success rate and efficiency: average step reductions exceeding 70% and speedups up to 8.83x, with robustness to LLM quality and no requirement for RL or rollouts.

Significance. If the empirical claims hold under rigorous validation, the work demonstrates a practical, interpretable decomposition that mitigates LLM instability in continuous control while avoiding expensive training. This could offer a scalable template for integrating language understanding with low-level execution in embodied systems, particularly where direct LLM control or end-to-end RL is impractical.

major comments (3)

[Abstract / Experiments] Abstract and Experiments section: The central quantitative claims (average step reductions >70%, speedups up to 8.83x, robustness across Mistral/Phi/LLaMA-3.2) are presented without task definitions, number of trials per condition, statistical tests, error bars, or variance measures. These omissions make it impossible to evaluate whether the reported efficiency gains are statistically reliable or merely anecdotal.
[Controller / Experiments] Controller description (likely §3–4): The neural delta controller is trained only on artificial geometric data and asserted to execute bounded incremental actions reliably in continuous planar space. No ablation isolates controller error under task-relevant conditions (object contact, friction, sensor noise, non-convex geometries), nor is any quantitative bound on cumulative position drift after k steps provided. This assumption is load-bearing for the claimed 70%+ step reductions; if drift exceeds symbolic-layer tolerance, the neuro-symbolic gains collapse regardless of LLM quality.
[Experiments / Baselines] Baseline comparison: The LLM-only baseline is described as directly outputting actions, yet the precise prompting strategy, action discretization, and failure modes (hallucinations, instability) are not detailed. Without this, it is unclear whether the reported improvements stem from the neuro-symbolic split or from differences in how the LLM is constrained to symbolic outputs.
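
The drift concern in the second major comment can be stated as a worst-case bound. What follows is a sketch under stated assumptions, not a result from the manuscript. The symbols are introduced here for illustration: $p_t$ is the position after step $t$, $a_t$ the intended increment, $\hat{a}_t$ the increment the controller actually produces, and $\varepsilon$ an assumed per-step error bound with $\lVert \hat{a}_t - a_t \rVert \le \varepsilon$. The increments are assumed to accumulate open loop, with no corrective feedback between steps.

```latex
\begin{align}
p_k &= p_0 + \sum_{t=1}^{k} \hat{a}_t,
\qquad
p_k^{\ast} = p_0 + \sum_{t=1}^{k} a_t, \\
\lVert p_k - p_k^{\ast} \rVert
  &= \Bigl\lVert \sum_{t=1}^{k} \bigl(\hat{a}_t - a_t\bigr) \Bigr\rVert
  \le \sum_{t=1}^{k} \lVert \hat{a}_t - a_t \rVert
  \le k\,\varepsilon .
\end{align}
```

Under these assumptions the drift stays inside a symbolic-layer tolerance $\tau$ whenever $k \le \tau / \varepsilon$. If the controller re-observes the state at every step, errors need not accumulate linearly and a tighter bound may hold, which is one reason the requested ablation would be informative.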

minor comments (2)

[Abstract] Abstract phrasing: 'Numerous tasks and local language models, such as Mistral, Phi, and LLaMA-3.2, are used in extensive experiments' is grammatically awkward and should be rephrased for clarity (e.g., 'extensive experiments are conducted across numerous tasks using local LLMs including...').
[Abstract / Conclusion] The claim of 'no need of reinforcement learning or costly rollouts' is repeated; a single concise statement in the introduction or conclusion would suffice.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating the revisions that will be incorporated into the next manuscript version.

Referee: [Abstract / Experiments] Abstract and Experiments section: The central quantitative claims (average step reductions >70%, speedups up to 8.83x, robustness across Mistral/Phi/LLaMA-3.2) are presented without task definitions, number of trials per condition, statistical tests, error bars, or variance measures. These omissions make it impossible to evaluate whether the reported efficiency gains are statistically reliable or merely anecdotal.

Authors: We agree that these experimental details are necessary for rigorous evaluation. In the revised manuscript we will add explicit definitions of all spatial-relation tasks (including a summary table), the exact number of trials per condition (50 independent trials for each task-model pair), standard-deviation error bars on all reported metrics, and statistical significance tests (paired Wilcoxon signed-rank tests) comparing the neuro-symbolic method against baselines. These additions will appear in both the abstract and the Experiments section. revision: yes
Referee: [Controller / Experiments] Controller description (likely §3–4): The neural delta controller is trained only on artificial geometric data and asserted to execute bounded incremental actions reliably in continuous planar space. No ablation isolates controller error under task-relevant conditions (object contact, friction, sensor noise, non-convex geometries), nor is any quantitative bound on cumulative position drift after k steps provided. This assumption is load-bearing for the claimed 70%+ step reductions; if drift exceeds symbolic-layer tolerance, the neuro-symbolic gains collapse regardless of LLM quality.

Authors: The referee correctly highlights a missing robustness analysis. While the bounded-action design was intended to limit drift, we did not quantify controller error under realistic conditions. In the revision we will add an ablation evaluating the delta controller under simulated sensor noise, friction, and contact scenarios, together with an analytic upper bound on cumulative position drift after k steps derived from the action bounds and measured controller accuracy on the synthetic data. These results will be presented in a new subsection of the Experiments section. revision: yes
Referee: [Experiments / Baselines] Baseline comparison: The LLM-only baseline is described as directly outputting actions, yet the precise prompting strategy, action discretization, and failure modes (hallucinations, instability) are not detailed. Without this, it is unclear whether the reported improvements stem from the neuro-symbolic split or from differences in how the LLM is constrained to symbolic outputs.

Authors: We agree that the LLM-only baseline description is insufficiently detailed. The revised manuscript will expand this section to include the exact prompt template used to elicit direct continuous actions, the discretization scheme (fixed increments in x, y, and rotation), and the observed failure modes (action hallucinations, instability over long horizons). This clarification will demonstrate that the performance gains arise from the neuro-symbolic decomposition rather than prompting differences alone. revision: yes
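
The first response promises standard-deviation error bars and paired Wilcoxon signed-rank tests. As a rough illustration of what that analysis involves, the sketch below implements the test with a normal approximation using only the standard library. It is not the authors' analysis code: the function names are hypothetical, the approximation omits tie and continuity corrections, and the step counts in the usage note are made-up placeholders rather than results from the paper.

```python
import math
import statistics


def mean_and_std(values):
    """Sample mean and sample standard deviation, e.g. for error bars."""
    return statistics.mean(values), statistics.stdev(values)


def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped and tied magnitudes share an average rank.
    Returns (W, p) where W is the smaller of the two signed rank sums.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w, p


# Placeholder step counts for ten paired trials (illustrative only).
llm_only_steps = [40, 35, 50, 45, 60, 38, 52, 47, 41, 55]
combined_steps = [12, 11, 13, 15, 14, 10, 16, 13, 12, 15]
w_stat, p_value = wilcoxon_signed_rank(llm_only_steps, combined_steps)
```

With the promised 50 trials per task-model pair the normal approximation should be reasonable. For a real analysis, an established routine such as `scipy.stats.wilcoxon` would be the more usual choice.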

Circularity Check

0 steps flagged

No circularity: empirical comparisons rest on independent evaluations

The paper reports direct experimental results comparing LLM-only, neural-only, and neuro-symbolic control across multiple spatial tasks and local LLMs (Mistral, Phi, LLaMA-3.2). Central metrics (success rate, >70% step reduction, up to 8.83x speedup) are obtained from explicit trials rather than any derivation, equation, or fitted parameter that reduces to its own inputs by construction. The neural delta controller is trained on separate artificial geometric data and evaluated as an independent module; no self-citation chain, ansatz smuggling, or uniqueness theorem is invoked to justify the performance claims. The framework is self-contained against external benchmarks via ablation-style comparisons, yielding no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the unproven effectiveness of the neural delta controller for continuous execution and the assumption that symbolic outputs from the LLM map cleanly to actionable increments without further verification.

axioms (1)

domain assumption: Lightweight neural delta controller trained on artificial geometric data suffices for bounded continuous control in the target domain.
Invoked to justify allocation of execution to the neural component without RL.

pith-pipeline@v0.9.0 · 5572 in / 1182 out tokens · 37899 ms · 2026-05-16T21:16:45.265410+00:00 · methodology

Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages · 4 internal anchors

[1] Bandyopadhyay, D., Bhattacharjee, S., & Ekbal, A. (2025). Thinking machines: A survey of LLM-based reasoning strategies. arXiv preprint arXiv:2503.10814.
[2] Zhang, Y., Wang, H., Feng, S., Tan, Z., Han, X., He, T., & Tsvetkov, Y. (2024). Can LLM graph reasoning generalize beyond pattern memorization? arXiv preprint arXiv:2406.15992.
[3] Liu, L., Nair, A., Peng, T., Desai, S., Gupta, M., Mehta, K., & Singh, P. (2024). Optimizing task planning efficiency in LLMs: Beyond closed-loop systems. Authorea Preprints.
[4] Banerjee, S., Agarwal, A., & Singla, S. (2025). LLMs will always hallucinate, and we need to live with this. In Proceedings of the Intelligent Systems Conference (pp. 624–648). Springer.
[5] Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., & Liu, T. (2025). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2), 1–55.
[6] Tang, C., Abbatematteo, B., Hu, J., Chandra, R., Martín-Martín, R., & Stone, P. (2025). Deep reinforcement learning for robotics: A survey of real-world successes. Annual Review of Control, Robotics, and Autonomous Systems, 8, 153–178.
[7] Du, Q., Li, B., Du, Y., Su, S., Fu, T., Zhan, Z., & Wang, C. (2025). Fast task planning with neuro-symbolic relaxation. arXiv preprint arXiv:2507.15975.
[8] Bhat, V., Kaypak, A. U., Krishnamurthy, P., Karri, R., & Khorrami, F. (2024). Grounding LLMs for robot task planning using closed-loop state feedback. arXiv preprint arXiv:2402.08546.
[9] Su, W. (2025). Do large language models (really) need statistical foundations? arXiv preprint arXiv:2505.19145.
[10] Enoasmo, V., Featherstonehaugh, C., Konstantinopoulos, X., & Huntington, Z. (2025). Structural embedding projection for contextual large language model inference. arXiv preprint arXiv:2501.18826.
[11] Ullah, S., Liaqat, M., Asif, A., Khan, A., Aslam, U., & Asif, H. (2022, October). Deep auto encoder based chatbot for discrete math course. In 2022 International Conference on Recent Advances in Electrical Engineering & Computer Sciences (RAEE & CS) (pp. 1-7). IEEE.
[12] Kim, Y., Choi, J., & Lee, S. (2024). A survey on integration of large language models with intelligent robots. Intelligent Service Robotics.
[13] Zeng, F., Gan, W., Wang, Y., Liu, N., & Yu, P. S. (2023). Large language models for robotics: A survey. arXiv preprint arXiv:2311.07226.
[14] Huang, W., Abbeel, P., Pathak, D., & Mordatch, I. (2022). Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In Proceedings of ICML.
[15] Ahn, M., Brohan, A., Brown, N., Chebotar, Y., Cortes, O., David, B., & Zeng, A. (2022). Do as I can, not as I say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691.
[16] Driess, D., Xia, F., Sajjadi, M. S., Lynch, C., Chowdhery, A., Wahid, A., & Florence, P. (2023). PaLM-E: An embodied multimodal language model.
[17] Liang, J., Huang, W., Xia, F., Xu, P., Hausman, K., Ichter, B., & Zeng, A. (2022). Code as policies: Language model programs for embodied control. arXiv preprint arXiv:2209.07753.
[19] Jeong, H., Lee, H., Kim, C., & Shin, S. (2024). A survey of robot intelligence with large language models. Applied Sciences, 14(19), 8868.
[20] Abid, M., Akhtar, T., & Bhatt, H. (2025). Uncertainty quantification in steady-state heat transfer: A comprehensive analysis of DRAM and MCMC methods with applications to thermal systems. Spectrum of engineering and management sciences, 3(1), 63-75.
[21] Chen, Y., Arkin, J., Zhang, Y., et al. (2024). AutoTAMP: Autoregressive task and motion planning with LLMs as translators and checkers. In Proceedings of ICRA.
[22] Garcez, A., & Lamb, L. (2023). Neurosymbolic AI: The third wave. Artificial Intelligence Review, 56, 12387–12406.
[23] Wan, Z., Liu, C. K., Yang, H., Li, C., You, H., Fu, Y., & Raychowdhury, A. (2024). Towards cognitive AI systems: A survey and prospective on neuro-symbolic AI. arXiv preprint arXiv:2401.01040.
[24] De Raedt, L., Dumancic, S., Manhaeve, R., & Marra, G. (2024). From statistical relational to neurosymbolic artificial intelligence: A survey. Artificial Intelligence, 328.
[25] Colelough, B. C., & Regli, W. (2025). Neuro-symbolic AI in 2024: A systematic review. arXiv preprint arXiv:2501.05435.
[26] Abid, M., & Saqlain, M. (2024). Optimizing diabetes data insights through kmapper-based topological networks: a decision analytics approach for predictive and prescriptive modeling. Management science advances, 1(1), 1-19.
[27] Garrett, C. R., Chitnis, R., Holladay, R. M., et al. (2021). Integrated task and motion planning. Annual Review of Control, Robotics, and Autonomous Systems, 4, 265–293.
[28] Zhai, W., Liao, J., Chen, Z., Su, B., & Zhao, X. (2025). A survey of task planning with large language models. Intelligent Computing, 4, 0124.
[29] Bousetouane, B. (2025). Agentic LLM-based robotic systems for real-world applications: A review. Frontiers in Robotics and AI.
[30] Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine, 34(6), 26–38.
[31] Thanh, L. M., Thuong, L. H., Loc, P. T., & Nguyen, C.-N. (2020). Delta robot control using single neuron PID algorithms based on recurrent fuzzy neural network identifiers. International Journal of Mechanical Engineering and Robotics Research, 9(10), 1411–1418.
[32] Gholami, A., Homayouni, T., Ehsani, R., & Sun, J. Q. (2021). Inverse kinematic control of a delta robot using neural networks in real-time. Robotics, 10(4), 115.
[33] Fan, Y., Huang, H., & Yang, C. (2022). Fixed-time incremental neural control for manipulator based on composite learning with input saturation. In Actuators, 11(12), 373.
[34] dos Santos Lima, M., Kich, V. A., Steinmetz, R., & Tello Gamarra, D. F. (2024). Delta robot control by learning systems: Harnessing the power of deep reinforcement learning algorithms. Journal of Intelligent & Fuzzy Systems, 46(2), 4881–4894.
[35] Abid, M., & Ali, M. L. Enhancing Software Effort Estimation: A Comparative Analysis of Machine Learning Models with Correlation-Based Feature Selection. Sustainable Machine Intelligence Journal, 12, 1-17.
[36] Khosravi, S., & Akbari, A. (2022). Experimental study on a novel simultaneous control and identification of a 3-DOF delta robot using model reference adaptive control. Mechatronics, 86.
[37] Chen, B., Xu, Z., Kirmani, S., et al. (2024). SpatialVLM: Endowing vision-language models with spatial reasoning capabilities. In Proceedings of CVPR.
[38] Rana, K., Haviland, J., Garg, S., et al. (2023). SayPlan: Grounding large language models using 3D scene graphs for scalable robot task planning. In Proceedings of CoRL.
[39] Hunt, W., Ramchurn, S. D., & Soorati, M. D. (2024). A survey of language-based communication in robotics. arXiv preprint arXiv:2406.04086.
[40] Abid, M., Bukhari, S., & Saqlain, M. (2025). Enhancing software effort Estimation in healthcare informatics: A comparative analysis of machine learning models with Correlation-Based feature selection. Sustainable Machine Intelligence Journal, 10, 50-66.
[41] Wang, J., Shi, E., Hu, H., Ma, C., Liu, Y., Wang, X., & Zhang, S. (2024). Large language models for robotics: Opportunities, challenges, and perspectives. Journal of Automation and Intelligence.
[42] Amin, B. (2024, October). Mistral expands its reach in the SLM space with Ministral models. TechTalks.
[43] Zheng, Y., Chen, Y., Qian, B., Shi, X., Shu, Y., & Chen, J. (2025). A review on edge large language models: Design, execution, and applications. ACM Computing Surveys, 57(8), 1–35.
[44] Jiang, D., Liu, Y., Liu, S., Zhao, J. E., Zhang, H., Gao, Z., & Xiong, H. (2023). From CLIP to DINO: Visual encoders shout in multi-modal large language models. arXiv preprint arXiv:2310.08825.
[45] Abdin, M., Aneja, J., Behl, H., Bubeck, S., Eldan, R., Gunasekar, S., & Zhang, Y. (2024). Phi-4 technical report. arXiv preprint arXiv:2412.08905.
[46] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M. A., Lacroix, T., & Lample, G. (2023). LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971.
[47] Data Science Dojo. (2024). Phi-3 and beyond: Top small language models of 2024.
[48] Mistral AI. (2024, October). Introducing Les Ministraux: Edge-optimized models.
[49] Kress-Gazit, H., Hashimoto, K., Kuppuswamy, N., Shah, P., Horgan, P., Richardson, G., & Burchfiel, B. (2024). Robot learning as an empirical science: Best practices for policy evaluation. arXiv preprint arXiv:2409.09491.
[50] Faigl, J., Kulich, M., & Přeučil, L. (2012). Goal assignment using distance cost in multi-robot exploration. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 3741–3746).

[1] [1]

Bandyopadhyay, D., Bhattacharjee, S., & Ekbal, A. (2025). Thinking machines: A survey of LLM-based reasoning strategies.arXiv preprint arXiv:2503.10814

work page arXiv 2025

[2] [2]

Zhang, Y ., Wang, H., Feng, S., Tan, Z., Han, X., He, T., & Tsvetkov, Y . (2024). Can LLM graph reasoning generalize beyond pattern memorization?arXiv preprint arXiv:2406.15992

work page arXiv 2024

[3] [3]

Liu, L., Nair, A., Peng, T., Desai, S., Gupta, M., Mehta, K., & Singh, P. (2024). Optimizing task planning efficiency in LLMs: Beyond closed-loop systems.Authorea Preprints

work page 2024

[4] [4]

Banerjee, S., Agarwal, A., & Singla, S. (2025). LLMs will always hallucinate, and we need to live with this. In Proceedings of the Intelligent Systems Conference(pp. 624–648). Springer

work page 2025

[5] [5]

Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., & Liu, T. (2025). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions.ACM Transactions on Information Systems, 43(2), 1–55

work page 2025

[6] [6]

Tang, C., Abbatematteo, B., Hu, J., Chandra, R., Martín-Martín, R., & Stone, P. (2025). Deep reinforcement learning for robotics: A survey of real-world successes.Annual Review of Control, Robotics, and Autonomous Systems, 8, 153–178

work page 2025

[7] [7]

Du, Q., Li, B., Du, Y ., Su, S., Fu, T., Zhan, Z., & Wang, C. (2025). Fast task planning with neuro-symbolic relaxation.arXiv preprint arXiv:2507.15975

work page arXiv 2025

[8] [8]

Grounding llms for robot task planning using closed-loop state feedback,

Bhat, V ., Kaypak, A. U., Krishnamurthy, P., Karri, R., & Khorrami, F. (2024). Grounding LLMs for robot task planning using closed-loop state feedback.arXiv preprint arXiv:2402.08546

work page arXiv 2024

[9] [9]

Su, W. (2025). Do large language models (really) need statistical foundations?arXiv preprint arXiv:2505.19145

work page arXiv 2025

[10] [10]

Enoasmo, V ., Featherstonehaugh, C., Konstantinopoulos, X., & Huntington, Z. (2025). Structural embedding projection for contextual large language model inference.arXiv preprint arXiv:2501.18826

work page arXiv 2025

[11] [11]

(2022, October)

Ullah, S., Liaqat, M., Asif, A., Khan, A., Aslam, U., & Asif, H. (2022, October). Deep auto encoder based chatbot for discrete math course. In 2022 International Conference onRecent Advances in Electrical Engineering & Computer Sciences (RAEE & CS)(pp. 1-7). IEEE

work page 2022

[12] [12]

Kim, Y ., Choi, J., & Lee, S. (2024). A survey on integration of large language models with intelligent robots. Intelligent Service Robotics

work page 2024

[13] [13]

Zeng, F., Gan, W., Wang, Y ., Liu, N., & Yu, P. S. (2023). Large language models for robotics: A survey.arXiv preprint arXiv:2311.07226

work page arXiv 2023

[14] [14]

Huang, W., Abbeel, P., Pathak, D., & Mordatch, I. (2022). Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. InProceedings of ICML

work page 2022

[15] [15]

Ahn, M., Brohan, A., Brown, N., Chebotar, Y., Cortes, O., David, B., & Zeng, A. (2022). Do as I can, not as I say: Grounding language in robotic affordances. arXiv preprint arXiv:2204.01691.

[16] [16]

Driess, D., Xia, F., Sajjadi, M. S., Lynch, C., Chowdhery, A., Wahid, A., & Florence, P. (2023). PaLM-E: An embodied multimodal language model.

[17] [17]

Liang, J., Huang, W., Xia, F., Xu, P., Hausman, K., Ichter, B., & Zeng, A. (2022). Code as policies: Language model programs for embodied control. arXiv preprint arXiv:2209.07753.

[18] [19]

Jeong, H., Lee, H., Kim, C., & Shin, S. (2024). A survey of robot intelligence with large language models. Applied Sciences, 14(19), 8868.

[19] [20]

Abid, M., Akhtar, T., & Bhatt, H. (2025). Uncertainty quantification in steady-state heat transfer: A comprehensive analysis of DRAM and MCMC methods with applications to thermal systems. Spectrum of Engineering and Management Sciences, 3(1), 63–75.

[20] [21]

Chen, Y., Arkin, J., Zhang, Y., et al. (2024). AutoTAMP: Autoregressive task and motion planning with LLMs as translators and checkers. In Proceedings of ICRA.

[21] [22]

Garcez, A., & Lamb, L. (2023). Neurosymbolic AI: The third wave. Artificial Intelligence Review, 56, 12387–12406.

[22] [23]

Wan, Z., Liu, C. K., Yang, H., Li, C., You, H., Fu, Y., & Raychowdhury, A. (2024). Towards cognitive AI systems: A survey and prospective on neuro-symbolic AI. arXiv preprint arXiv:2401.01040.

[23] [24]

De Raedt, L., Dumancic, S., Manhaeve, R., & Marra, G. (2024). From statistical relational to neurosymbolic artificial intelligence: A survey. Artificial Intelligence, 328.

[24] [25]

Colelough, B. C., & Regli, W. (2025). Neuro-symbolic AI in 2024: A systematic review. arXiv preprint arXiv:2501.05435.

[25] [26]

Abid, M., & Saqlain, M. (2024). Optimizing diabetes data insights through kmapper-based topological networks: A decision analytics approach for predictive and prescriptive modeling. Management Science Advances, 1(1), 1–19.

[26] [27]

Garrett, C. R., Chitnis, R., Holladay, R. M., et al. (2021). Integrated task and motion planning. Annual Review of Control, Robotics, and Autonomous Systems, 4, 265–293.

[27] [28]

Zhai, W., Liao, J., Chen, Z., Su, B., & Zhao, X. (2025). A survey of task planning with large language models. Intelligent Computing, 4, 0124

[28] [29]

Bousetouane, B. (2025). Agentic LLM-based robotic systems for real-world applications: A review. Frontiers in Robotics and AI.

[29] [30]

Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine, 34(6), 26–38.

[30] [31]

Thanh, L. M., Thuong, L. H., Loc, P. T., & Nguyen, C.-N. (2020). Delta robot control using single neuron PID algorithms based on recurrent fuzzy neural network identifiers. International Journal of Mechanical Engineering and Robotics Research, 9(10), 1411–1418.

[31] [32]

Gholami, A., Homayouni, T., Ehsani, R., & Sun, J. Q. (2021). Inverse kinematic control of a delta robot using neural networks in real-time. Robotics, 10(4), 115.

[32] [33]

Fan, Y., Huang, H., & Yang, C. (2022). Fixed-time incremental neural control for manipulator based on composite learning with input saturation. Actuators, 11(12), 373.

[33] [34]

dos Santos Lima, M., Kich, V. A., Steinmetz, R., & Tello Gamarra, D. F. (2024). Delta robot control by learning systems: Harnessing the power of deep reinforcement learning algorithms. Journal of Intelligent & Fuzzy Systems, 46(2), 4881–4894.

[34] [35]

Abid, M., & Ali, M. L. Enhancing software effort estimation: A comparative analysis of machine learning models with correlation-based feature selection. Sustainable Machine Intelligence Journal, 12, 1–17.

[35] [36]

Khosravi, S., & Akbari, A. (2022). Experimental study on a novel simultaneous control and identification of a 3-DOF delta robot using model reference adaptive control. Mechatronics, 86.

[36] [37]

Chen, B., Xu, Z., Kirmani, S., et al. (2024). SpatialVLM: Endowing vision-language models with spatial reasoning capabilities. In Proceedings of CVPR.

[37] [38]

Rana, K., Haviland, J., Garg, S., et al. (2023). SayPlan: Grounding large language models using 3D scene graphs for scalable robot task planning. In Proceedings of CoRL.

[38] [39]

Hunt, W., Ramchurn, S. D., & Soorati, M. D. (2024). A survey of language-based communication in robotics. arXiv preprint arXiv:2406.04086.

[39] [40]

Abid, M., Bukhari, S., & Saqlain, M. (2025). Enhancing software effort estimation in healthcare informatics: A comparative analysis of machine learning models with correlation-based feature selection. Sustainable Machine Intelligence Journal, 10, 50–66.

[40] [41]

Wang, J., Shi, E., Hu, H., Ma, C., Liu, Y., Wang, X., & Zhang, S. (2024). Large language models for robotics: Opportunities, challenges, and perspectives. Journal of Automation and Intelligence.

[41] [42]

Amin, B. (2024, October). Mistral expands its reach in the SLM space with Ministral models. TechTalks.

[42] [43]

Zheng, Y., Chen, Y., Qian, B., Shi, X., Shu, Y., & Chen, J. (2025). A review on edge large language models: Design, execution, and applications. ACM Computing Surveys, 57(8), 1–35.

[43] [44]

Jiang, D., Liu, Y., Liu, S., Zhao, J. E., Zhang, H., Gao, Z., & Xiong, H. (2023). From CLIP to DINO: Visual encoders shout in multi-modal large language models. arXiv preprint arXiv:2310.08825.

[44] [45]

Abdin, M., Aneja, J., Behl, H., Bubeck, S., Eldan, R., Gunasekar, S., & Zhang, Y. (2024). Phi-4 technical report. arXiv preprint arXiv:2412.08905.

[45] [46]

Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M. A., Lacroix, T., & Lample, G. (2023). LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971.

[46] [47]

Data Science Dojo. (2024). Phi-3 and beyond: Top small language models of 2024

[47] [48]

Mistral AI. (2024, October). Introducing Les Ministraux: Edge-optimized models.

[48] [49]

Kress-Gazit, H., Hashimoto, K., Kuppuswamy, N., Shah, P., Horgan, P., Richardson, G., & Burchfiel, B. (2024). Robot learning as an empirical science: Best practices for policy evaluation. arXiv preprint arXiv:2409.09491.

[49] [50]

Faigl, J., Kulich, M., & Přeučil, L. (2012). Goal assignment using distance cost in multi-robot exploration. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 3741–3746).

Supplementary Material

Additional experimental data and analysis that corroborate the conclusions in the ...