Ilias Chalkidis, Manos Fergadiotis, and Ion Androutsopoulos
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
citing papers explorer
- SMH-Bench: Benchmarking LLM Agents for Environment-Grounded Reasoning and Action in Smart Homes
  SMH-Bench supplies 1,100 stratified tasks in a verifiable smart-home simulator to measure LLM performance on explicit control, scheduling, ambiguity, and personalization as environment complexity grows.
- Neural Router: Semantic Content Matching for Agentic AI
  LLMs function as semantic engines for content-based pub/sub in agentic AI, with an analytical context-window crossover and an empirical discrimination-capacity crossover that together show backend model choice dominates pipeline tuning.