2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Position paper calling for stronger evidentiary standards and a diagnostic checklist in anthropomorphic misalignment research.
Citing papers explorer
- SMH-Bench: Benchmarking LLM Agents for Environment-Grounded Reasoning and Action in Smart Homes
  SMH-Bench supplies 1,100 stratified tasks in a verifiable smart-home simulator to measure LLM performance on explicit control, scheduling, ambiguity, and personalization as environment complexity grows.
- Position: Anthropomorphic Misalignment Research Needs Stronger Evidence
  Position paper calling for stronger evidentiary standards and a diagnostic checklist in anthropomorphic misalignment research.