{"paper":{"title":"Scale-Gest: Scalable Model-Space Synthesis and Runtime Selection for On-Device Gesture Detection","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Runtime controller switches among tiny-YOLO variants to cut on-device gesture energy by 4x while holding F1 at 0.8-0.9.","cross_cats":["cs.AI","cs.HC","cs.RO","eess.IV"],"primary_cat":"cs.CV","authors_text":"Abdul Basit, Muhammad Shafique, Saim Rehman","submitted_at":"2026-03-16T10:12:26Z","abstract_excerpt":"Realizing on-device ML-based gesture detection under tight real-time performance, energy and memory constraints is challenging, especially when considering mobile devices with varying battery-power levels. Existing EdgeAI deployments typically rely on a single fixed detector, limiting optimization opportunities. We present Scale-Gest, a novel run-time adaptive gesture detection framework that expands the detector space into a dense family of tiny-YOLO architectures. We introduce multiple novel device-calibrated ACE (Accuracy-Complexity-Energy) profiles by analyzing different model-resolution-s"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"On a battery-powered laptop running gesture streams, our ACE controller reduces per-frame energy by 4x (from 6.9 mJ to 1.6 mJ) while maintaining high gesture-detection performance (event-level F1 = 0.8-0.9) and low mean latency (6 ms).","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the device-calibrated ACE profiles and the motion-aware ROI gate will transfer to other hardware platforms and real-world lighting/pose variations without re-calibration or loss of the reported energy savings.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Scale-Gest creates a runtime-selectable family of tiny-YOLO models with 
device-calibrated ACE profiles and an ROI gate that cuts per-frame energy by 4x while holding event-level F1 at 0.8-0.9 on a new driving-gesture dataset.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Runtime controller switches among tiny-YOLO variants to cut on-device gesture energy by 4x while holding F1 at 0.8-0.9.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"39071f63ee5164476f96edc02c0259d02eb2f2bfca81340a818179fb78b89a17"},"source":{"id":"2605.12506","kind":"arxiv","version":1},"verdict":{"id":"440b3077-80e4-4b86-b089-106e243f9570","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T10:40:08.224699Z","strongest_claim":"On a battery-powered laptop running gesture streams, our ACE controller reduces per-frame energy by 4x (from 6.9 mJ to 1.6 mJ) while maintaining high gesture-detection performance (event-level F1 = 0.8-0.9) and low mean latency (6 ms).","one_line_summary":"Scale-Gest creates a runtime-selectable family of tiny-YOLO models with device-calibrated ACE profiles and an ROI gate that cuts per-frame energy by 4x while holding event-level F1 at 0.8-0.9 on a new driving-gesture dataset.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the device-calibrated ACE profiles and the motion-aware ROI gate will transfer to other hardware platforms and real-world lighting/pose variations without re-calibration or loss of the reported energy savings.","pith_extraction_headline":"Runtime controller switches among tiny-YOLO variants to cut on-device gesture energy by 4x while holding F1 at 0.8-0.9."},"references":{"count":29,"sample":[{"doi":"","year":2023,"title":"Applied Sciences 13, 20 (2023)","work_id":"debb69b2-0893-431a-ab16-75974ae9c379","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2012,"title":"M., Pasricha, S., 
Maciejewski, A","work_id":"0ec328fd-4349-4d37-b03a-704978a8215e","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2024,"title":"In 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (Jan","work_id":"5efbe84f-3f8c-418a-9c1a-626dafa451ff","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Angell, L., Seaman, S., Payyanadan, R., Biever, W., Seppelt, B., Mehler, B., and Reimer, B. In the context of whole trips: New insights into driver management of attention and tasks. pp. 1–7","work_id":"f20d6602-dfc1-4933-92df-97c41f04ee46","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2020,"title":"YOLOv4: Optimal Speed and Accuracy of Object Detection","work_id":"7057aaee-27f6-4209-a83c-f59727f937a8","ref_index":5,"cited_arxiv_id":"2004.10934","is_internal_anchor":true}],"resolved_work":29,"snapshot_sha256":"6080e6b62ad51b19efdeefc6a01f92919d93249f22b3c234271947ba835bdc72","internal_anchors":4},"formal_canon":{"evidence_count":1,"snapshot_sha256":"0dc659660d7a854bdb91adcdc5bd454b930aea1ce653ca1382d55fd560f506f1"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}