FORGE uses a reasoning-action-observation loop and Dynamic Forest of Agents to perform scalable LLM-based binary analysis, finding 1,274 vulnerabilities across 591 of 3,457 real-world firmware binaries at 72.3% precision and broader coverage than prior methods.
Reducing energy bloat in large model training
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 6roles
background 1polarities
background 1representative citing papers
Develops a simulation framework showing multi-resource stranding changes deployable capacity and effective costs in AI datacenters, arguing the key metric is deployable capacity over time rather than installed megawatts.
EasyRider uses passive components plus actively controlled energy storage at the rack level, paired with lifetime-maximizing software, to keep AI training power transients inside grid safety limits without code changes or energy waste.
Amoeba adaptively adjusts tensor parallelism at runtime for LLM inference services to handle mixed short and long context requests, delivering 1.75x-6.57x throughput gains over prior solutions in real-world trace evaluations.
The paper introduces Experiment-as-Code Labs as a declarative stack synthesizing AI agents, systems orchestration, and physical lab control for AI-driven discovery.
EnergAIzer predicts module-level GPU utilization from structured kernel patterns and feeds it into a power model to estimate dynamic power with 8% error on Ampere GPUs and 7% on H100 forecasts.
citing papers explorer
-
Feedback-Driven Execution for LLM-Based Binary Analysis
FORGE uses a reasoning-action-observation loop and Dynamic Forest of Agents to perform scalable LLM-based binary analysis, finding 1,274 vulnerabilities across 591 of 3,457 real-world firmware binaries at 72.3% precision and broader coverage than prior methods.
-
Designing Datacenter Power Delivery Hierarchies for the AI Era
Develops a simulation framework showing multi-resource stranding changes deployable capacity and effective costs in AI datacenters, arguing the key metric is deployable capacity over time rather than installed megawatts.
-
EasyRider: Mitigating Power Transients in Datacenter-Scale Training Workloads
EasyRider uses passive components plus actively controlled energy storage at the rack level, paired with lifetime-maximizing software, to keep AI training power transients inside grid safety limits without code changes or energy waste.
-
Amoeba: Runtime Tensor Parallel Transformation for LLM Inference Services
Amoeba adaptively adjusts tensor parallelism at runtime for LLM inference services to handle mixed short and long context requests, delivering 1.75x-6.57x throughput gains over prior solutions in real-world trace evaluations.
-
Experiment-as-Code Labs: A Declarative Stack for AI-Driven Scientific Discovery
The paper introduces Experiment-as-Code Labs as a declarative stack synthesizing AI agents, systems orchestration, and physical lab control for AI-driven discovery.
-
EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads
EnergAIzer predicts module-level GPU utilization from structured kernel patterns and feeds it into a power model to estimate dynamic power with 8% error on Ampere GPUs and 7% on H100 forecasts.