DORA is an instruction-based DNN accelerator architecture with a two-stage compilation framework that delivers stable efficiency across varied workloads and up to 5x throughput gains versus prior accelerators on FPGA.
5 Pith papers cite this work.
citing papers explorer
- DORA: Dataflow-Instruction Orchestration Architecture for DNN Acceleration
  DORA is an instruction-based DNN accelerator architecture with a two-stage compilation framework that delivers stable efficiency across varied workloads and up to 5x throughput gains versus prior accelerators on FPGA.
- da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs
  A distributed arithmetic algorithm for CMVM operations on FPGAs reduces area by up to one third, and also reduces latency, for quantized neural networks; it is integrated into hls4ml.
- Practical Formal Verification for MLIR Programs
  A hybrid concrete-symbolic verifier checks MLIR program equivalence in linear time for a supported subset and is applied to AMD MLIR-AIR, MLIR-AIE, and mlir-opt on hundreds of benchmarks.
- FILCO: Flexible Composing Architecture with Real-Time Reconfigurability for DNN Acceleration
  FILCO introduces a real-time reconfigurable composing architecture for DNN acceleration that achieves 1.3x-5x better throughput and hardware efficiency than prior designs on diverse workloads via an analytical model and two-stage design space exploration.
- To Overlay or to Customize? Revisiting Architectural Choices in Heterogeneous Systems
  A systematic study concludes that overlay architectures suit frequent model switching in current autonomous driving setups, while customized ones may become preferable as bitstream reload overhead decreases.