archive

Every paper Pith has read.

493 papers in cs.AR · page 5

cs.AR 2026-04-24 reviewed

Helpers from high-level features speed HLS verification up to 6x
AutoINV: Automated Invariant Generation Framework for Formal Verification on High-Level Synthesis Designs

Xiaofeng Zhou +5
cs.AR 2026-04-24 reviewed

LLM evolves router code to cut wirelength up to 8.72%
GR-Evolve: Design-Adaptive Global Routing via LLM-Driven Algorithm Evolution

Taizun Jafri +1
cs.AR 2026-04-24 reviewed

Optical neural net design cuts energy-delay product by 64%
ROSA: Robust and Energy-Efficient Microring-Based Optical Neural Networks via Optical Shift-and-Add and Layer-Wise Hybrid Mapping

Huifan Zhang +4
cs.AR 2026-04-24 reviewed

PyTorch SNNs run on FPGAs with exact software accuracy
Hardware-Software Co-Design for Event-Driven SNN Deployment on Low-Cost Neuromorphic FPGAs

Jiwoon Lee +3
cs.NI 2026-04-23 reviewed

SPAC reduces FPGA switch resources by 55% and latency by 38%
SPAC: Automating FPGA-based Network Switches with Protocol Adaptive Customization

Guoyu Li +11
cs.NE 2026-04-23 reviewed

Volatile memristors reach 95.89% MNIST accuracy in reservoir computing
On the Role of Preprocessing and Memristor Dynamics in Reservoir Computing for Image Classification

Rishona Daniels +4
cs.DC 2026-04-23 reviewed

Restructured big-integer ops deliver 4x SIMD speedups in libraries
Leveraging SIMD for Accelerating Large-number Arithmetic

Subhrajit Das +2
cs.AR 2026-04-23 reviewed

Online learning delays failure in radiation-exposed spiking nets
Shooting Neutrons at Neurons: Radiation Testing of a Spiking Neural Network on Flash-Based FPGAs

Wim Nijsink +3
quant-ph 2026-04-23 reviewed

Tree-encoded fusion suppresses erasure errors in photonic MBQC
Suppressing the Erasure Error of Fusion Operation in Photonic Quantum Computing

Xiangyu Ren +6
cs.LG 2026-04-23 reviewed

Co-design accelerates multimodal foundation models
Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

Muhammad Shafique +5
cs.ET 2026-04-23 reviewed

HDC operations realized as coherent wave behaviors
A wave-geometric duality for hyperdimensional computing

Tyler L. Poore (Independent Researcher)
cs.AR 2026-04-22 reviewed

Runtime dispatcher shares Versal AI Engine tiles among mixed-criticality tasks
Enabling Mixed criticality applications for the Versal AI-Engines

Vincent Sprave +4
cs.AR 2026-04-22 reviewed

FPGA level-wise batch search speeds B+ tree lookups 4.9x
Efficient Batch Search Algorithm for B+ Tree Index Structures with Level-Wise Traversal on FPGAs

Max Tzschoppe +3
cs.LG 2026-04-22 reviewed

Calibration-free quantization compresses LLMs to 37% size while beating 4-bit methods
FASQ: Flexible Accelerated Subspace Quantization for Calibration-Free LLM Compression

Ye Qiao +4
cs.AR 2026-04-22 reviewed

FPGAs cut carbon in low-volume changing workloads
Evaluating Computing Platforms for Sustainability: A Comparative Analysis of FPGAs against ASICs, GPUs, and CPUs

Chetan Choppali Sudarshan +2
cs.CR 2026-04-22 reviewed

Victim-row counting boosts RowHammer tolerance in DRAM
PVAC: A RowHammer Mitigation Architecture Exploiting Per-victim-row Counting

Jumin Kim +5
cs.AR 2026-04-22 reviewed

Series SRAM cells reduce cache leakage power
A Novel Low-Power Cache Architecture Based on 6-Transistor SRAM Cells

Naser Khatti Dizabadi +1
cs.AR 2026-04-22 reviewed

LLM pipeline completes analog IC design from image to layout
AnalogMaster: Large Language Model-based Automated Analog IC Design Framework from Image to Layout

Xian Rong Qin +5
cs.AR 2026-04-22 reviewed

AI GPU power estimated in seconds with 8% error
EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads

Kyungmi Lee +5
cs.AR 2026-04-21 reviewed

Dropout brings uncertainty estimates to complex-valued neural networks
Algorithm and Hardware Co-Design for Efficient Complex-Valued Uncertainty Estimation

Zehuan Zhang +3
cs.AR 2026-04-21 reviewed

Duon skips TLB shootdowns for hybrid memory page moves
Efficient Page Migration in Hybrid Memory Systems

Upasna +1
quant-ph 2026-04-21 reviewed

Co-designed detection and cancellation cuts logical errors 2-11x
Co-Designing Error Mitigation and Error Detection for Logical Qubits

Rohan S. Kumar +5
cs.AR 2026-04-21 reviewed

Multi-agent system generates correct Verilog at 97% success
ChipCraftBrain: Validation-First RTL Generation via Multi-Agent Orchestration

Cagri Eryilmaz
quant-ph 2026-04-21 reviewed

Workload profile guides surface-code layout to save 21% data tiles
Toward designing workload-aware Surface Code Architectures

Archisman Ghosh +2
cs.AR 2026-04-21 reviewed

Parameterized design hits 11.89 GOP/s/W for LSTM on embedded FPGAs
Energy Efficient LSTM Accelerators for Embedded FPGAs through Parameterised Architecture Design

Chao Qian +2
cs.AR 2026-04-21 reviewed

Metric shows when specialized engines beat FPGA logic for edge models
Design Rules for Extreme-Edge Scientific Computing on AI Engines

Zhenghua Ma +5
cs.AR 2026-04-20 reviewed

Joint chiplet and optical design speeds LLM training
ChipLight: Cross-Layer Optimization of Chiplet Design with Optical Interconnects for LLM Training

Kangbo Bai +3
cs.AR 2026-04-20 reviewed

Ternary memristive junctions store assertions for direct hardware reasoning
Ternary Memristive Logic: Hardware for Reasoning Realized via Domain Algebra

Chao Li
cs.AR 2026-04-20 reviewed

Apple M3 uses about 6 times less energy than AMD Ryzen on key tasks
A Comparative Analysis of ARM and x86-64 Laptop-Class Processors: Architecture, Assembly-Level Performance, and Energy Efficiency

Mustafa Mert Özyılmaz
cs.LG 2026-04-20 reviewed

Surrogate models select better 3D-IC partitions with fewer evaluations
A PPA-Driven 3D-IC Partitioning Selection Framework with Surrogate Models

Shang Wang +5
cs.AR 2026-04-20 reviewed

LLM agents find lower-cost chiplet designs than simulated annealing
CHICO-Agent: An LLM Agent for the Cross-layer Optimization of 2.5D and 3D Chiplet-based Systems

Qihang Wu +2
cs.AR 2026-04-20 reviewed

Branch predictors can be tuned to cut mispredictions in graph apps
Optimizing Branch Predictor for Graph Applications

Upasna +1
cs.LG 2026-04-20 reviewed

AutoPPA learns circuit rules by comparing code variants
AutoPPA: Automated Circuit PPA Optimization via Contrastive Code-based Rule Library Learning

Chongxiao Li +16
cs.ET 2026-04-20 reviewed

Equal inductors turn bridged-T network into high-pass filter
Scattering-Matrix-Based Parametric Characterization of a Two-Port Bridged-T Network for Microstrip Filter Applications

Naser Khatti Dizabadi +1
cs.AR 2026-04-20 reviewed

Contrastive pairs raise Verilog LLM compile and correctness rates
VerilogCL: A Contrastive Learning Framework for Robust LLM-Based Verilog Generation

Yan Tan +3
cs.AR 2026-04-20 reviewed

In-memory quantization breaks PIM capacity wall for LLMs
AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization

Kosuke Matsushima +3
cs.OS 2026-04-20 reviewed

Processes and pipes made lightweight for far memory accelerators
Proxics: an efficient programming model for far memory accelerators

Zikai Liu +5
cs.LG 2026-04-20 reviewed

Dataflow chip outperforms GPUs on autonomous driving AI
M100: An Orchestrated Dataflow Architecture Powering General AI Computing

Yan Xie +36
cs.AR 2026-04-20 reviewed

ZKP kernels reformulated to run 10x faster on TPUs
Enabling AI ASICs for Zero Knowledge Proof

Jianming Tong +8
cs.AR 2026-04-20 reviewed

AccelCIM charts complete dataflow options for SRAM memory chips
AccelCIM: Systematic Dataflow Exploration for SRAM Compute-in-Memory Accelerator

Chenhao Xue +13
cs.AR 2026-04-19 reviewed

Multi-tier KV cache cuts LLM inference costs by 47%
Predictive Multi-Tier Memory Management for KV Cache in Large-Scale GPU Inference

Sanjeev Rao Ganjihal
cs.CR 2026-04-19 reviewed

Offloading avatars privately scales VR to 2.37x more users
Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading

Jianming Tong +7
cs.SE 2026-04-19 reviewed

ML automation targets RISC-V certification costs for cars
RISC-V Functional Safety for Autonomous Automotive Systems: An Analytical Framework and Research Roadmap for ML-Assisted Certification

Nick Andreasyan +4
cs.AR 2026-04-19 reviewed

Stochastic tree search repairs 96.8% of RTL bugs
Clover: A Neural-Symbolic Agentic Harness with Stochastic Tree-of-Thoughts for Verified RTL Repair

Zizhang Luo +8
cs.CR 2026-04-19 reviewed

Bit flips in shared KV caches silently alter LLM outputs
Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems

Yuji Yamamoto +1
cs.AR 2026-04-18 reviewed

Hyperparameter choices matter more than model choice for LLM RTL generation
Configuration Over Selection: Hyperparameter Sensitivity Exceeds Model Differences in Open-Source LLMs for RTL Generation

Minghao Shao +7
cs.AR 2026-04-18 reviewed

IR choice, not LLM, sets hardware design success rates
From Natural Language to Silicon: The Representation Bottleneck in LLM Hardware Design

Weimin Fu +7
cs.LG 2026-04-18 reviewed

Spike sparsity fails to lower latency or energy on Jetson GPU
When Spike Sparsity Does Not Translate to Deployed Cost: VS-WNO on Jetson Orin Nano

Jason Yoo +3
cs.AR 2026-04-18 reviewed

CPU-memory interface fixes close simulator-to-hardware gaps
Different Perspectives of Memory System Simulation

Pouya Esmaili-Dokht +6
cs.AR 2026-04-18 reviewed

Multiplier-free square-root unit hits 7.63 mW and 4.6 ns on FPGA
E2AFS: Energy-Efficient Approximate Floating Point Square Rooter for Error Tolerant Computing

Prateek Goyal +3