archive

Every paper Pith has read. Search by title, abstract, or pith.

493 papers in cs.AR · page 5

  1. cs.AR 2026-04-24 reviewed
    Helpers from high-level features speed HLS verification up to 6x

    AutoINV: Automated Invariant Generation Framework for Formal Verification on High-Level Synthesis Designs

    Xiaofeng Zhou +5

  2. cs.AR 2026-04-24 reviewed
    LLM evolves router code to cut wirelength up to 8.72%

    GR-Evolve: Design-Adaptive Global Routing via LLM-Driven Algorithm Evolution

    Taizun Jafri +1

  3. cs.AR 2026-04-24 reviewed
    Optical neural net design cuts energy-delay product by 64%

    ROSA: Robust and Energy-Efficient Microring-Based Optical Neural Networks via Optical Shift-and-Add and Layer-Wise Hybrid Mapping

    Huifan Zhang +4

  4. cs.AR 2026-04-24 reviewed
    PyTorch SNNs run on FPGAs with exact software accuracy

    Hardware-Software Co-Design for Event-Driven SNN Deployment on Low-Cost Neuromorphic FPGAs

    Jiwoon Lee +3

  5. cs.NI 2026-04-23 reviewed
    SPAC reduces FPGA switch resources by 55% and latency by 38%

    SPAC: Automating FPGA-based Network Switches with Protocol Adaptive Customization

    Guoyu Li +11

  6. cs.NE 2026-04-23 reviewed
    Volatile memristors reach 95.89% MNIST accuracy in reservoir computing

    On the Role of Preprocessing and Memristor Dynamics in Reservoir Computing for Image Classification

    Rishona Daniels +4

  7. cs.DC 2026-04-23 reviewed
    Restructured big-integer ops deliver 4x SIMD speedups in libraries

    Leveraging SIMD for Accelerating Large-number Arithmetic

    Subhrajit Das +2

  8. cs.AR 2026-04-23 reviewed
    Online learning delays failure in radiation-exposed spiking nets

    Shooting Neutrons at Neurons: Radiation Testing of a Spiking Neural Network on Flash-Based FPGAs

    Wim Nijsink +3

  9. quant-ph 2026-04-23 reviewed
    Tree-encoded fusion suppresses erasure errors in photonic MBQC

    Suppressing the Erasure Error of Fusion Operation in Photonic Quantum Computing

    Xiangyu Ren +6

  10. cs.LG 2026-04-23 reviewed
    Co-design accelerates multimodal foundation models

    Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

    Muhammad Shafique +5

  11. cs.ET 2026-04-23 reviewed
    HDC operations realized as coherent wave behaviors

    A wave-geometric duality for hyperdimensional computing

    Tyler L. Poore (Independent Researcher)

  12. cs.AR 2026-04-22 reviewed
    Runtime dispatcher shares Versal AI Engine tiles among mixed-criticality tasks

    Enabling Mixed criticality applications for the Versal AI-Engines

    Vincent Sprave +4

  13. cs.AR 2026-04-22 reviewed
    FPGA level-wise batch search speeds B+ tree lookups 4.9x

    Efficient Batch Search Algorithm for B+ Tree Index Structures with Level-Wise Traversal on FPGAs

    Max Tzschoppe +3

  14. cs.LG 2026-04-22 reviewed
    Calibration-free quantization compresses LLMs to 37% size while beating 4-bit methods

    FASQ: Flexible Accelerated Subspace Quantization for Calibration-Free LLM Compression

    Ye Qiao +4

  15. cs.AR 2026-04-22 reviewed
    FPGAs cut carbon in low-volume changing workloads

    Evaluating Computing Platforms for Sustainability: A Comparative Analysis of FPGAs against ASICs, GPUs, and CPUs

    Chetan Choppali Sudarshan +2

  16. cs.CR 2026-04-22 reviewed
    Victim-row counting boosts RowHammer tolerance in DRAM

    PVAC: A RowHammer Mitigation Architecture Exploiting Per-victim-row Counting

    Jumin Kim +5

  17. cs.AR 2026-04-22 reviewed
    Series SRAM cells reduce cache leakage power

    A Novel Low-Power Cache Architecture Based on 6-Transistor SRAM Cells

    Naser Khatti Dizabadi +1

  18. cs.AR 2026-04-22 reviewed
    LLM pipeline completes analog IC design from image to layout

    AnalogMaster: Large Language Model-based Automated Analog IC Design Framework from Image to Layout

    Xian Rong Qin +5

  19. cs.AR 2026-04-22 reviewed
    AI GPU power estimated in seconds with 8% error

    EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads

    Kyungmi Lee +5

  20. cs.AR 2026-04-21 reviewed
    Dropout brings uncertainty estimates to complex neural networks

    Algorithm and Hardware Co-Design for Efficient Complex-Valued Uncertainty Estimation

    Zehuan Zhang +3

  21. cs.AR 2026-04-21 reviewed
    Duon skips TLB shootdowns for hybrid memory page moves

    Efficient Page Migration in Hybrid Memory Systems

    Upasna +1

  22. quant-ph 2026-04-21 reviewed
    Co-designed detection and cancellation cuts logical errors 2-11x

    Co-Designing Error Mitigation and Error Detection for Logical Qubits

    Rohan S. Kumar +5

  23. cs.AR 2026-04-21 reviewed
    Multi-agent system generates correct Verilog at 97% success

    ChipCraftBrain: Validation-First RTL Generation via Multi-Agent Orchestration

    Cagri Eryilmaz

  24. quant-ph 2026-04-21 reviewed
    Workload profile guides surface-code layout to save 21% data tiles

    Toward designing workload-aware Surface Code Architectures

    Archisman Ghosh +2

  25. cs.AR 2026-04-21 reviewed
    Parameterized design hits 11.89 GOP/s/W for LSTM on embedded FPGAs

    Energy Efficient LSTM Accelerators for Embedded FPGAs through Parameterised Architecture Design

    Chao Qian +2

  26. cs.AR 2026-04-21 reviewed
    Metric shows when specialized engines beat FPGA logic for edge models

    Design Rules for Extreme-Edge Scientific Computing on AI Engines

    Zhenghua Ma +5

  27. cs.AR 2026-04-20 reviewed
    Joint chiplet and optical design speeds LLM training

    ChipLight: Cross-Layer Optimization of Chiplet Design with Optical Interconnects for LLM Training

    Kangbo Bai +3

  28. cs.AR 2026-04-20 reviewed
    Ternary memristive junctions store assertions for direct hardware reasoning

    Ternary Memristive Logic: Hardware for Reasoning Realized via Domain Algebra

    Chao Li

  29. cs.AR 2026-04-20 reviewed
    Apple M3 uses about 6 times less energy than AMD Ryzen on key tasks

    A Comparative Analysis of ARM and x86-64 Laptop-Class Processors: Architecture, Assembly-Level Performance, and Energy Efficiency

    Mustafa Mert Özyılmaz

  30. cs.LG 2026-04-20 reviewed
    Surrogate models select better 3D-IC partitions with fewer evaluations

    A PPA-Driven 3D-IC Partitioning Selection Framework with Surrogate Models

    Shang Wang +5

  31. cs.AR 2026-04-20 reviewed
    LLM agents find lower-cost chiplet designs than simulated annealing

    CHICO-Agent: An LLM Agent for the Cross-layer Optimization of 2.5D and 3D Chiplet-based Systems

    Qihang Wu +2

  32. cs.AR 2026-04-20 reviewed
    Branch predictors can be tuned to cut mispredictions in graph apps

    Optimizing Branch Predictor for Graph Applications

    Upasna +1

  33. cs.LG 2026-04-20 reviewed
    AutoPPA learns circuit rules by comparing code variants

    AutoPPA: Automated Circuit PPA Optimization via Contrastive Code-based Rule Library Learning

    Chongxiao Li +16

  34. cs.ET 2026-04-20 reviewed
    Equal inductors turn bridged-T network into high-pass filter

    Scattering-Matrix-Based Parametric Characterization of a Two-Port Bridged-T Network for Microstrip Filter Applications

    Naser Khatti Dizabadi +1

  35. cs.AR 2026-04-20 reviewed
    Contrastive pairs raise Verilog LLM compile and correctness rates

    VerilogCL: A Contrastive Learning Framework for Robust LLM-Based Verilog Generation

    Yan Tan +3

  36. cs.AR 2026-04-20 reviewed
    In-memory quantization breaks PIM capacity wall for LLMs

    AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization

    Kosuke Matsushima +3

  37. cs.OS 2026-04-20 reviewed
    Processes and pipes made lightweight for far memory accelerators

    Proxics: an efficient programming model for far memory accelerators

    Zikai Liu +5

  38. cs.LG 2026-04-20 reviewed
    Dataflow chip outperforms GPUs on autonomous driving AI

    M100: An Orchestrated Dataflow Architecture Powering General AI Computing

    Yan Xie +36

  39. cs.AR 2026-04-20 reviewed
    ZKP kernels reformulated to run 10x faster on TPUs

    Enabling AI ASICs for Zero Knowledge Proof

    Jianming Tong +8

  40. cs.AR 2026-04-20 reviewed
    AccelCIM charts complete dataflow options for SRAM memory chips

    AccelCIM: Systematic Dataflow Exploration for SRAM Compute-in-Memory Accelerator

    Chenhao Xue +13

  41. cs.AR 2026-04-19 reviewed
    Multi-tier KV cache cuts LLM inference costs by 47%

    Predictive Multi-Tier Memory Management for KV Cache in Large-Scale GPU Inference

    Sanjeev Rao Ganjihal

  42. cs.CR 2026-04-19 reviewed
    Offloading avatars privately scales VR to 2.37x more users

    Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading

    Jianming Tong +7

  43. cs.SE 2026-04-19 reviewed
    ML automation targets RISC-V certification costs for cars

    RISC-V Functional Safety for Autonomous Automotive Systems: An Analytical Framework and Research Roadmap for ML-Assisted Certification

    Nick Andreasyan +4

  44. cs.AR 2026-04-19 reviewed
    Stochastic tree search repairs 96.8% of RTL bugs

    Clover: A Neural-Symbolic Agentic Harness with Stochastic Tree-of-Thoughts for Verified RTL Repair

    Zizhang Luo +8

  45. cs.CR 2026-04-19 reviewed
    Bit flips in shared KV caches silently alter LLM outputs

    Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems

    Yuji Yamamoto +1

  46. cs.AR 2026-04-18 reviewed
    Hyperparameter choices matter more than model choice for LLM RTL generation

    Configuration Over Selection: Hyperparameter Sensitivity Exceeds Model Differences in Open-Source LLMs for RTL Generation

    Minghao Shao +7

  47. cs.AR 2026-04-18 reviewed
    IR choice, not LLM, sets hardware design success rates

    From Natural Language to Silicon: The Representation Bottleneck in LLM Hardware Design

    Weimin Fu +7

  48. cs.LG 2026-04-18 reviewed
    Spike sparsity fails to lower latency or energy on Jetson GPU

    When Spike Sparsity Does Not Translate to Deployed Cost: VS-WNO on Jetson Orin Nano

    Jason Yoo +3

  49. cs.AR 2026-04-18 reviewed
    CPU-memory interface fixes close simulator-to-hardware gaps

    Different Perspectives of Memory System Simulation

    Pouya Esmaili-Dokht +6

  50. cs.AR 2026-04-18 reviewed
    Multiplier-free square-root unit hits 7.63 mW and 4.6 ns on FPGA

    E2AFS: Energy-Efficient Approximate Floating Point Square Rooter for Error Tolerant Computing

    Prateek Goyal +3