{"total":16,"items":[{"citing_arxiv_id":"2605.21171","ref_index":16,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"FTerViT: Fully Ternary Vision Transformer","primary_cat":"cs.CV","submitted_at":"2026-05-20T13:41:53+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"FTerViT introduces fully ternary Vision Transformers with TernaryBitConv2d and TernaryLayerNorm operators, achieving 82.43% ImageNet top-1 at 6.09 MB with 15x compression.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11558","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Composite Activation Function for Learning Stable Binary Representations","primary_cat":"cs.LG","submitted_at":"2026-05-12T05:41:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"HTAF is a sigmoid-tanh composite that approximates the Heaviside function to allow stable gradient training of binary activation networks, yielding ICBMs with stable discretization and competitive performance on image tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Recently, the Heaviside activation function has attracted increasing attention again in the AI community due to their potential benefits in memory and computational efficiency ([77, 14, 52]), interpretability ([73, 79, 40]), and connections to biological neural networks ([83, 41, 66]). Furthermore, the Heaviside activation function is closely related to quantized and binarized neural networks ([11, 26, 52]), which have become increasingly important for energy-efficient deployment of large deep neural networks such as large language models (LLMs, [70, 44]). Preprint. 
arXiv:2605.11558v1 [cs.LG] 12 May 2026 Although the Heaviside activation function offers several advantages, training neural networks that use it with gradient-based optimization remains difficult because the function is inherently non-"},{"citing_arxiv_id":"2605.09604","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"DAP: Doppler-aware Point Network for Heterogeneous mmWave Action Recognition","primary_cat":"cs.CV","submitted_at":"2026-05-10T15:34:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces the first heterogeneous multi-source mmWave point cloud HAR dataset and DAP-Net architecture with Doppler reparameterization and text alignment for cross-source robustness.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"be interpreted as a soft dot product between the sorted Doppler values and the Gaussian weight distribution. Given the threshold τ_t, each Doppler value v_t^i is compared against it to obtain a soft motion indicator: s_t^i = σ((v_t^i − τ_t) / γ), (2) where γ controls the sharpness of the transition. The resulting scores are then binarized using a Straight-Through Estimator (STE) [4,10], maintaining explicit partitioning in the forward pass while propagating gradients through the soft probabilities s_t^i in backpropagation. 
By modeling relative statistics via quantiles, DSQ achieves stable motion-aware partitioning under heterogeneous multi-source distributions, effectively mitigating source-induced statistical shifts and providing consistent motion priors for subsequent geometric reparameterization."},{"citing_arxiv_id":"2605.10989","ref_index":22,"ref_count":2,"confidence":0.98,"is_internal_anchor":true,"paper_title":"SURGE: Surrogate Gradient Adaptation in Binary Neural Networks","primary_cat":"cs.LG","submitted_at":"2026-05-09T09:52:38+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.03396","ref_index":7,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Design and Implementation of BNN-Based Object Detection on FPGA","primary_cat":"cs.AR","submitted_at":"2026-05-05T06:16:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"A BNN-based YOLOv3-tiny-like object detector with 1-bit weights and 8-bit activations is implemented in Verilog on FPGA, achieving 39.6% mAP50 on VOC and 0.999964 correlation with the ONNX model in RTL simulation.","context_count":1,"top_context_role":"method","top_context_polarity":"use_method","context_text":"779-788 (2016). [4] Redmon, J., Farhadi, A.: YOLO9000: Better, Faster, Stronger. In: CVPR, pp. 7263-7271 (2017). [5] Redmon, J., Farhadi, A.: YOLOv3: An Incremental Improvement. arXiv:1804.02767 (2018). [6] Courbariaux, M., Bengio, Y., David, J.P.: BinaryConnect: Training Deep Neural Networks with Binary Weights during Propagations. In: NeurIPS (2015). [7] Hubara, I., Courbariaux, M., Soudry, D., et al.: Binarized Neural Networks. arXiv:1602.02830 (2016). 
[8] Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. In: ECCV, pp. 525-542 (2016). [9] Zhou, S., Wu, Y., Ni, Z., et al.: DoReFa-Net: Training Low Bitwidth Convolutional Neural"},{"citing_arxiv_id":"2604.19167","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation","primary_cat":"cs.LG","submitted_at":"2026-04-21T07:25:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LBLLM achieves better accuracy than prior binarization methods for LLMs by decoupling weight and activation quantization through initialization, layer-wise distillation, and learnable activation scaling.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2508.06974","ref_index":7,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models","primary_cat":"cs.CL","submitted_at":"2025-08-09T13:00:16+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A progressive training scheme with binary-aware initialization and dual-scaling allows pre-trained LLMs to be converted to high-performance 1-bit models without training from scratch.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2507.16079","ref_index":1,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A Lower Bound for the Number of Linear Regions of Ternary ReLU Regression Neural 
Networks","primary_cat":"cs.LG","submitted_at":"2025-07-21T21:29:33+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Proves polynomial-in-width and exponential-in-depth lower bounds on linear regions for ternary ReLU regression networks, with width-doubling constructions achieving bounds comparable to unrestricted ReLU networks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2212.08989","ref_index":297,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Deep learning applied to computational mechanics: A comprehensive review, state of the art, and the classics","primary_cat":"cs.LG","submitted_at":"2022-12-18T02:03:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A comprehensive review of deep learning techniques for computational mechanics, including LSTM for constitutive modeling, PINNs for PDE solving, optimizers, and kernel methods.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2208.07339","ref_index":125,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale","primary_cat":"cs.LG","submitted_at":"2022-08-15T17:08:50+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LLM.int8() performs 8-bit inference for transformers up to 175B parameters with no accuracy loss by combining vector-wise quantization for most features with 16-bit mixed-precision handling of systematic outlier 
dimensions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1907.10804","ref_index":3,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Co-Evolutionary Compression for Unpaired Image Translation","primary_cat":"cs.CV","submitted_at":"2019-07-25T02:26:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A co-evolutionary compression technique reduces parameters and FLOPs in unpaired image-to-image translation GAN generators while maintaining translation quality on benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1907.10159","ref_index":15,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Efficient Detection and Quantification of Timing Leaks with Neural Networks","primary_cat":"cs.CR","submitted_at":"2019-07-23T22:24:20+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Neural networks are trained as timing models of programs and analyzed via MILP to detect and quantify timing side-channel information leaks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1907.09077","ref_index":12,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron SuperconductingTechnology","primary_cat":"cs.NE","submitted_at":"2019-07-22T01:44:49+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Proposes first stochastic-computing DNN acceleration framework tailored to AQFP superconducting 
technology.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1906.12172","ref_index":3,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"New pointwise convolution in Deep Neural Networks through Extremely Fast and Non Parametric Transforms","primary_cat":"cs.CV","submitted_at":"2019-06-25T10:47:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Replacing pointwise convolutions with DWHT yields a model with 79.1% fewer parameters, 48.4% fewer FLOPs, and 1.49% higher accuracy than MobileNet-V1 on CIFAR-100.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1906.09395","ref_index":24,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Adaptive Precision CNN Accelerator Using Radix-X Parallel Connected Memristor Crossbars","primary_cat":"eess.SP","submitted_at":"2019-06-22T06:14:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Radix-5 memristor crossbar CNN accelerator reaches 90.5% CIFAR-10 accuracy with 46% area reduction by using variable memristor counts and single-column signed weights.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"1906.09889","ref_index":13,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Improving Branch Prediction By Modeling Global History with Convolutional Neural Networks","primary_cat":"cs.DC","submitted_at":"2019-06-20T23:19:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"CNNs applied to global history improve prediction accuracy for hard-to-predict branches in SPEC 2017, with hardware-adapted inference and reusability across 
inputs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}