pith. sign in

arxiv: 2605.23051 · v1 · pith:AS7SQSS6new · submitted 2026-05-21 · ⚛️ physics.optics · physics.app-ph

General-Purpose Photonic Computing Primitive for Contemporary Artificial Intelligence

Pith reviewed 2026-05-25 05:04 UTC · model grok-4.3

classification ⚛️ physics.optics physics.app-ph
keywords photonic computingoptical neural networktensor coresigned operand encodinghardware-aware trainingAI accelerationVODICDUET
0
0 comments X

The pith

Photonic tensor core uses structural symmetry to encode signed operands directly for AI workloads.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the dynamic universal encoding tensorcore (DUET) as a general-purpose photonic computing primitive. It uses vectorized operand differential interferometric cells (VODICs) that exploit symmetry to provide a full-range linear encoding for signed numbers. This removes common overheads in optical neural networks like sign splitting and preprocessing. The design is tested on various AI tasks with a hardware-aware training approach to manage imperfections. Readers would care as it suggests a path to more efficient optical acceleration for current AI models.

Core claim

By exploiting inherent structural symmetry, this design provides a full-range linear encoding interface that directly accommodates signed operands. This approach eliminates the sign-based path splitting, nonlinear remapping, and auxiliary preprocessing typically required in conventional ONNs.

What carries the argument

Vectorized operand differential interferometric cells (VODICs) in the DUET architecture, which enable full-range linear encoding through structural symmetry.

If this is right

  • Lower latency and reduced hardware overhead for matrix operations in ONNs.
  • Support for diverse modern AI architectures including Transformers without special adaptations.
  • Improved energy efficiency in photonic AI accelerators.
  • More stable inference performance through hardware-aware training.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar symmetry exploitation could be applied to other analog computing platforms to handle signed values.
  • If hardware precision improves, DUET might extend to real-time applications like autonomous systems.
  • Integration with digital systems could become simpler due to reduced preprocessing needs.

Load-bearing premise

The VODIC cells and overall DUET architecture can be realized in hardware with sufficient precision and stability that the hardware-aware training strategy fully compensates for on-chip non-idealities without degrading inference accuracy across diverse models.

What would settle it

Demonstrating a significant drop in accuracy on a Transformer model when deployed on actual photonic hardware compared to software simulation after applying the hardware-aware training.

read the original abstract

Photonic computing offers a promising route to accelerating artificial intelligence (AI) by providing high analog bandwidth, low latency, and low energy consumption. However, existing optical neural networks (ONNs) struggle with substantial hardware overhead and limited support for the dynamic, arbitrary matrix operations essential for modern AI architectures. Here we present the dynamic universal encoding tensorcore (DUET), a general-purpose photonic computing paradigm based on vectorized operand differential interferometric cells (VODICs). By exploiting inherent structural symmetry, this design provides a full-range linear encoding interface that directly accommodates signed operands. This approach eliminates the sign-based path splitting, nonlinear remapping, and auxiliary preprocessing typically required in conventional ONNs, thereby reducing latency and minimizing hardware and memory overhead. We further implement a hardware-aware training (HAT) strategy to alleviate the impact of on-chip non-idealities and ensure stable inference. DUET is experimentally validated across diverse architectures and application domains, ranging from image classification and medical segmentation to Transformer-based content generation, demonstrating competitive performance. By extending optical computing to universal, full-range operators across diverse model architectures, DUET provides a viable pathway toward general-purpose optical acceleration for contemporary AI workloads.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Dynamic Universal Encoding Tensorcore (DUET) photonic computing paradigm based on vectorized operand differential interferometric cells (VODICs). It claims that structural symmetry enables a full-range linear encoding interface for signed operands without sign-based path splitting, nonlinear remapping, or auxiliary preprocessing, thereby reducing hardware overhead in optical neural networks. A hardware-aware training (HAT) strategy is introduced to mitigate on-chip non-idealities. The architecture is experimentally validated on image classification, medical segmentation, and Transformer-based content generation tasks, reporting competitive performance across diverse AI models.

Significance. If the experimental results and hardware assumptions hold, this work offers a meaningful advance toward general-purpose photonic accelerators for contemporary AI by extending optical computing to universal full-range operators. The symmetry-based encoding approach and HAT strategy receive credit for addressing key limitations of prior ONNs; the internal consistency of the symmetry argument provides a solid foundation for the central claim.

major comments (2)
  1. [Experimental validation section] Experimental validation section: the claim of competitive performance and stable inference across tasks is load-bearing for the overall contribution, yet the manuscript provides no quantitative data, error bars, implementation parameters for VODIC cells, or ablation studies on HAT compensation, preventing verification of the results.
  2. [HAT strategy description] Architecture and HAT description: the assertion that HAT fully compensates on-chip non-idealities without accuracy degradation (the weakest assumption) lacks specific analysis of the compensation range or stability bounds, which is required to support the hardware realization claim.
minor comments (2)
  1. Figure legends and captions should explicitly define all acronyms (VODIC, DUET, HAT) on first use and include scale bars or units where applicable.
  2. Ensure consistent notation for the linear encoding range throughout the text and equations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to strengthen the experimental and HAT sections as requested.

read point-by-point responses
  1. Referee: [Experimental validation section] Experimental validation section: the claim of competitive performance and stable inference across tasks is load-bearing for the overall contribution, yet the manuscript provides no quantitative data, error bars, implementation parameters for VODIC cells, or ablation studies on HAT compensation, preventing verification of the results.

    Authors: We agree that the experimental validation section requires expanded quantitative detail for independent verification. The manuscript reports competitive performance on the listed tasks but does not include the requested error bars, VODIC cell parameters, or HAT ablations. In the revised manuscript we will add these elements, including tabulated accuracy metrics with standard deviations from repeated runs, explicit VODIC implementation parameters (phase resolution, insertion loss, and crosstalk values), and ablation results isolating the contribution of HAT. revision: yes

  2. Referee: [HAT strategy description] Architecture and HAT description: the assertion that HAT fully compensates on-chip non-idealities without accuracy degradation (the weakest assumption) lacks specific analysis of the compensation range or stability bounds, which is required to support the hardware realization claim.

    Authors: The referee is correct that the current description of HAT would be strengthened by explicit bounds. While the manuscript shows that HAT enables stable inference in the reported experiments, it does not quantify the compensation range or stability limits. We will add this analysis in revision, providing simulation and measurement results that delineate the range of phase/amplitude errors that can be compensated without measurable accuracy loss and the corresponding stability bounds under thermal and fabrication variation. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's central claims rest on the architectural description of VODIC symmetry enabling direct signed linear encoding and the use of HAT to compensate non-idealities, with validation through experimental results on multiple tasks. No equations, derivations, or self-citations are presented that reduce by construction to fitted inputs or prior author results; the symmetry argument and performance claims are supported by hardware experiments rather than self-referential fitting or renaming. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no equations, parameters, or explicit assumptions; free parameters, axioms, and invented entities cannot be extracted.

pith-pipeline@v0.9.0 · 5756 in / 1019 out tokens · 14467 ms · 2026-05-25T05:04:29.875071+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages · 5 internal anchors

  1. [1]

    & Hinton, G

    LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015)

  2. [2]

    Jordan, M. I. & Mitchell, T. M. Machine learning: trends, perspectives, and prospects. Science 349, 255–260 (2015)

  3. [3]

    Vaswani, A. et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 30, 5998–6008 (2017)

  4. [4]

    Radford, A. et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning 8748–8763 (PMLR, 2021)

  5. [5]

    Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020)

  6. [6]

    Kaplan, J. et al. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361 (2020)

  7. [7]

    Patterson, D. et al. Carbon emissions and large neural network training. arXiv preprint arXiv:2104.10350 (2021)

  8. [8]

    and Manso, G.F., 2020

    Thompson, N.C., Greenewald, K., Lee, K. and Manso, G.F., 2020. The computational limits of deep learning. arXiv preprint arXiv:2007.05558, 10(2)

  9. [9]

    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

    Guo, D., Yang, D., Zhang, H., Song, J., Wang, P., Zhu, Q., Xu, R., Zhang, R., Ma, S., Bi, X. and Zhang, X., 2025. Deepseek- R1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948

  10. [10]

    Touvron, H. et al. Llama 2: open foundation and fine-tuned chat models. arXiv arXiv:2307.09288 (2023)

  11. [11]

    Llama 2 inferencing on a single GPU

    Dell Technologies. Llama 2 inferencing on a single GPU. Info Hub https://infohub.delltechnologies.com/t/llama-2- inferencing-on-a-single-gpu/ (2023). Accessed 1 March 2026

  12. [12]

    NVIDIA A100 Tensor Core GPU datasheet

    NVIDIA. NVIDIA A100 Tensor Core GPU datasheet. https://www.nvidia.com/en-us/data-center/a100/. Accessed 1 March 2026

  13. [13]

    Waldrop, M. M. More than Moore. Nature 530, 144–148 (2016)

  14. [14]

    The future of computing beyond Moore’s Law

    Shalf, J. The future of computing beyond Moore’s Law. Philos. Trans. R. Soc. A 378, 20190061 (2020)

  15. [15]

    1.1 computing’s energy problem and what we can do about it

    Horowitz, M. 1.1 computing’s energy problem and what we can do about it. In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers 10–14 (2014)

  16. [16]

    Fang, F. et al. Towards atomic and close-to-atomic scale manufacturing. Int. J. Extrem. Manuf. 1, 012001 (2019)

  17. [17]

    & Dongarra, J

    Reed, D., Gannon, D. & Dongarra, J. Reinventing high performance computing: challenges and opportunities. arXiv preprint arXiv:2203.02544 (2022)

  18. [18]

    Ning, S. et al. Photonic-electronic integrated circuits for high -performance computing and AI accelerators. J. Lightwave Technol. 42, 7834–7859 (2024)

  19. [19]

    Shastri, B. J. et al. Photonics for artificial intelligence and neuromorphic computing. Nat. Photonics 15, 102–114 (2021)

  20. [20]

    Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photonics 11, 441–446 (2017)

  21. [21]

    Tait, A. N. et al. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep. 7, 7430 (2017)

  22. [22]

    Feldmann, J. et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 589, 52–58 (2021)

  23. [23]

    Xu, X. et al. 11 TOPS photonic convolutional accelerator for optical neural networks. Nature 589, 44–51 (2021)

  24. [24]

    Fu, T. et al. Photonic machine learning with on-chip diffractive optics. Nat. Commun. 14, 70 (2023)

  25. [25]

    Xu, Z. et al. Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence. Science 384, 202–209 (2024)

  26. [26]

    & Fang, L

    Wang, C., Cheng, Y., Xu, Z., Dai, Q. & Fang, L. Diffractive tensorized unit for million-TOPS general-purpose computing. Nat. Photonics 19, 1078–1087 (2025)

  27. [27]

    Meng, X. et al. Compact optical convolution processing unit based on multimode interference. Nat. Commun. 14, 20 3000 (2023)

  28. [28]

    Li, J. et al. End-to-end closed-loop optoelectronic computing breaking precision-accuracy coupling. Adv. Photonics 8, 016005 (2026)

  29. [29]

    Zhu, H. H. et al. Space-efficient optical computing with an integrated chip diffractive neural network. Nat. Commun. 13, 1044 (2022)

  30. [30]

    Feng, C. et al. A compact butterfly-style silicon photonic-electronic neural chip for hardware-efficient deep learning. ACS Photonics 9, 3906–3916 (2022)

  31. [31]

    Ning, S. et al. Hardware-efficient photonic tensor core: accelerating deep neural networks with structured compression. Optica 12, 1079–1089 (2025)

  32. [32]

    V., Byrd, M

    Timurdogan, E., Poulton, C. V., Byrd, M. J. & Watts, M. R. Electric field-induced second-order nonlinear optical effects in silicon waveguides. Nat. Photonics 11, 200–206 (2017)

  33. [33]

    Peltier, J. et al. High-speed silicon photonic electro-optic Kerr modulation. Photonics Res. 12, 51–60 (2023)

  34. [34]

    Xia, P. et al. High linearity silicon DC Kerr modulator enhanced by slow light for 112 Gbit/s PAM4 over 2 km single mode fiber transmission. Opt. Express 30, 16996–17007 (2022)

  35. [35]

    Hiraki, T. et al. Heterogeneously integrated III -V/Si MOS capacitor Mach-Zehnder modulator. Nat. Photonics 11, 482–485 (2017)

  36. [36]

    S., Hu, Y., Li, W

    Berikaa, E., Alam, M. S., Hu, Y., Li, W. & Plant, D. V. C -band 100 Gb/s transmission over 40 km SSMF using a silicon photonic vestigial sideband transmitter based on dual -drive MZM and passive optical delay line. In Optical Fiber Communication Conference (OFC), paper Th3E.7 (2023)

  37. [37]

    & Chen, Y

    Dong, P., Chen, L. & Chen, Y. K. High -speed low-voltage single-drive push-pull silicon Mach-Zehnder modulators. Opt. Express 20, 6163–6169 (2012)

  38. [38]

    & Simard, P

    Chellapilla, K., Puri, S. & Simard, P. High performance convolutional neural networks for document processing. In Tenth International Workshop on Frontiers in Handwriting Recognition (2006)

  39. [39]

    Jia, Y. et al. Caffe: convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM International Conference on Multimedia 675–678 (2014)

  40. [40]

    Xue, Z. et al. Fully forward mode training for optical neural networks. Nature 632, 280–286 (2024)

  41. [41]

    Wu, B. et al. Scaling up for end-to-end on-chip photonic neural network inference. Light Sci. Appl. 14, 328 (2025)

  42. [42]

    Zhao, B. et al. In-situ trained microring-based neural networks for scalable and robust photonic computing. Laser Photonics Rev. 20, 2501576 (2025)

  43. [43]

    & Igel, C

    Stallkamp, J., Schlipsing, M., Salmen, J. & Igel, C. The German traffic sign recognition benchmark: a multi -class classification competition. In The 2011 International Joint Conference on Neural Networks 1453–1460 (2011)

  44. [44]

    Menze, B. H. et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging 34, 1993–2024 (2014)

  45. [45]

    Bakas, S. et al. Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci. Data 4, 170117 (2017)

  46. [46]

    Bakas, S. et al. Segmentation labels for the pre -operative scans of the TCGA-GBM collection. The Cancer Imaging Archive (2017)

  47. [47]

    & Brox, T

    Ronneberger, O., Fischer, P. & Brox, T. U -net: convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention 234–241 (2015)

  48. [48]

    S., Brox, T

    Çiçek, Ö., Abdulkadir, A., Lienkamp, S. S., Brox, T. & Ronneberger, O. 3D U -Net: learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer - 21 Assisted Intervention 424–432 (2016)

  49. [49]

    & Sun, J

    He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016)

  50. [50]

    Radford, A. et al. Language models are unsupervised multitask learners. OpenAI Blog 1, 9 (2019)

  51. [51]

    W., Lee, K

    Devlin, J., Chang, M. W., Lee, K. & Toutanova, K. Bert: pre -training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 Long and Short Papers 4171–4186 (2019)

  52. [52]

    Neural Machine Translation by Jointly Learning to Align and Translate

    Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)

  53. [53]

    Dosovitskiy, A. et al. An image is worth 16x16 words: transformers for image recognition at scale. In International Conference on Learning Representations (2021)

  54. [54]

    Wang, J. et al. Optimization and demonstration of a large-bandwidth carrier-depletion silicon optical modulator. J. Lightwave Technol. 31, 4119–4125 (2013)

  55. [55]

    & Choi, Y

    Holtzman, A., Buys, J., Du, L., Forbes, M. & Choi, Y. The curious case of neural text degeneration. In International Conference on Learning Representations (2020)

  56. [56]

    & Dauphin, Y

    Fan, A., Lewis, M. & Dauphin, Y. Hierarchical neural story generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics Volume 1: Long Papers 889–898 (2018)