Small-scale photonic Kolmogorov-Arnold networks using standard telecom nonlinear modules
Pith reviewed 2026-05-21 09:24 UTC · model grok-4.3
The pith
Small photonic networks using standard telecom modules achieve high accuracy on nonlinear classification and regression tasks with far fewer parameters than software KANs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Replacing linear optical meshes with a small number of trainable nonlinear modules—each consisting of a Mach-Zehnder interferometer, semiconductor optical amplifier, and variable optical attenuators—allows fully optical KANs to deliver strong inference performance on classification, regression, and image tasks while using significantly fewer parameters than equivalent software networks. A four-module implementation reaches 94.3 percent accuracy (IQR 90.3–97.4 percent across ten seeds) on nonlinear classification; a seven-module network reaches R² = 0.986 ± 0.015 on six-input regression. The approach remains effective down to 6-bit input resolution and 14 dB signal-to-noise ratio.
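The parameter-count comparison can be made concrete with a back-of-the-envelope sketch. It assumes that each optical module contributes exactly its four transfer-function parameters and that nothing else is trained. A software KAN is counted as one B-spline per edge with (grid_size + spline_order) coefficients, following the scaling given in the original KAN paper. The function names, layer widths, and grid settings are illustrative assumptions, not the baseline configuration used in the paper.

```python
def photonic_kan_params(n_modules, params_per_module=4):
    """Trainable optical parameters, assuming four per module and no others."""
    return n_modules * params_per_module


def spline_kan_params(layer_widths, grid_size=5, spline_order=3):
    """Approximate coefficient count of a spline-based software KAN.

    Every edge between consecutive layers carries one B-spline with
    (grid_size + spline_order) coefficients. Extra per-edge scale factors
    used by some implementations are ignored here.
    """
    edges = sum(a * b for a, b in zip(layer_widths[:-1], layer_widths[1:]))
    return edges * (grid_size + spline_order)


# Hypothetical comparison: the [2, 5, 1] software network is made up for scale.
optical_small = photonic_kan_params(4)
optical_large = photonic_kan_params(7)
software_example = spline_kan_params([2, 5, 1])
```

Under these assumptions, four modules correspond to 16 trainable optical parameters and seven modules to 28, which is the scale against which any software baseline would have to be compared.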
What carries the argument
The four-parameter optical transfer function obtained from gain saturation in the semiconductor optical amplifier combined with interferometric mixing in the Mach-Zehnder interferometer, which supplies the nonlinear activation for each edge in the KAN graph.
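A minimal sketch of how such a module might be modelled, in Python with NumPy. Everything specific here is an assumption made for illustration and not the paper's model: the SOA sits in one MZI arm with a simple saturable gain G = G0 / (1 + P / P_sat), the couplers are ideal 50/50 splitters, the SOA's carrier-induced phase shift is ignored, and the four parameters are taken to be the small-signal gain g0, the MZI phase phi, and one attenuator transmission per arm (a1, a2).

```python
import numpy as np


def soa_gain(p_in, g0, p_sat):
    """Simplified saturable gain (assumed form): falls off as input power grows."""
    return g0 / (1.0 + p_in / p_sat)


def module_transfer(p_in, g0, phi, a1, a2, p_sat=1.0):
    """Output power of a toy MZI + SOA + VOA module for input power p_in.

    g0  : small-signal SOA gain (linear units)
    phi : phase difference between the two MZI arms (radians)
    a1  : attenuator transmission in the SOA arm, between 0 and 1
    a2  : attenuator transmission in the reference arm, between 0 and 1
    """
    p_in = np.asarray(p_in, dtype=float)
    p_soa_in = a1 * p_in / 2.0                       # half the power, then VOA
    p1 = soa_gain(p_soa_in, g0, p_sat) * p_soa_in    # amplified, saturating arm
    p2 = a2 * p_in / 2.0                             # linear reference arm
    # Two-beam interference at the output coupler.
    return 0.5 * (p1 + p2 + 2.0 * np.sqrt(p1 * p2) * np.cos(phi))


# Example call with arbitrary illustrative settings.
powers = np.linspace(0.0, 5.0, 6)
response = module_transfer(powers, g0=10.0, phi=np.pi / 2, a1=0.8, a2=0.5)
```

In this toy layout the nonlinearity comes from the mismatch between a saturating arm and a linear arm, and the phase sets whether the two arms add or cancel, which is one plausible reading of "gain saturation combined with interferometric mixing".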
If this is right
- Networks of only four optical modules reach 94.3 percent median accuracy on nonlinear classification benchmarks.
- Seven-module networks attain R² = 0.986 on six-input regression tasks.
- Performance stays high under 6-bit input resolution and 14 dB signal-to-noise ratio.
- End-to-end optimization via a differentiable physics model removes the need for separate electronic nonlinear stages.
- The architecture uses substantially fewer parameters than software KAN baselines while remaining fully optical.
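The robustness bullet (6-bit inputs, 14 dB SNR) corresponds to an impairment model along the following lines. This is a hedged sketch: the paper's exact quantiser, noise statistics, and SNR definition are not given in the text above, so uniform quantisation on [0, 1] and additive white Gaussian noise scaled to the mean signal power are assumptions, and the function names are hypothetical.

```python
import numpy as np


def quantize(x, bits, lo=0.0, hi=1.0):
    """Uniformly quantise x to 2**bits levels on the interval [lo, hi]."""
    steps = 2 ** bits - 1
    x = np.clip(x, lo, hi)
    return lo + np.round((x - lo) / (hi - lo) * steps) / steps * (hi - lo)


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so that mean signal power / noise power = SNR."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy copy."""
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))


rng = np.random.default_rng(0)
inputs = np.linspace(0.0, 1.0, 1000)
inputs_6bit = quantize(inputs, bits=6)               # at most 64 distinct levels
readout = add_noise_at_snr(inputs_6bit, snr_db=14.0, rng=rng)
```

Applying both impairments during evaluation, and optionally during training, is the usual way such a sweep is run; whether the paper injects them at the inputs, the outputs, or both is not stated here.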
Where Pith is reading between the lines
- All-optical inference becomes feasible without repeated optical-to-electrical conversions for small KAN models.
- The same module design could be tested on other photonic computing primitives beyond KANs.
- Scaling beyond seven modules would require checking cumulative effects of noise and parameter drift in longer optical paths.
- Experimental realization on a single photonic integrated circuit would directly test the simulation-to-hardware pathway.
Load-bearing premise
The specific nonlinear response produced by gain saturation and interferometric mixing in each Mach-Zehnder-plus-SOA-plus-attenuator module is expressive enough to support effective KAN-style learning despite having only four adjustable parameters.
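One way to probe this premise is to sweep the module's parameters and count how many times the response changes direction. The sketch below does this for a toy module, and its assumptions are the author of this note's, not the paper's: a saturable SOA gain G = G0 / (1 + P / P_sat) in one MZI arm, ideal 50/50 couplers, no SOA-induced phase, and parameters g0, phi, a1, a2. The turning-point counts it produces describe only this toy model.

```python
import numpy as np


def module_transfer(p_in, g0, phi, a1, a2, p_sat=1.0):
    """Toy MZI + SOA + VOA response (assumed form, see lead-in)."""
    p_in = np.asarray(p_in, dtype=float)
    p_soa_in = a1 * p_in / 2.0
    p1 = g0 / (1.0 + p_soa_in / p_sat) * p_soa_in    # saturating SOA arm
    p2 = a2 * p_in / 2.0                             # linear reference arm
    return 0.5 * (p1 + p2 + 2.0 * np.sqrt(p1 * p2) * np.cos(phi))


def count_turning_points(y):
    """Number of sign changes in the discrete slope of y (local extrema)."""
    slopes = np.sign(np.diff(y))
    slopes = slopes[slopes != 0]                     # ignore flat steps
    return int(np.sum(slopes[1:] != slopes[:-1]))


p_in = np.linspace(0.0, 30.0, 601)

# Arms in phase: both contributions grow with input power.
in_phase = count_turning_points(module_transfer(p_in, 10.0, 0.0, 1.0, 1.0))

# Arms in antiphase: the saturating arm is overtaken by the linear arm.
antiphase = count_turning_points(module_transfer(p_in, 10.0, np.pi, 1.0, 1.0))

# Coarse sweep over phase to see which settings give non-monotonic shapes.
phase_sweep = {
    round(float(phi), 2): count_turning_points(
        module_transfer(p_in, 10.0, phi, 1.0, 1.0)
    )
    for phi in np.linspace(0.0, np.pi, 5)
}
```

In this toy model the in-phase setting is monotonic, while the antiphase setting rises, falls to a null where the two arm powers match, and rises again. A comparable sweep on the paper's actual transfer function would show how rich the realizable family is.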
What would settle it
A physical four-module network optimized in simulation fails to exceed 80 percent accuracy on the same nonlinear classification task when implemented with real hardware under measured noise levels and component tolerances.
Original abstract
Photonic neural networks promise ultrafast inference, yet most architectures rely on linear optical meshes with electronic nonlinearities, reintroducing optical-electrical-optical bottlenecks. Here we introduce small-scale photonic Kolmogorov-Arnold networks (SSP-KANs) implemented entirely with standard telecommunications components. Each network edge employs a trainable nonlinear module composed of a Mach-Zehnder interferometer, semiconductor optical amplifier, and variable optical attenuators, providing a four-parameter transfer function derived from gain saturation and interferometric mixing. Despite the constrained functional form of these optical nonlinearities, SSP-KANs comprising only a few optical modules achieve strong nonlinear inference performance across classification, regression, and image recognition tasks, approaching software baselines with significantly fewer parameters. A four-module network achieves 94.3% (IQR: 90.3–97.4%, 10 seeds) accuracy on nonlinear classification benchmarks; a seven-module network attains R² = 0.986 ± 0.015 on six-input regression. Performance remains robust under realistic hardware impairments, maintaining high accuracy down to 6-bit input resolution and 14 dB signal-to-noise ratio. By using a fully differentiable physics model for end-to-end optimisation of optical parameters, this work establishes a practical pathway from simulation to experimental demonstration of photonic KANs using commodity telecom hardware.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces small-scale photonic Kolmogorov-Arnold networks (SSP-KANs) implemented with standard telecom components. Each network edge uses a trainable nonlinear module consisting of a Mach-Zehnder interferometer, semiconductor optical amplifier, and variable optical attenuators, yielding a four-parameter transfer function derived from gain saturation and interferometric mixing. Using end-to-end differentiable physics-based optimization, the authors report that networks with only a few modules achieve 94.3% accuracy (IQR 90.3-97.4%, 10 seeds) on nonlinear classification, R² = 0.986 ± 0.015 on six-input regression, and competitive results on image recognition tasks, while maintaining performance under realistic impairments such as 6-bit resolution and 14 dB SNR.
Significance. If the results hold under experimental validation, this work offers a practical route to ultrafast photonic KANs using commodity hardware, avoiding optical-electrical-optical bottlenecks and employing far fewer parameters than conventional photonic neural networks. The fully differentiable physics model for end-to-end optimization and the inclusion of robustness tests with multiple seeds and IQR reporting are clear strengths that enhance reproducibility and experimental feasibility.
major comments (2)
- §3 (Nonlinear Module Description): The central claim that the constrained four-parameter optical transfer function supports effective KAN-style learning rests on its functional expressivity. The manuscript does not provide an analysis or visualization of the function family realizable by the MZI+SOA+attenuator module (e.g., range of monotonicity, number of inflection points, or ability to approximate non-monotonic shapes), which is load-bearing for generalizing the 94.3% accuracy and R²=0.986 results beyond the tested benchmarks.
- Results section and performance tables: Concrete metrics are reported with IQR and seed counts, yet the text provides no details on simulation validation against physical hardware measurements, error propagation analysis, or explicit baseline comparisons to software KANs or other photonic architectures. This omission undermines verification of the claim that performance approaches software baselines with significantly fewer parameters.
minor comments (2)
- Abstract: The phrase 'six-input regression' lacks a brief description of the underlying task or dataset, which would improve clarity for readers.
- Figure captions (throughout): Adding explicit labels for the network topologies (e.g., number of modules and connectivity) used in each experiment would aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. We address each major comment below and indicate the revisions made to the manuscript.
Point-by-point responses
- Referee: §3 (Nonlinear Module Description): The central claim that the constrained four-parameter optical transfer function supports effective KAN-style learning rests on its functional expressivity. The manuscript does not provide an analysis or visualization of the function family realizable by the MZI+SOA+attenuator module (e.g., range of monotonicity, number of inflection points, or ability to approximate non-monotonic shapes), which is load-bearing for generalizing the 94.3% accuracy and R²=0.986 results beyond the tested benchmarks.
  Authors: We agree that an explicit analysis of the realizable function family would strengthen the central claim. In the revised manuscript we have added a new subsection to §3 together with a supplementary figure that plots the four-parameter transfer function over the full parameter range. The plots show that the module produces strictly monotonic responses for certain parameter regimes and non-monotonic responses with a single inflection point when the MZI phase shift interacts with SOA saturation; the family remains continuous and differentiable, consistent with the end-to-end optimisation used in the work. These additions directly support the reported benchmark performance.
  Revision: yes
- Referee: Results section and performance tables: Concrete metrics are reported with IQR and seed counts, yet the text provides no details on simulation validation against physical hardware measurements, error propagation analysis, or explicit baseline comparisons to software KANs or other photonic architectures. This omission undermines verification of the claim that performance approaches software baselines with significantly fewer parameters.
  Authors: The study is entirely simulation-based using a differentiable physics model of the telecom components. We have now inserted an explicit comparison table in the Results section that reports accuracy and parameter count against both software KAN implementations and representative photonic neural-network architectures from the literature, confirming that the photonic KANs reach within a few percent of software baselines while using orders-of-magnitude fewer trainable parameters. The existing robustness sweeps under 6-bit quantisation and 14 dB SNR already constitute a model-level error-propagation study; we have clarified this point in the text. Direct experimental measurements on fabricated hardware are outside the scope of the present simulation-focused manuscript.
  Revision: partial
- Direct experimental validation against physical hardware measurements
Circularity Check
No significant circularity; performance arises from independent optimization under physics-derived constraints
full rationale
The paper models the nonlinear module via explicit physical equations for gain saturation in the SOA and phase shifts in the MZI, yielding a four-parameter transfer function that is then optimized end-to-end in a differentiable simulator. Reported accuracies (94.3 %) and R² values (0.986) are outputs of this numerical training process on benchmark tasks, not quantities that reduce by construction to the input parameters or to any self-citation. No load-bearing uniqueness theorem, ansatz smuggling, or renaming of known results is present; the functional form is fixed by telecom-component physics rather than fitted post-hoc to the target metrics. The chain therefore remains self-contained and externally falsifiable against software KAN baselines.
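The "optimized end-to-end in a differentiable simulator" step can be illustrated on a single module. The sketch below is an assumption-laden stand-in: it uses a toy module (saturable SOA gain in one MZI arm, ideal couplers, no SOA-induced phase), maps four unconstrained numbers to physical ranges, and fits them to a target curve by gradient descent. Central finite differences with a backtracking step replace the automatic differentiation a differentiable physics model would normally provide, purely to keep the example dependency-free.

```python
import numpy as np


def module_transfer(p_in, theta, p_sat=1.0):
    """Toy module response; theta holds four unconstrained parameters."""
    g0 = np.exp(theta[0])                        # SOA small-signal gain, > 0
    phi = theta[1]                               # MZI phase difference
    a1 = 1.0 / (1.0 + np.exp(-theta[2]))         # VOA transmission, SOA arm
    a2 = 1.0 / (1.0 + np.exp(-theta[3]))         # VOA transmission, other arm
    p_soa_in = a1 * p_in / 2.0
    p1 = g0 / (1.0 + p_soa_in / p_sat) * p_soa_in
    p2 = a2 * p_in / 2.0
    return 0.5 * (p1 + p2 + 2.0 * np.sqrt(p1 * p2) * np.cos(phi))


def mse(theta, p_in, target):
    return float(np.mean((module_transfer(p_in, theta) - target) ** 2))


def numerical_grad(theta, p_in, target, eps=1e-6):
    """Central-difference gradient of the loss (stand-in for autodiff)."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        up = mse(theta + step, p_in, target)
        down = mse(theta - step, p_in, target)
        grad[i] = (up - down) / (2.0 * eps)
    return grad


def fit(theta0, p_in, target, steps=200, lr=0.5):
    """Gradient descent that halves the step until the loss decreases."""
    theta = np.array(theta0, dtype=float)
    loss = mse(theta, p_in, target)
    for _ in range(steps):
        grad = numerical_grad(theta, p_in, target)
        step = lr
        while step > 1e-8:
            candidate = theta - step * grad
            new_loss = mse(candidate, p_in, target)
            if new_loss < loss:
                theta, loss = candidate, new_loss
                break
            step *= 0.5
    return theta, loss


# Synthetic target generated by the same toy model with made-up parameters.
p_in = np.linspace(0.0, 2.0, 50)
target = module_transfer(p_in, np.array([1.0, 2.0, 0.5, -0.5]))
theta0 = np.zeros(4)
initial_loss = mse(theta0, p_in, target)
theta_fit, final_loss = fit(theta0, p_in, target)
```

The point of the sketch is the structure of the loop: the physical form of the response is fixed in advance, and only its four parameters move in response to the task loss, which is why the reported metrics are outputs of training and not restatements of the inputs.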
Axiom & Free-Parameter Ledger
free parameters (1)
- four-parameter transfer function coefficients
axioms (1)
- domain assumption: The combined Mach-Zehnder, SOA, and attenuator module produces a differentiable nonlinear response suitable for gradient-based optimization.