arxiv: 2604.14645 · v1 · submitted 2026-04-16 · 💻 cs.CV · cs.AI · nlin.CD

Chaotic CNN for Limited Data Image Classification

Pith reviewed 2026-05-10 12:13 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · nlin.CD
keywords chaotic maps · feature transformation · limited data classification · image classification · convolutional neural networks · generalization improvement · nonlinear dynamics

The pith

Chaotic nonlinear maps applied to normalized CNN features improve accuracy on limited-data image classification tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether inserting a fixed nonlinear transformation drawn from a chaotic map into the feature pipeline of a convolutional network can reduce overfitting when training data are scarce. The transformation is applied after L2 normalization of the features extracted by the CNN backbone and immediately before the classification layer. It uses one of three standard chaotic maps (logistic, skew tent, or sine) and adds no trainable parameters or extra layers. Experiments on MNIST, Fashion-MNIST, and CIFAR-10 with 40 to 200 samples per class show consistent accuracy gains over identical networks without the map, with the largest recorded improvement reaching 9.11 percent. The authors attribute the benefit to the shared nonlinear and dynamical character of chaotic systems rather than to any single map.
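
The pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the logistic parameter r = 4.0 is an assumed setting, a single application of the map per feature is an assumption, and features are assumed non-negative (for example post-ReLU) so that normalized values fall in [0, 1].

```python
import math
from typing import Callable, List


def l2_normalize(features: List[float], eps: float = 1e-12) -> List[float]:
    """Scale a feature vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in features))
    return [x / (norm + eps) for x in features]


def logistic_map(x: float, r: float = 4.0) -> float:
    """One step of the logistic map. r = 4.0 is an assumed parameter value."""
    return r * x * (1.0 - x)


def chaotic_transform(
    features: List[float],
    chaotic_map: Callable[[float], float] = logistic_map,
) -> List[float]:
    """Normalize backbone features, then apply a fixed map element-wise.

    Assumption: each feature passes through the map once. The output would
    be fed to the classification layer; no trainable parameters are added.
    """
    normalized = l2_normalize(features)
    return [chaotic_map(x) for x in normalized]


if __name__ == "__main__":
    backbone_features = [3.0, 4.0]  # toy stand-in for CNN backbone output
    print(chaotic_transform(backbone_features))
```

Because the map is fixed, the transform can sit between an existing backbone and its classifier without changing the parameter count.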

Core claim

By inserting a fixed nonlinear transformation drawn from one of three chaotic maps into the feature pipeline of a convolutional neural network, the method reshapes the representation space so that classes become more separable even when only a few dozen examples per class are available for training. The transformation is parameter-free and is placed immediately before the final classification layer. Across three standard image datasets and CNNs of different depths, this single change produces accuracy improvements that range from a few percent to over nine percent relative to the identical network trained without the transformation. The gains appear for all three maps tested, suggesting that the benefit comes from properties the maps share rather than from any single map.

What carries the argument

A fixed nonlinear map (logistic, skew tent, or sine) applied to L2-normalized feature vectors extracted by the CNN backbone before the classification layer.
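
For reference, the normalization and the three maps named above can be written in their common textbook forms. The review does not report the paper's parameter values or its exact formulation, so the symbols r, b, and a and their ranges below are assumptions taken from the standard definitions on the unit interval.

```latex
% L2 normalization of a feature vector z
\[ \hat{z} = \frac{z}{\lVert z \rVert_2} \]

% Logistic map
\[ f_{\mathrm{logistic}}(x) = r\,x\,(1 - x), \qquad 0 < r \le 4 \]

% Skew tent map
\[ f_{\mathrm{tent}}(x) =
   \begin{cases}
     \dfrac{x}{b}, & 0 \le x < b, \\[6pt]
     \dfrac{1 - x}{1 - b}, & b \le x \le 1,
   \end{cases}
   \qquad 0 < b < 1 \]

% Sine map
\[ f_{\mathrm{sine}}(x) = a \sin(\pi x), \qquad 0 < a \le 1 \]
```

With these parameter ranges each map sends [0, 1] into [0, 1], which is why the normalized features need to lie in that interval before the map is applied.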

Load-bearing premise

The accuracy gains arise specifically from the nonlinear and dynamical properties shared by chaotic systems rather than from generic nonlinearity or the normalization step alone.

What would settle it

Running the identical experiments with a non-chaotic but strongly nonlinear function such as a sigmoid or tanh applied after the same normalization, and observing no accuracy gain, would indicate that the chaotic character is not required.
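
One way to set up such a control is to keep the normalization and insertion point fixed and swap only the element-wise function. The sketch below is illustrative: the names `TRANSFORMS` and `transform_features` are hypothetical, the skew tent parameter b = 0.499 is an assumed value, and the `identity` arm is included to isolate the effect of normalization alone.

```python
import math
from typing import Callable, Dict, List


def l2_normalize(features: List[float], eps: float = 1e-12) -> List[float]:
    """Scale a feature vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in features))
    return [x / (norm + eps) for x in features]


def sigmoid(x: float) -> float:
    """Smooth, non-chaotic nonlinearity used as a control."""
    return 1.0 / (1.0 + math.exp(-x))


def skew_tent(x: float, b: float = 0.499) -> float:
    """Skew tent map. b = 0.499 is an assumed parameter, not from the paper."""
    return x / b if x < b else (1.0 - x) / (1.0 - b)


# Each arm of the ablation differs only in the element-wise function.
TRANSFORMS: Dict[str, Callable[[float], float]] = {
    "identity": lambda x: x,  # normalization only
    "tanh": math.tanh,        # non-chaotic control
    "sigmoid": sigmoid,       # non-chaotic control
    "skew_tent": skew_tent,   # chaotic map
}


def transform_features(features: List[float], name: str) -> List[float]:
    """Apply the named transform to L2-normalized features."""
    fn = TRANSFORMS[name]
    return [fn(x) for x in l2_normalize(features)]


if __name__ == "__main__":
    for name in TRANSFORMS:
        print(name, transform_features([3.0, 4.0], name))
```

Training the same network once per arm, with everything else held fixed, would separate the contribution of the map from that of normalization and generic nonlinearity.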

Original abstract

Convolutional neural networks (CNNs) often exhibit poor generalisation in limited training data scenarios due to overfitting and insufficient feature diversity. In this work, a simple and effective chaos-based feature transformation is proposed to enhance CNN performance without increasing model complexity. The method applies nonlinear transformations using logistic, skew tent, and sine maps to normalised feature vectors before the classification layer, thereby reshaping the feature space and improving class separability. The approach is evaluated on greyscale datasets (MNIST and Fashion-MNIST) and an RGB dataset (CIFAR-10) using CNN architectures of varying depth under limited data conditions. The results show consistent improvement over the standalone (SA) CNN across all datasets. Notably, a maximum performance gain of 5.43% is achieved on MNIST using the skew tent map with a 3-layer CNN at 40 samples per class. A higher gain of 9.11% is observed on Fashion-MNIST using the sine map with a 3-layer CNN at 50 samples per class. Additionally, a strong gain of 7.47% is obtained on CIFAR-10 using the skew tent map at 200 samples per class. The consistent improvements across different chaotic maps indicate that the performance gain is driven by the shared nonlinear and dynamical properties of chaotic systems. The proposed method is computationally efficient, requires no additional trainable parameters, and can be easily integrated into existing CNN architectures, making it a practical solution for data-scarce image classification tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a simple chaos-based feature transformation for CNNs in limited-data regimes: normalized feature vectors are passed through one of three chaotic maps (logistic, skew tent, sine) immediately before the classification layer. This is evaluated on MNIST, Fashion-MNIST and CIFAR-10 using 3- and 5-layer CNNs at varying samples-per-class (40–200), with reported accuracy gains over the standalone CNN baseline reaching 5.43 %, 9.11 % and 7.47 % respectively. The authors conclude that the consistent gains across maps demonstrate that performance improvement stems from the shared nonlinear and dynamical properties of chaotic systems. The method adds no trainable parameters and is claimed to be computationally lightweight.

Significance. If the reported gains prove robust and are shown to arise specifically from chaotic dynamics rather than generic nonlinearity plus normalization, the technique would supply a parameter-free, plug-in enhancement for data-scarce image classification. Its simplicity and lack of architectural overhead are genuine practical strengths. At present, however, the attribution to chaos remains an untested hypothesis, so the work’s immediate impact is limited to an interesting empirical observation rather than a validated methodological advance.

major comments (2)
  1. [Abstract and §4] Abstract and §4 (Results): the central claim that “the performance gain is driven by the shared nonlinear and dynamical properties of chaotic systems” is unsupported. The experiments compare only against the standalone CNN; no ablation replaces the chaotic maps with non-chaotic nonlinearities (element-wise tanh, ReLU, or low-order polynomials) applied to the identical normalized feature vectors at the same insertion point. Without this control the attribution cannot be distinguished from generic nonlinearity.
  2. [§3 and §4] §3 (Experimental Setup) and §4: no statistical significance tests, standard deviations across random seeds, or multiple-run averages are reported for the quoted gains (5.43 %, 9.11 %, 7.47 %). It is therefore impossible to assess whether the improvements exceed run-to-run variability or hyper-parameter sensitivity.
minor comments (2)
  1. [§2] §2 (Method): the precise insertion point and normalization formula would be clearer if presented as a short equation or pseudocode rather than prose only.
  2. [Tables and Figures] Table captions and axis labels in the result figures should explicitly state the number of independent runs and whether error bars represent standard deviation or standard error.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major comment below and describe the revisions we will make to strengthen the work.

Point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (Results): the central claim that “the performance gain is driven by the shared nonlinear and dynamical properties of chaotic systems” is unsupported. The experiments compare only against the standalone CNN; no ablation replaces the chaotic maps with non-chaotic nonlinearities (element-wise tanh, ReLU, or low-order polynomials) applied to the identical normalized feature vectors at the same insertion point. Without this control the attribution cannot be distinguished from generic nonlinearity.

    Authors: We agree that the current experiments lack direct controls comparing the chaotic maps to non-chaotic nonlinearities applied at the same point on normalized features. This leaves open the possibility that the gains arise from generic nonlinearity rather than chaotic dynamics specifically. To resolve this, we will add ablation studies in the revised manuscript using element-wise tanh, ReLU, and low-order polynomial transformations on the identical normalized feature vectors. We will report the resulting accuracies and discuss whether the chaotic maps provide advantages beyond these baselines. revision: yes

  2. Referee: [§3 and §4] §3 (Experimental Setup) and §4: no statistical significance tests, standard deviations across random seeds, or multiple-run averages are reported for the quoted gains (5.43 %, 9.11 %, 7.47 %). It is therefore impossible to assess whether the improvements exceed run-to-run variability or hyper-parameter sensitivity.

    Authors: We acknowledge that single-run results without reported variability or statistical tests make it difficult to evaluate the robustness of the gains. In the revised manuscript we will rerun all experiments across multiple random seeds (at least five), report mean accuracies with standard deviations, and include statistical significance tests (e.g., paired t-tests) comparing each chaotic variant to the standalone CNN baseline. revision: yes
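
The multi-seed comparison promised in this response could be computed along the following lines. This is a sketch under stated assumptions: the accuracy values are placeholders chosen for illustration and are not results from the paper, and the function name `paired_t_statistic` is hypothetical. It uses only the Python standard library.

```python
import math
import statistics
from typing import List, Tuple


def paired_t_statistic(
    baseline: List[float], variant: List[float]
) -> Tuple[float, float, float]:
    """Return mean difference, sample std of differences, and paired t.

    Each position holds the accuracy from one random seed, so the two
    lists are paired seed by seed.
    """
    if len(baseline) != len(variant) or len(baseline) < 2:
        raise ValueError("need two equal-length lists with at least 2 runs")
    diffs = [v - b for b, v in zip(baseline, variant)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t_stat


# Placeholder accuracies in percent, one per seed. NOT taken from the paper.
baseline_acc = [80.0, 81.0, 79.0, 82.0, 80.0]
chaotic_acc = [83.0, 83.0, 82.0, 84.0, 84.0]

if __name__ == "__main__":
    mean_diff, sd_diff, t_stat = paired_t_statistic(baseline_acc, chaotic_acc)
    print(f"mean gain = {mean_diff:.2f}, sd = {sd_diff:.2f}, t = {t_stat:.2f}")
```

The resulting statistic would be compared against Student's t distribution with n - 1 degrees of freedom, where n is the number of seeds.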

Circularity Check

0 steps flagged

No circularity; purely empirical evaluation of known maps

full rationale

The paper applies standard chaotic maps (logistic, skew-tent, sine) taken from prior dynamical-systems literature to normalized feature vectors inside existing CNN architectures and measures accuracy gains against standalone CNN baselines on MNIST, Fashion-MNIST and CIFAR-10 under limited-data regimes. No derivation, uniqueness theorem, fitted parameter, or self-citation is invoked to produce the reported numbers; the maps are inserted by ansatz and the gains are observed outcomes. The interpretive claim that gains stem from 'shared nonlinear and dynamical properties' is unsupported by controls but does not constitute a circular reduction of any result to the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on the domain assumption that chaotic maps enhance separability; no free parameters or new entities are introduced.

axioms (1)
  • domain assumption: Nonlinear transformations from chaotic maps improve class separability in normalized feature space
    Invoked to explain why the method works across different maps and datasets.

