Artificial Adaptive Intelligence: The Missing Stage Between Narrow and General Intelligence

Boris Kriuk

arxiv: 2605.16844 · v1 · pith:23WWUGMBnew · submitted 2026-05-16 · 💻 cs.AI

Pith reviewed 2026-05-19 21:08 UTC · model grok-4.3

classification 💻 cs.AI

keywords artificial adaptive intelligence · hyperparameter removal · adaptivity index · parametric minimality · meta-learning · evolutionary computation · minimum description length · self-adapting systems

The pith

The path from narrow AI to general AI goes through an intermediate stage of Artificial Adaptive Intelligence that eliminates human-specified hyperparameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that current AI systems sit between narrow, task-specific tools and hypothetical general intelligence, in a regime where meta-learning, architecture search, AutoML, and related techniques are converging on letting the system itself handle what humans used to tune. By naming this regime Artificial Adaptive Intelligence (AAI), the author proposes measuring how far a system can adapt without external hyperparameter input while keeping competitive results on varied tasks. This matters because it offers a practical yardstick for progress that does not rely solely on bigger models or more data. A sympathetic reader would see it as a way to focus effort on self-configuring systems that could bridge to more flexible intelligence.

Core claim

The paper claims that between narrow and general intelligence there exists a distinct regime called Artificial Adaptive Intelligence, defined operationally as systems that require no human-specified tunable hyperparameters while maintaining competitive performance across diverse tasks. It introduces an adaptivity index to quantify this by combining the fraction of absorbed hyperparameters with performance ratios, grounds parametric minimality in minimum description length principles, and outlines three pathways: data- and task-aware configuration, structural and evolutionary morphing, and in-training self-adaptation.

What carries the argument

The adaptivity index, which quantifies progress by measuring the portion of hyperparameters the system handles internally against its performance relative to specialized baselines.
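As a rough illustration of how such an index might be computed, the Python sketch below multiplies the absorbed-hyperparameter fraction by a performance ratio capped at 1. The product form, the cap, and all names (`SystemReport`, `adaptivity_index`) are assumptions made for illustration; the paper's own formula is not reproduced on this page.

```python
from dataclasses import dataclass


@dataclass
class SystemReport:
    """Hypothetical record of one system evaluated on one task distribution."""
    total_hyperparameters: int     # tunable knobs in the conventional pipeline
    absorbed_hyperparameters: int  # knobs the system now sets for itself
    system_score: float            # higher is better
    baseline_score: float          # task-specialized baseline, same metric


def adaptivity_index(report: SystemReport) -> float:
    """Assumed form: absorbed fraction times capped performance ratio."""
    if report.total_hyperparameters <= 0:
        raise ValueError("total_hyperparameters must be positive")
    if not 0 <= report.absorbed_hyperparameters <= report.total_hyperparameters:
        raise ValueError("absorbed_hyperparameters must lie in [0, total]")
    if report.baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    absorbed_fraction = report.absorbed_hyperparameters / report.total_hyperparameters
    # Cap at 1 so that beating the baseline cannot offset leftover manual tuning.
    performance_ratio = min(report.system_score / report.baseline_score, 1.0)
    return absorbed_fraction * performance_ratio


# Illustrative numbers only, not taken from the paper.
example = SystemReport(
    total_hyperparameters=8,
    absorbed_hyperparameters=6,
    system_score=0.90,
    baseline_score=0.95,
)
print(f"adaptivity index: {adaptivity_index(example):.3f}")
```

Under this assumed form, a system scores 1 only when it absorbs every hyperparameter and matches the baseline, and it scores 0 when it absorbs none, regardless of accuracy.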

If this is right

Measurement of AI progress should include an adaptivity dimension separate from scaling laws.
Development should prioritize methods that internalize hyperparameter choices through data-driven or evolutionary means.
Success criteria shift toward systems that generalize across tasks with minimal human intervention in setup.
Applications in fields like aerospace design and turbulence modeling benefit from self-adapting models that handle regime changes automatically.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If this framing is adopted, benchmarks for AI would need to test systems in hyperparameter-free modes across changing task distributions.
Research might prioritize integration of meta-learning and evolutionary computation to achieve parametric minimality faster.
This stage implies that true generality may emerge from cumulative adaptation rather than sudden jumps in capability.

Load-bearing premise

The various techniques in meta-learning, architecture search, and related areas have already converged on the principle of steadily removing human involvement in specifying parameters.

What would settle it

A survey or experiment showing that leading implementations in these fields still depend on substantial human-specified hyperparameters for competitive performance on diverse tasks would undermine the convergence claim.

Original abstract

Between the narrow systems we deploy and the general intelligence we speculate about lies an entire regime of machine behavior that has never received its own name. This monograph argues that this regime is not empty: it is where meta-learning, neural architecture search, AutoML, continual learning, evolutionary computation, and physics-informed modeling have quietly converged on a common principle, namely the steady removal of the human from the loop of parameter specification. We name this regime Artificial Adaptive Intelligence (AAI) and define it operationally: a system exhibits AAI to the extent that it requires no human-specified tunable hyperparameters while maintaining competitive performance across a diverse distribution of tasks. To make the definition quantitative, we introduce an adaptivity index that measures progress along an axis orthogonal to scale, combining the fraction of hyperparameters absorbed by the system with the performance ratio against a task-specialized baseline. We develop the principle of parametric minimality and ground it in the minimum description length framework, showing that the appropriate hyperparameter count is data-determined rather than designer-determined. We then organize the field around three pathways to minimality: data- and task-aware configuration, structural and evolutionary morphing, and in-training self-adaptation. We analyze their stability, convergence, and governance implications, and illustrate them through case studies spanning aerospace design, financial regime detection, turbulence modeling, ecological dynamics, and vision-language systems. The thesis is that the path from ANI to AGI passes through AAI, and that naming this stage changes what we measure, what we build, and what we call a success.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

This paper names Artificial Adaptive Intelligence as a stage between narrow and general AI and offers an adaptivity index, but the convergence claim across fields rests on assertion rather than evidence.

This paper's main move is to label the space between narrow AI and AGI as Artificial Adaptive Intelligence, defined by systems that absorb hyperparameters without human input while holding performance. It introduces an adaptivity index that blends the fraction of absorbed parameters with a performance ratio to a baseline, and it grounds the idea in minimum description length to argue that the right hyperparameter count is data-driven.

Referee Report

3 major / 2 minor

Summary. The paper claims that an unnamed intermediate regime exists between narrow AI (ANI) and general AI (AGI), which the authors name Artificial Adaptive Intelligence (AAI). This regime is defined operationally as systems that require no human-specified tunable hyperparameters while maintaining competitive performance across tasks. The manuscript introduces a quantitative adaptivity index (fraction of absorbed hyperparameters combined with performance ratio to a task-specialized baseline), grounds a principle of parametric minimality in the minimum description length framework, organizes the literature around three pathways to minimality, analyzes stability/governance implications, and illustrates the ideas with case studies from aerospace, finance, turbulence, ecology, and vision-language systems. The central thesis is that progress from ANI to AGI necessarily passes through AAI.

Significance. If the claimed convergence on parametric minimality across the cited subfields can be substantiated and the adaptivity index shown to be non-circular, the framework could reorient evaluation and design priorities away from scale toward measurable autonomy in configuration. The explicit naming and quantitative index orthogonal to model size represent a conceptual contribution that might influence how success is defined in meta-learning and AutoML research.

major comments (3)

[Abstract and §2] (operational definition): The claim that meta-learning, neural architecture search, AutoML, continual learning, evolutionary computation, and physics-informed modeling have converged on 'steady removal of the human from the loop of parameter specification' is asserted without a comparative analysis of remaining human choices. Search-space boundaries, meta-objective formulations, initial population distributions, and physics-constraint selections remain human-specified in these fields, directly contradicting the AAI requirement of zero human-specified tunable hyperparameters.
[§3] (adaptivity index definition): The index is defined as a combination of absorbed-hyperparameter fraction and performance ratio against a task-specialized baseline. Because the baseline itself is typically tuned with human-chosen hyperparameters, the index risks measuring a quantity that is partly defined by the human input it claims to have removed, rendering the 'parameter-free' interpretation circular.
[§5] (three pathways and stability analysis): The pathways (data- and task-aware configuration, structural/evolutionary morphing, in-training self-adaptation) are presented as routes to parametric minimality, yet no derivations, convergence proofs, or error bounds are supplied showing that any pathway actually reaches zero human-specified hyperparameters while preserving the claimed stability and governance properties.

minor comments (2)

[Case studies section] The case studies would be strengthened by explicit numerical values of the adaptivity index for each example rather than qualitative description.
[§3] Notation for the adaptivity index should be introduced with a numbered equation to facilitate later reference.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major point below, indicating where we agree and how we will revise the text. Our responses focus on clarifying distinctions, strengthening definitions, and acknowledging scope limitations while preserving the core conceptual contribution.

Referee: [Abstract and §2] (operational definition): The claim that meta-learning, neural architecture search, AutoML, continual learning, evolutionary computation, and physics-informed modeling have converged on 'steady removal of the human from the loop of parameter specification' is asserted without a comparative analysis of remaining human choices. Search-space boundaries, meta-objective formulations, initial population distributions, and physics-constraint selections remain human-specified in these fields, directly contradicting the AAI requirement of zero human-specified tunable hyperparameters.

Authors: We agree that the manuscript would benefit from an explicit comparative analysis distinguishing fixed, high-level design choices (e.g., search-space boundaries or meta-objective formulations) from per-task tunable hyperparameters. Our operational definition of AAI targets the elimination of the latter. In the revised version we will insert a new subsection in §2 that systematically examines each cited field, documenting the historical reduction in tunable hyperparameters while noting persistent framework-level decisions. This addition will refine rather than retract the convergence claim.

revision: partial

Referee: [§3] (adaptivity index definition): The index is defined as a combination of absorbed-hyperparameter fraction and performance ratio against a task-specialized baseline. Because the baseline itself is typically tuned with human-chosen hyperparameters, the index risks measuring a quantity that is partly defined by the human input it claims to have removed, rendering the 'parameter-free' interpretation circular.

Authors: The referee correctly identifies a risk of circularity. We will revise the adaptivity index definition in §3 to employ either a theoretically derived baseline or a fixed, minimally parameterized reference configuration. The revised text will also include a short discussion of practical computation methods and a sensitivity analysis with respect to baseline choice, thereby isolating the contribution of absorbed hyperparameters.

revision: yes

Referee: [§5] (three pathways and stability analysis): The pathways (data- and task-aware configuration, structural/evolutionary morphing, in-training self-adaptation) are presented as routes to parametric minimality, yet no derivations, convergence proofs, or error bounds are supplied showing that any pathway actually reaches zero human-specified hyperparameters while preserving the claimed stability and governance properties.

Authors: We acknowledge that §5 currently presents the pathways at a conceptual level without new formal derivations or convergence proofs. As the manuscript is a perspective piece that organizes existing literature under the parametric-minimality principle, we will expand the section to cite relevant theoretical guarantees from the meta-learning and evolutionary-computation literature. Full error bounds for every pathway lie beyond the scope of this work and would require a separate technical paper; the revision will therefore focus on existing results and their implications for stability and governance.

revision: partial

Circularity Check

0 steps flagged

No significant circularity detected; proposal is self-contained conceptual reframing

The paper proposes an operational definition of Artificial Adaptive Intelligence (AAI) as systems requiring no human-specified tunable hyperparameters while maintaining competitive performance, quantified via an adaptivity index that combines absorbed hyperparameter fraction with performance ratio to a task-specialized baseline. This is presented as a definitional and organizational framework rather than a derivation chain in which outputs reduce to inputs by construction. The principle of parametric minimality is explicitly grounded in the external minimum description length framework, not derived from the paper's own fitted quantities or assumptions. The three pathways to minimality organize existing literature (meta-learning, NAS, AutoML, etc.) without load-bearing self-citations, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation. No equations or predictions are shown to be statistically forced or equivalent to the inputs; the central thesis is a conceptual claim about field convergence and measurement change, not a closed loop. The argument remains self-contained against external benchmarks such as MDL and the cited research areas.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The proposal rests on the unproven convergence of listed subfields and on the minimum description length principle as the justification for data-determined hyperparameters. The adaptivity index introduces weighting choices between hyperparameter fraction and performance ratio that are not derived from first principles.

free parameters (1)

weighting between hyperparameter fraction and performance ratio in adaptivity index
The index combines two quantities; any specific weighting or normalization is chosen rather than derived.
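To make the consequence of that free choice concrete, the sketch below uses a hypothetical weighted geometric mean, f^w · r^(1-w), where f is the absorbed-hyperparameter fraction, r the performance ratio, and w the weight. This functional form, the function names, and the two example systems are assumptions for illustration, not values or definitions from the paper.

```python
def weighted_index(absorbed_fraction: float, performance_ratio: float, weight: float) -> float:
    """Assumed form: weighted geometric mean of the two ingredients."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if not 0.0 <= absorbed_fraction <= 1.0:
        raise ValueError("absorbed_fraction must lie in [0, 1]")
    if performance_ratio < 0.0:
        raise ValueError("performance_ratio must be non-negative")
    return absorbed_fraction ** weight * performance_ratio ** (1.0 - weight)


# Hypothetical systems: (absorbed fraction, performance ratio).
systems = {
    "A": (0.9, 0.6),   # absorbs most knobs, but loses accuracy
    "B": (0.5, 0.95),  # absorbs fewer knobs, nearly matches the baseline
}


def rank(weight: float) -> list:
    """Order the system names from highest to lowest index at this weight."""
    return sorted(
        systems,
        key=lambda name: weighted_index(*systems[name], weight),
        reverse=True,
    )


for weight in (0.5, 0.1):
    print(f"weight={weight}: ranking {rank(weight)}")
```

With equal weighting system A ranks first; when performance is weighted heavily (w = 0.1) the order reverses, so any comparison of systems by such an index should report the weighting used.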

axioms (1)

domain assumption · Minimum description length framework determines the appropriate hyperparameter count
Invoked to ground the principle of parametric minimality.
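A small sketch may help show what "data-determined" could mean here. It selects the number of constant segments in a noisy series by minimizing a crude two-part description length, (n/2)·log(RSS/n) + (k/2)·log(n), which is the familiar BIC-style approximation rather than anything taken from the paper. The synthetic data, the equal-width segment family, and every function name are hypothetical.

```python
import math
import random


def make_series(n=300, seed=0):
    """Hypothetical data: three constant levels plus Gaussian noise."""
    rng = random.Random(seed)
    levels = [0.0, 5.0, 1.0]
    return [levels[i * 3 // n] + rng.gauss(0.0, 0.5) for i in range(n)]


def residual_sum_of_squares(y, k):
    """Fit k equal-width constant segments and return the squared error."""
    n = len(y)
    segments = [[] for _ in range(k)]
    for i, value in enumerate(y):
        segments[i * k // n].append(value)
    rss = 0.0
    for seg in segments:
        if seg:
            mean = sum(seg) / len(seg)
            rss += sum((v - mean) ** 2 for v in seg)
    return rss


def description_length(y, k):
    """Two-part code length in nats: cost of residuals plus cost of k parameters."""
    n = len(y)
    rss = max(residual_sum_of_squares(y, k), 1e-12)  # guard against log(0)
    data_cost = 0.5 * n * math.log(rss / n)
    model_cost = 0.5 * k * math.log(n)
    return data_cost + model_cost


def select_segment_count(y, candidates):
    """Return the candidate k with the shortest total description."""
    return min(candidates, key=lambda k: description_length(y, k))


series = make_series()
best_k = select_segment_count(series, range(1, 13))
print("selected number of segments:", best_k)
```

No segment count is supplied by hand: too few segments leave large residuals, too many pay a parameter cost the data do not repay, and the minimum sits where the two balance. The candidate range and the model family are still chosen by the designer, which is the kind of residual human choice the referee report highlights.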

invented entities (1)

Artificial Adaptive Intelligence (AAI) · no independent evidence
purpose: To designate the intermediate regime of systems that absorb hyperparameter specification
Newly coined term without independent empirical validation outside the paper's argument.

pith-pipeline@v0.9.0 · 5801 in / 1510 out tokens · 57763 ms · 2026-05-19T21:08:57.800164+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · echoes
echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
Paper passage: "A system exhibits Artificial Adaptive Intelligence to the extent that it requires no human-specified tunable hyperparameters while maintaining competitive performance... Principle of Parametric Minimality... grounded in the minimum description length framework, showing that the appropriate hyperparameter count is data-determined rather than designer-determined."

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · refines
refines: Relation between the paper passage and the cited Recognition theorem.
Paper passage: "The adaptivity index... combines the fraction of hyperparameters absorbed by the system with the performance ratio against a task-specialized baseline... three pathways to minimality"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages · 12 internal anchors

[1] Shawn Beaulieu, Lapo Frati, Thomas Miconi, Joel Lehman, Kenneth O Stanley, Jeff Clune, and Nick Cheney. Learning to continually learn. arXiv preprint arXiv:2002.09571.

[2] Yoshua Bengio, Nicholas Léonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432.

[3] Michael M Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478.

[4] Jeff Clune. AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence. arXiv preprint arXiv:1905.10985.

[5] Alexandre Défossez, Léon Bottou, Francis Bach, and Nicolas Usunier. A simple convergence proof of Adam and AdaGrad. arXiv preprint arXiv:2003.02395, 2020.

[6] Finale Doshi-Velez and Been Kim. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

[7] Nikolaus Hansen. The CMA evolution strategy: A tutorial. arXiv preprint arXiv:1604.00772.

[8] James Kennedy and Russell Eberhart. Particle swarm optimization. In ICNN, volume 4, pages 1942–1948.

[9] Boris Kriuk. Advancing Eurasia fire understanding through machine learning techniques. arXiv preprint arXiv:2502.17023, 2025a. Boris Kriuk. MorphBoost: Self-organizing universal gradient boosting with adaptive tree morphing. arXiv preprint arXiv:2511.13234, 2025b. Boris Kriuk. AlphaJet: Automated conceptual aircraft synthesis via disentangled generative ...

[10] Boris Kriuk and Fedor Kriuk. ORCA – online regime correlation analyzer. arXiv preprint arXiv:2604.17251, 2026a. Boris Kriuk and Fedor Kriuk. PSTNet: Physically-structured turbulence network. arXiv preprint arXiv:2603.07957, 2026b. Boris Kriuk and Logic Ng. Q-KVComm: Efficient multi-agent communication via adaptive KV cache compression. In 2026 Second Interna...

[11] Sam McCandlish, Jared Kaplan, Dario Amodei, and OpenAI Dota Team. An empirical model of large-batch training. arXiv preprint arXiv:1812.06162.

[12] Jean-Baptiste Mouret and Jeff Clune. Illuminating search spaces by mapping elites. arXiv preprint arXiv:1504.04909.

[13] Alex Nichol, Joshua Achiam, and John Schulman. On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999.

[14] David Patterson, Joseph Gonzalez, Quoc Le, Chen Liang, Lluis-Miquel Munguia, Daniel Rothchild, David So, Maud Texier, and Jeff Dean. Carbon emissions and large neural network training. arXiv preprint arXiv:2104.10350.

[15] Andrei A Rusu, Neil C Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, and Raia Hadsell. Progressive neural networks. arXiv preprint arXiv:1606.04671.

[16] Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864.

[17] Joaquin Vanschoren. Meta-learning: A survey. arXiv preprint arXiv:1810.03548.

[18] Wei Wen, Feng Yan, Yiran Chen, and Hai Li. Neural network growth via greedy initialization. arXiv preprint arXiv:2007.05205.

[19] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, and Yoshua Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, pages 2048–2057.
