A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion
Pith reviewed 2026-05-23 05:36 UTC · model grok-4.3
The pith
Dynamic neural networks adapt computations to each input's complexity instead of using fixed structures for all inputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Dynamic Neural Networks make it possible to condition the amount of computation on the specific input. The current literature on the topic is extensive and fragmented. We present a comprehensive survey that synthesizes and unifies existing Dynamic Neural Networks research in the context of Computer Vision. Additionally, we provide a logical taxonomy based on which component of the network is adaptive: the output, the computation graph, or the input. Furthermore, we argue that Dynamic Neural Networks are particularly beneficial in the context of Sensor Fusion for better adaptivity, noise reduction, and information prioritization. We present preliminary works in this direction.
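To make the idea of input-conditioned computation concrete, here is a minimal sketch of early exiting, one technique that appears among the works the survey leans on. It is an illustration under stated assumptions, not the survey's own method: the names `early_exit_predict`, `stages`, `heads`, and `threshold` are hypothetical, and plain Python callables stand in for network blocks and classifier heads.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(x, stages, heads, threshold=0.9):
    """Run stages one at a time and stop as soon as a head is confident.

    stages: callables standing in for backbone blocks (hypothetical).
    heads:  callables mapping features to class logits, one per stage.
    Returns (predicted_class, stages_used).
    """
    if not stages or len(stages) != len(heads):
        raise ValueError("need one head per stage and at least one stage")
    features = x
    for depth, (stage, head) in enumerate(zip(stages, heads), start=1):
        features = stage(features)          # pay for this block only if reached
        probs = softmax(head(features))
        confidence = max(probs)
        # Exit when confident enough, or when no deeper stage remains.
        if confidence >= threshold or depth == len(stages):
            return probs.index(confidence), depth


# Toy usage: each stage doubles the features, each head is the identity.
double = lambda h: [2.0 * v for v in h]
identity = lambda h: h
toy_stages = [double, double, double]
toy_heads = [identity, identity, identity]
easy = early_exit_predict([3.0, 0.0], toy_stages, toy_heads)
hard = early_exit_predict([0.1, 0.0], toy_stages, toy_heads)
```

In this toy setup the "easy" input clears the confidence threshold after one stage, while the "hard" input runs through all three, which is the behaviour the core claim describes: the amount of work depends on the input.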
What carries the argument
Taxonomy that classifies dynamic networks by the adaptive component: output, computation graph, or input
If this is right
- Static compression methods ignore that different inputs need different amounts of work.
- In sensor fusion, adaptive computation can down-weight noisy channels and emphasize reliable ones.
- A shared taxonomy reduces duplication of effort across vision and fusion papers.
- The supplied repository lowers the barrier to reproducing and extending existing dynamic methods.
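The sensor-fusion point in the list above can be illustrated with a small sketch of input-dependent channel weighting. It assumes each sensor arrives with a scalar noise estimate and uses a hand-written softmax over negative noise as the gate; in a dynamic network this gate would more plausibly be learned. The names `fusion_weights`, `fuse`, and `temperature` are hypothetical and do not come from the paper.

```python
import math


def fusion_weights(noise_estimates, temperature=1.0):
    """Softmax over negative noise: a noisier channel gets a smaller weight."""
    if not noise_estimates:
        raise ValueError("need at least one sensor")
    scores = [-n / temperature for n in noise_estimates]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(features, noise_estimates, temperature=1.0):
    """Weighted sum of per-sensor feature vectors of equal length."""
    if len(features) != len(noise_estimates):
        raise ValueError("need one noise estimate per sensor")
    weights = fusion_weights(noise_estimates, temperature)
    dim = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(dim)
    ]


# Toy usage: a clean camera feature and a noisy inertial feature (made-up values).
camera = [1.0, 1.0]
inertial = [3.0, 3.0]
weights = fusion_weights([0.1, 5.0])   # inertial channel is much noisier
fused = fuse([camera, inertial], [0.1, 5.0])
```

Because the weights are recomputed per input, the same model can lean on one sensor in one frame and on another in the next, which is the kind of prioritization the bullet refers to.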
Where Pith is reading between the lines
- The same adaptive-component taxonomy could be tested on sequential decision tasks outside vision.
- Measuring average FLOPs saved on embedded hardware would make the fusion benefit concrete.
- Dynamic prioritization might interact with uncertainty estimation techniques already used in robotics.
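The FLOPs suggestion in the list above amounts to a simple expectation: weight the cumulative cost of reaching each exit by the fraction of inputs that leave there, then compare against the cost of the full model. The sketch below assumes a multi-exit model; the function names and the numbers in the usage example are illustrative assumptions, not measurements from the paper.

```python
def average_flops(exit_fractions, cumulative_flops):
    """Expected per-input cost for a multi-exit model.

    exit_fractions:   share of inputs leaving at each exit (must sum to 1).
    cumulative_flops: cost of running the model up to each exit.
    """
    if len(exit_fractions) != len(cumulative_flops) or not exit_fractions:
        raise ValueError("need one cost per exit and at least one exit")
    if abs(sum(exit_fractions) - 1.0) > 1e-9:
        raise ValueError("exit fractions must sum to 1")
    return sum(p * c for p, c in zip(exit_fractions, cumulative_flops))


def flops_saved(exit_fractions, cumulative_flops):
    """Fraction of compute saved relative to always running the full model."""
    full_cost = cumulative_flops[-1]
    return 1.0 - average_flops(exit_fractions, cumulative_flops) / full_cost


# Made-up example: 50% of inputs exit early, 30% mid-way, 20% run to the end.
fractions = [0.5, 0.3, 0.2]
costs_gflops = [1.0, 2.0, 4.0]
mean_cost = average_flops(fractions, costs_gflops)
saving = flops_saved(fractions, costs_gflops)
```

On embedded hardware, FLOPs are only a proxy for latency and energy, so a measurement along these lines would need to be paired with on-device timing to be convincing.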
Load-bearing premise
The existing literature is fragmented enough that a taxonomy based on adaptive components will make the field easier to navigate and apply.
What would settle it
A controlled comparison in which models using the proposed taxonomy show no measurable gains in adaptivity, noise handling, or prioritization during multi-modal sensor fusion tasks.
Original abstract
Model compression is essential in the deployment of large Computer Vision models on embedded devices. However, static optimization techniques (e.g. pruning, quantization, etc.) neglect the fact that different inputs have different complexities, thus requiring different amount of computations. Dynamic Neural Networks allow to condition the number of computations to the specific input. The current literature on the topic is very extensive and fragmented. We present a comprehensive survey that synthesizes and unifies existing Dynamic Neural Networks research in the context of Computer Vision. Additionally, we provide a logical taxonomy based on which component of the network is adaptive: the output, the computation graph or the input. Furthermore, we argue that Dynamic Neural Networks are particularly beneficial in the context of Sensor Fusion for better adaptivity, noise reduction and information prioritization. We present preliminary works in this direction. We complement this survey with a curated repository listing all the surveyed papers, each with a brief summary of the solution and the code base when available: https://github.com/DTU-PAS/awesome-dynn-for-cv .
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript surveys dynamic neural networks (DyNNs) in computer vision, presenting a taxonomy that classifies methods according to the adaptive component (output, computation graph, or input). It argues that DyNNs offer particular advantages for multi-modal sensor fusion through improved adaptivity, noise reduction, and information prioritization, supports this with preliminary works, and provides a curated GitHub repository of surveyed papers with summaries and code links where available.
Significance. If the taxonomy proves comprehensive and the literature synthesis accurate, the survey would usefully consolidate an extensive and fragmented body of work while highlighting an underexplored application area in sensor fusion. The inclusion of a public repository with code availability is a concrete strength that enhances the practical utility of the contribution.
major comments (2)
- [Sensor Fusion discussion (likely §5 or equivalent)] The central argument that DyNNs are 'particularly beneficial' for sensor fusion (adaptivity, noise reduction, prioritization) rests on preliminary works; the manuscript should explicitly map each claimed benefit to specific cited methods or results in the sensor-fusion section to demonstrate that the benefits are evidenced rather than extrapolated.
- [Taxonomy definition (likely §3)] The taxonomy is presented as 'logical' and based on adaptive components, but without an explicit statement of the decision criteria used to assign papers to the three categories (output / graph / input), it is difficult to assess whether the taxonomy is exhaustive or whether borderline methods are handled consistently.
minor comments (2)
- [Abstract and repository description] The abstract states that the repository 'lists all the surveyed papers'; the manuscript should include a brief description of the search strategy, inclusion criteria, and cut-off date used to compile the list so readers can judge completeness.
- [Taxonomy overview] The figure or table summarizing the taxonomy would benefit from explicit counts or percentages of papers in each adaptive-component category, giving a quantitative sense of how the literature is distributed.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and constructive feedback. We address both major comments below and will revise the manuscript accordingly to strengthen the sensor-fusion discussion and clarify the taxonomy.
- Referee: [Sensor Fusion discussion (likely §5 or equivalent)] The central argument that DyNNs are 'particularly beneficial' for sensor fusion (adaptivity, noise reduction, prioritization) rests on preliminary works; the manuscript should explicitly map each claimed benefit to specific cited methods or results in the sensor-fusion section to demonstrate that the benefits are evidenced rather than extrapolated.
  Authors: We agree that explicit mappings will make the claims more rigorous. In the revision we will insert a structured table (or bulleted mapping) in the sensor-fusion section that directly links each of the three claimed benefits to the specific preliminary works cited, quoting the relevant results or mechanisms from those papers.
  Revision: yes
- Referee: [Taxonomy definition (likely §3)] The taxonomy is presented as 'logical' and based on adaptive components, but without an explicit statement of the decision criteria used to assign papers to the three categories (output / graph / input), it is difficult to assess whether the taxonomy is exhaustive or whether borderline methods are handled consistently.
  Authors: We will add an explicit paragraph (or short subsection) at the beginning of §3 that states the decision criteria used to assign a method to one of the three categories. The criteria will be defined in terms of the primary adaptive component, with examples of how borderline cases (e.g., methods that adapt both output and graph) are classified to ensure consistency and transparency.
  Revision: yes
Circularity Check
No significant circularity; survey is organizational
This is a survey paper synthesizing existing Dynamic Neural Networks literature for CV and arguing benefits for sensor fusion via a three-way taxonomy on adaptive components. No derivations, equations, fitted parameters, or predictions are present that could reduce to inputs by construction. The central argument rests on coverage and synthesis of prior work rather than any self-referential step, self-citation chain, or ansatz. No load-bearing claim reduces to a fit or definition within the paper itself.
Forward citations
Cited by 1 Pith paper
- An Algorithm for On-Sensor Agnostic Detection of Changes in Human Activity for Ultra-Low-Power Applications
  A non-parametric change-detection gate based on dynamic template matching reduces HAR computational load by over 67% with 97-98% sensitivity on two public datasets while requiring only brief device calibration.