Hippocampus-DETR: An Explicit Memory Object Detection Framework Based on Hippocampus Modeling

Bo Liang; Bo Ma; Hao Xu; Zepeng Yang; Zhaoning Shi

arxiv: 2606.27831 · v1 · pith:CGYPFDLAnew · submitted 2026-06-26 · 💻 cs.CV · cs.AI


Pith reviewed 2026-06-29 04:43 UTC · model grok-4.3


keywords object detection · hippocampus modeling · DETR · memory module · pattern separation · few-shot learning · neurocognitive integration


The pith

Hippocampus-DETR adds an explicit memory module modeled on hippocampal subregions to the DETR detector, claiming improved accuracy and generalization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tries to fix the absence of built-in memory in modern object detectors by copying the organization of the hippocampus. It inserts a module called HipNet into DETR that mirrors the sequence of entorhinal cortex, dentate gyrus, CA3, CA1, and subiculum. The module handles pattern separation, completion, filtering, and integration of visual features through layer-wise training. If the approach works, detectors could become more accurate, require less data, and transfer better to related vision tasks such as few-shot classification and image restoration.
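The layer-wise training mentioned above can be pictured as a schedule in which one submodule is optimized at a time. The sketch below is illustrative only: the stage names follow the subregion order given in the text, but the freezing policy (exactly one trainable stage per phase, all others frozen) and the function name `layerwise_schedule` are assumptions, since the abstract does not document the actual procedure.

```python
from typing import Dict, List

# Stage names follow the subregion order named in the text.
# The freezing policy below is an assumption, not the paper's documented procedure.
STAGES: List[str] = ["entorhinal_cortex", "dentate_gyrus", "CA3", "CA1", "subiculum"]


def layerwise_schedule(stages: List[str]) -> List[Dict[str, bool]]:
    """Return one training phase per stage.

    Each phase maps a stage name to a trainable flag. In phase i only
    stage i is trainable; every other stage is frozen (hypothetical policy).
    """
    phases = []
    for active in stages:
        phases.append({name: (name == active) for name in stages})
    return phases


if __name__ == "__main__":
    for index, phase in enumerate(layerwise_schedule(STAGES), start=1):
        trainable = [name for name, flag in phase.items() if flag]
        print(f"phase {index}: train {trainable[0]}, freeze the rest")
```

In a real framework the boolean flags would translate into enabling or disabling gradient updates per submodule. Whether earlier stages stay frozen or keep fine-tuning in later phases is a design choice that would need to be checked against the released code.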

Core claim

The central claim is that integrating a hippocampal memory network module, HipNet, into the DETR architecture yields higher detection accuracy, better generalization, and better data efficiency across tasks. HipNet simulates the anatomical structure and functional organization of the hippocampal subregions (entorhinal cortex, dentate gyrus, CA3, CA1, and subiculum) in order to realize pattern separation, pattern completion, importance filtering, and information integration of visual encoding features.

What carries the argument

HipNet, which simulates hippocampal subregions to enable pattern separation, completion, importance filtering, and information integration.

If this is right

Higher detection accuracy than current mainstream models.
Excellent generalization ability and data efficiency in few-shot image classification.
Improved performance in multimodal feature construction and image restoration.
Validation of the functional necessity and internal interpretability of each memory submodule through experiments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Adding comparable explicit memory structures to other detection or vision architectures could yield similar gains in efficiency.
The biological modeling might offer a route to more interpretable AI systems by aligning internal modules with known brain functions.
Layer-wise optimization of submodules could be tested as a general strategy for training complex memory-augmented networks.
Success here suggests that other cognitive functions modeled from neuroscience could be integrated into deep learning pipelines for robustness.

Load-bearing premise

That building an artificial network to replicate the specific subregions and functions of the hippocampus will deliver better pattern handling and higher performance in object detection and related tasks.

What would settle it

Running the same experiments with the HipNet module removed or with its subregions ablated and finding no drop in accuracy or generalization would falsify the necessity of the hippocampal simulation.
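The falsification test above reduces to a simple comparison once scores are available. The following is a minimal sketch, assuming a single scalar metric per run and a user-chosen noise margin. The names `necessity_report` and `necessity_falsified` are hypothetical, and the numbers in the usage block are placeholders, not results from the paper.

```python
from typing import Dict


def necessity_report(
    full_score: float,
    ablated_scores: Dict[str, float],
    noise_margin: float,
) -> Dict[str, bool]:
    """Map each ablation name to True when removing it costs more than noise_margin."""
    if noise_margin < 0:
        raise ValueError("noise_margin must be non-negative")
    return {
        name: (full_score - score) > noise_margin
        for name, score in ablated_scores.items()
    }


def necessity_falsified(report: Dict[str, bool]) -> bool:
    """True when no ablation produced a drop larger than the noise margin."""
    return not any(report.values())


if __name__ == "__main__":
    # Placeholder numbers for illustration only; they are NOT results from the paper.
    report = necessity_report(
        full_score=0.50,
        ablated_scores={"no_hipnet": 0.50, "no_CA3": 0.49},
        noise_margin=0.02,
    )
    print(report, "falsified:", necessity_falsified(report))
```

The noise margin matters here: it should come from run-to-run variance (for example, several seeds of the full model), otherwise small drops cannot be distinguished from training noise.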

Original abstract

This paper addresses the lack of explicit memory mechanisms in current object detection models and proposes Hippocampus-DETR, a novel detection framework based on biological hippocampal memory modeling. This framework integrates a hippocampal memory network module, HipNet, into the DETR architecture and systematically simulates the anatomical structure and functional organization of hippocampal subregions, including the entorhinal cortex, dentate gyrus, CA3, CA1, and subiculum. Through this design, Hippocampus-DETR realizes pattern separation, pattern completion, importance filtering, and information integration of visual encoding features. During training, different memory submodules are optimized using a layer-wise training strategy, ultimately forming a memory system with memory retrieval and completion capabilities. Experimental results demonstrate that Hippocampus-DETR achieves higher detection accuracy than current mainstream models. More importantly, models equipped with this framework also exhibit excellent generalization ability and data efficiency in tasks such as few-shot image classification, multimodal feature construction, and image restoration. Subsequent experiments further validate the functional necessity and internal interpretability of each memory submodule. This study not only provides a novel object detection framework, but also offers a feasible technical pathway for integrating neurocognitive mechanisms with deep learning models, highlighting its significant value in improving model learning efficiency and task robustness. The project is available at https://github.com/2186cloud/hipnet.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Hippocampus-DETR adds a staged memory module to DETR with hippocampal subregion labels, but the reported gains are not shown to require that specific mapping.


The main thing to know is that this paper inserts HipNet into DETR, maps entorhinal cortex, dentate gyrus, CA3, CA1, and subiculum onto separate memory stages for pattern separation and completion, trains them layer-wise, and claims higher detection accuracy plus better results on few-shot classification, multimodal features, and image restoration.

The concrete contribution is the explicit functional partitioning of the memory system and the decision to optimize submodules in sequence rather than end-to-end. That produces a detector with retrieval and completion behavior that standard DETR attention does not have. Testing the same framework on detection plus transfer tasks is also a reasonable scope.

The experiments appear to include checks on submodule necessity and some interpretability, which is better than many architecture papers that stop at headline numbers. The GitHub release is a practical plus.

The soft spot is the one the stress-test flags. Nothing described isolates whether the anatomical labels and wiring are responsible for the lifts or whether any memory bank with comparable capacity and the same staged training would perform similarly. If the ablations only remove modules or change training order without a matched non-biological control, the central claim that hippocampal modeling drives the improvement stays under-supported. The abstract performance statements also lack the actual deltas and baseline details needed to judge effect size.

This is for people already working on memory-augmented detectors or bio-inspired CV. A reader looking for new modular tricks might extract the staged memory design even if the neuroscience framing is loose. Someone expecting rigorous evidence that the subregion structure itself matters would want stronger controls.

The paper is coherent enough on its own terms to go to review. The architecture is described clearly and the multi-task evaluation is broader than typical DETR variants.

Recommendation: send it to referees so they can examine the ablation tables and the size of the reported gains.

Referee Report

2 major / 1 minor

Summary. The paper proposes Hippocampus-DETR, an extension of the DETR object detection model that incorporates a HipNet module explicitly modeling the anatomical structure and functions of hippocampal subregions (entorhinal cortex, dentate gyrus, CA3, CA1, subiculum). This is intended to enable pattern separation, pattern completion, importance filtering, and information integration of visual features via a layer-wise training strategy for the memory submodules. The central claims are higher detection accuracy than mainstream models plus improved generalization and data efficiency on few-shot image classification, multimodal feature construction, and image restoration, with additional experiments validating the functional necessity and interpretability of each submodule. The code is released at https://github.com/2186cloud/hipnet.

Significance. If the performance and generalization gains are shown to stem specifically from the hippocampal subregion partitioning rather than added capacity or the layer-wise schedule, the work would supply a concrete, reproducible example of embedding neurocognitive memory mechanisms into a detection architecture. The open-sourced implementation is a clear strength for reproducibility. The approach illustrates one feasible route for bio-inspired design in CV, which could inform data-efficient models if the biological fidelity proves causal.

major comments (2)

[Experiments] Experiments / ablation studies: The validation that each memory submodule is functionally necessary does not include the critical control of a non-anatomically partitioned memory bank (e.g., a single unified memory module) possessing equivalent total capacity and trained under the identical layer-wise schedule. Without this comparison, it remains possible that reported accuracy and transfer gains arise from extra parameters or the training procedure rather than the specific pattern-separation / completion roles assigned to dentate gyrus, CA3, etc. This directly bears on the central claim that the hippocampal modeling itself produces the observed improvements.
[Method / HipNet] HipNet architecture description: The mapping of biological subregions to concrete network operations is presented as an engineering analogy. No quantitative diagnostics (e.g., measured pattern-separation ratios on held-out feature sets or completion error curves) are supplied to confirm that the implemented modules actually perform the claimed biological functions at a level distinguishable from generic memory operations. This weakens the interpretability and necessity arguments.
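One way to make the pattern-separation diagnostic requested above concrete is a ratio of mean pairwise output distance to mean pairwise input distance. The sketch below uses cosine distance and plain Python lists. This definition is one possible choice rather than a standard the paper commits to, and `top_k_sparsify` is a hypothetical stand-in for a dentate-gyrus-like stage, not HipNet's actual operation.

```python
import math
from itertools import combinations
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def separation_ratio(inputs: List[Vector], module: Callable[[Vector], Vector]) -> float:
    """Mean pairwise output distance divided by mean pairwise input distance.

    A ratio above 1 means the module pushes the feature vectors apart.
    """
    if len(inputs) < 2:
        raise ValueError("need at least two input vectors")
    outputs = [module(x) for x in inputs]
    pairs = list(combinations(range(len(inputs)), 2))
    d_in = sum(cosine_distance(inputs[i], inputs[j]) for i, j in pairs) / len(pairs)
    d_out = sum(cosine_distance(outputs[i], outputs[j]) for i, j in pairs) / len(pairs)
    if d_in == 0.0:
        raise ValueError("inputs are all collinear; ratio is undefined")
    return d_out / d_in


def top_k_sparsify(k: int) -> Callable[[Vector], List[float]]:
    """Hypothetical stand-in for a separation stage: keep the k largest entries."""
    def module(x: Vector) -> List[float]:
        keep = set(sorted(range(len(x)), key=lambda i: x[i], reverse=True)[:k])
        return [v if i in keep else 0.0 for i, v in enumerate(x)]
    return module
```

Reporting this ratio for the dentate-gyrus-labelled stage next to the same ratio for a generic memory layer of equal size, on held-out features, would show whether the labelled stage separates patterns more than a generic one does.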

minor comments (1)

[Abstract] Abstract: The statement that the model 'achieves higher detection accuracy than current mainstream models' should be accompanied by the primary dataset(s) and the absolute or relative improvement magnitude for immediate context.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. These points help clarify the evidence needed to support our central claims about the benefits of explicit hippocampal subregion modeling. We address each major comment below and indicate where revisions will be made.


Referee: [Experiments] Experiments / ablation studies: The validation that each memory submodule is functionally necessary does not include the critical control of a non-anatomically partitioned memory bank (e.g., a single unified memory module) possessing equivalent total capacity and trained under the identical layer-wise schedule. Without this comparison, it remains possible that reported accuracy and transfer gains arise from extra parameters or the training procedure rather than the specific pattern-separation / completion roles assigned to dentate gyrus, CA3, etc. This directly bears on the central claim that the hippocampal modeling itself produces the observed improvements.

Authors: We agree that a control experiment using a single unified memory module with equivalent total capacity and the identical layer-wise training schedule is necessary to more rigorously isolate the contribution of the anatomical partitioning and assigned functional roles. Our existing ablations demonstrate necessity by selectively removing or altering individual submodules, but they do not directly compare against a non-partitioned equivalent. We will add this control experiment to the revised manuscript, reporting detection accuracy, generalization, and transfer results for the unified baseline alongside the HipNet version.
revision: yes

Referee: [Method / HipNet] HipNet architecture description: The mapping of biological subregions to concrete network operations is presented as an engineering analogy. No quantitative diagnostics (e.g., measured pattern-separation ratios on held-out feature sets or completion error curves) are supplied to confirm that the implemented modules actually perform the claimed biological functions at a level distinguishable from generic memory operations. This weakens the interpretability and necessity arguments.

Authors: The HipNet design is explicitly described as an engineering analogy that maps hippocampal subregions to operations intended to realize pattern separation, completion, filtering, and integration. Necessity and interpretability are currently evidenced by the targeted ablation studies showing performance degradation when specific submodules are removed. We acknowledge that additional quantitative diagnostics, such as pattern-separation ratios computed on held-out feature sets or completion error curves, would provide stronger confirmation that the modules achieve these functions in a manner distinguishable from generic memory banks. We will incorporate such metrics in the revision.
revision: yes
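The capacity-matched control promised in the first response depends on sizing the unified module so that its parameter count is close to that of the staged one. The arithmetic below is a sketch under stated assumptions: each staged submodule is modelled as a single square linear layer with bias, and the unified control as a two-layer bottleneck. Both shapes, and all function names, are hypothetical, since the actual HipNet dimensions are not given in the abstract.

```python
def staged_params(dim: int, num_stages: int) -> int:
    """Parameters of num_stages square linear layers (weights plus biases)."""
    return num_stages * (dim * dim + dim)


def unified_params(dim: int, hidden: int) -> int:
    """Parameters of one dim -> hidden -> dim block (weights plus biases)."""
    return (dim * hidden + hidden) + (hidden * dim + dim)


def matched_hidden_width(dim: int, num_stages: int) -> int:
    """Hidden width that brings the unified block closest to the staged budget.

    Solves hidden * (2 * dim + 1) + dim = target for hidden and rounds.
    """
    target = staged_params(dim, num_stages)
    return max(1, round((target - dim) / (2 * dim + 1)))


if __name__ == "__main__":
    dim, stages = 256, 5  # illustrative sizes, not taken from the paper
    hidden = matched_hidden_width(dim, stages)
    print("staged :", staged_params(dim, stages))
    print("unified:", unified_params(dim, hidden), "with hidden width", hidden)
```

Matching parameter counts is only part of a fair control. The unified module would also need the same training schedule and comparable compute for the comparison to isolate the effect of the anatomical partitioning.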

Circularity Check

0 steps flagged

No circularity; architecture is explicit engineering design with empirical validation


The paper presents Hippocampus-DETR as an architectural proposal that integrates a custom HipNet module into DETR to simulate hippocampal subregions (entorhinal cortex, dentate gyrus, CA3, CA1, subiculum) for pattern separation and completion. No equations, first-principles derivations, or predictions appear that reduce by construction to fitted inputs or self-citations; the design is described as an explicit modeling choice followed by layer-wise training and experimental evaluation. Performance and generalization claims rest on reported results rather than tautological mappings, and no load-bearing uniqueness theorems or ansatzes imported via self-citation are invoked. The derivation chain is therefore self-contained as an engineering contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Only the abstract is available, so the ledger is inferred from high-level claims. The central design rests on the untested premise that hippocampal subregion functions can be usefully emulated by neural-network modules.

axioms (1)

domain assumption: Simulating the anatomical structure and functional organization of hippocampal subregions will realize pattern separation, pattern completion, importance filtering, and information integration in visual features.
This premise is invoked to justify the HipNet architecture and the layer-wise training strategy.

invented entities (1)

HipNet (no independent evidence)
purpose: Memory network module that emulates hippocampal subregions inside DETR.
New module introduced by the paper; no independent evidence outside the model itself is provided in the abstract.



Reference graph

Works this paper leans on

66 extracted references · 25 canonical work pages · 8 internal anchors

[1]

End-to-end object detection with transformers

Carion N, Massa F, Synnaeve G, et al. End -to-End Object Detection with Transformers[J]. 2020.DOI:10.1007/978-3-030-58452-8_13

work page doi:10.1007/978-3-030-58452-8_13 2020
[2]

Deformable DETR: Deformable Transformers for End-to-End Object Detection

Zhu X, Su W, Lu L, et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection[J]. 2020.DOI:10.48550/arXiv.2010.04159

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2010.04159 2020
[3]

Efficient DETR: Improving End -to-End Object Detector with Dense Prior[J]

Yao Z, Ai J, Li B, et al. Efficient DETR: Improving End -to-End Object Detector with Dense Prior[J]. 2021.DOI:10.48550/arXiv.2104.01318

work page doi:10.48550/arxiv.2104.01318 2021
[4]

Sparse DETR: Efficient End -to-End Object Detection with Learnable Sparsity[J].arXiv e-prints, 2021.DOI:10.48550/arXiv.2111.14330

Roh B, Shin J W, Shin W, et al. Sparse DETR: Efficient End -to-End Object Detection with Learnable Sparsity[J].arXiv e-prints, 2021.DOI:10.48550/arXiv.2111.14330

work page doi:10.48550/arxiv.2111.14330 2021
[5]

DETRs Beat YOLOs on Real-time Object Detection[J].ArXiv, 2023, abs/2304.08069.DOI:10.48550/arXiv.2304.08069

Lv W, Xu S, Zhao Y , et al. DETRs Beat YOLOs on Real-time Object Detection[J].ArXiv, 2023, abs/2304.08069.DOI:10.48550/arXiv.2304.08069

work page doi:10.48550/arxiv.2304.08069 2023
[6]

Emerging Properties in Self-Supervised Vision Transformers

Caron M, Touvron H, Misra I, et al. Emerging Properties in S elf-Supervised Vision Transformers[J]. 2021.DOI:10.48550/arXiv.2104.14294

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2104.14294 2021
[7]

You Only Look Once: Unified, Real-Time Object Detection,

Redmon J, Divvala S, Girshick R, et al. You Only Look Once: Unified, Real -Time Object Detection[J].IEEE, 2016.DOI:10.1109/CVPR.2016.91

work page doi:10.1109/cvpr.2016.91 2016
[8]

YOLOv4: Optimal Speed and Accuracy of Object Detection

Bochkovskiy A, Wang C Y , Liao H Y M. YOLOv4: Optimal Speed and Accuracy of Object Detection[J]. 2020.DOI:10.48550/arXiv.2004.10934

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2004.10934 2020
[9]

Exploring Attention Placement in YOLOv5 for Ship Detection in Infrared Maritime Scenes

Zhu R, Zhang J, Yang D, Zhao D, Chen J, Zhu Z. Exploring Attention Placement in YOLOv5 for Ship Detection in Infrared Maritime Scenes. Technologies. 2025; 13(9):391. https://doi.org/10.3390/technologies13090391

work page doi:10.3390/technologies13090391 2025
[10]

Li, C., Li, L., Jiang, H., Weng, K., Geng, Y ., Li, L., Ke, Z., Li, Q., Cheng, M., Nie, W., Li, Y ., Zhang, B., Liang, Y ., Zhou, L., Xu, X., Chu, X., Wei, X., & Wei, X. (2022). YOLOv6: A Single - Stage Object Detection Framework for Industrial Applications. ArXiv, abs/2209.02976

work page arXiv 2022
[11]

YOLOv7: Trainable bag-of-freebies sets new state- of-the-art for real-time object detectors[J].arXiv e-prints, 2022.DOI:10.48550/arXiv.2207.02696

Wang C Y , Bochkovskiy A, Liao H Y M. YOLOv7: Trainable bag-of-freebies sets new state- of-the-art for real-time object detectors[J].arXiv e-prints, 2022.DOI:10.48550/arXiv.2207.02696

work page doi:10.48550/arxiv.2207.02696 2022
[12]

Ultralytics. (2023). Ultralytics YOLOv8 (Version 8.0.0) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.8046616

work page doi:10.5281/zenodo.8046616 2023
[13]

48550/arXiv.2402.13616

Wang, C. Y ., & Liao, H. Y . M. (2024). YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv preprint arXiv:2402.13616

work page arXiv 2024
[14]

Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real - Time End-to-End Object Detection. In Advances in Neural Information Processing Systems 37 (NeurIPS 2024) (pp. 107984–108011). https://doi.org/10.52202/079017-3429

work page doi:10.52202/079017-3429 2024
[15]

Khanam, R., & Hussain, M. (2024). YOLOv11: An Overview of the Key Architectural Enhancements. arXiv preprint arXiv:2410.17725

work page internal anchor Pith review Pith/arXiv arXiv 2024
[16]

Alif, M. A. R., & Hussain, M. (2025). YOLOv12: A Breakdown of the Key Arc hitectural Features. arXiv preprint arXiv:2502.14740

work page arXiv 2025
[17]

Lei, M., Li, S., Wu, Y ., Hu, H., Zhou, Y ., Zheng, X., Ding, G., Du, S., Wu, Z., & Gao, Y . (2025). *YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception*. arXiv preprint arXiv:2506.17733

work page arXiv 2025
[18]

Liu, S., Li, F., Zhang, H., Yang, X., Qi, X., Su, H., Zhu, J., & Zhang, L. (2022). DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR. In International Conference on Learning Representations (ICLR)

2022
[19]

M., & Zhang, L

Li, F., Zhang, H., Liu, S., Guo, J., Ni, L. M., & Zhang, L. (2022, June). DN-DETR: Accelerate DETR training by introducing query denoising. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 13619-13627)

2022
[20]

(2023, October)

Chen, Q., Chen, X., Wang, J., Zhang, S., Yao, K., Feng, H., Han, J., Ding, E., Zeng, G., & Wang, J. (2023, October). Group DETR: Fast DETR training with group-wise one-to-many assignment. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 66 33- 6642)

2023
[21]

Li, F., Zeng, A., Liu, S., Zhang, H., Li, H., Zhang, L., & Ni, L. M. (2023, June). Lite DETR: An interleaved multi -scale encoder for efficient DETR. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 18558-18567)

2023
[22]

(2021, October)

Meng, D., Chen, X., Fan, Z., Zeng, G., Li, H., Yuan, Y ., Sun, L., & Wang, J. (2021, October). Conditional DETR for fast training convergence. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 3651-3660)

2021
[23]

Wang, Y ., Zhang, X., Yang, T., & Sun, J. (2022). Anchor DETR: Query design for transformer- based detector. In Proceedings of the AAAI Conference on Artificial Intelligence (V ol. 36, No. 3, pp. 2567-2575). https://doi.org/10.1609/aaai.v36i3.20158

work page doi:10.1609/aaai.v36i3.20158 2022
[24]

Jian, Y ., Yu, F., Zhang, Q., Levine, W., Dubbs, B., & Karianakis, N. (2024). Online Learning via Memory: Retrieval-Augmented Detector Adaptation. arXiv preprint arXiv:2409.10716

work page arXiv 2024
[25]

Agro, B., Casas, S., Wang, P., Gilles, T., & Urtasun, R. (2025) . MAD: Memory-Augmented Detection of 3D Objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1449-1460)

2025
[26]

De Monte, R., Dalle Pezze, D., & Susto, G. A. (2025). Teach YOLO to Remember: A Self - Distillation Approach for Continual Object Detection. arXiv preprint arXiv:2503.04688

work page arXiv 2025
[27]

Behrouz, A., Zhong, P., & Mirrokni, V . (2024). Titans: Learning to Memorize at Test Time. arXiv preprint arXiv:2501.00663

work page internal anchor Pith review Pith/arXiv arXiv 2024
[28]

Knowles, W. D. Normal anatomy and neurophysiology of the hippocampal formation. Journal of Clinical Neurophysiology. 9(2), 253-263(1992)

1992
[30]

Kesner, R. P. & Rolls, E. T. A computational theory of hippocampal function, and tests of the theory: new developments. Neuroscience & Biobehavioral Reviews. 48, 92-147(2015)

2015
[31]

Aggleton, J. P. & Christiansen, K. The subiculum: the hear t of the extended hippocampal system. Progress in brain research. 219, 65-82(2015)

2015
[32]

B., Wouterlood, F

Canto, C. B., Wouterlood, F. G. & Witter, M. P. What does the anatomical organization of the entorhinal cortex tell us? Neural plasticity. 2008(1), 381243(2008)

2008
[33]

S., Doan, T

Nilssen, E. S., Doan, T. P., Nigro, M. J., Ohara, S., & Witter, M. P. Neurons and networks in the entorhinal cortex: A reappraisal of the lateral and medial entorhinal subdivisions mediating parallel cortical pathways. Hippocampus. 29(12), 1238-1254(2019)

2019
[34]

& Alonso, A

Tahvildari, B. & Alonso, A. Morphological and electrophysiological properties of lateral entorhinal cortex layers II and III principal neurons. Journal of Comparative Neurology. 491(2), 123-140(2005)

2005
[35]

Sewards, T. V . & Sewards, M. A. Input and output stations of the entorhinal cortex: superficial vs. deep layers or lateral vs. medial divisions? Brain Research Reviews. 42(3), 243-251(2003)

2003
[36]

& Wang, X

Qiu, S., Hu, Y ., Huang, Y ., Gao, T. & Wang, X. et al. Whole -brain spatial organization of hippocampal single-neuron projectomes. Science. 383(6682), eadj9198(2024)

2024
[38]

C., & López, A

Xu, X., Sun, Y ., Holmes, T. C., & López, A. J. Noncanonical connections between the subiculum and hippocampal CA1. Journal of Comparative Neurology. 524(17), 3666-3673 (2016)

2016
[39]

& Insausti, R

Muñoz, M. & Insausti, R. Cortical efferents of the entorhinal cortex and the adjacent parahippocampal region in the monkey (Macaca fascicularis). European Journal of Neuroscience. 22(6), 1368-1388 (2005)

2005
[40]

backprojection

Scharfman, H. E. The CA3 “backprojection” to the dentate gyrus. Progress in brain research. 163, 627-637(2007)

2007
[41]

S., Doan , T

Nilssen, E. S., Doan , T. P., Nigro, M. J., Ohara, S., & Witter, M. P. Neurons and networks in the entorhinal cortex: A reappraisal of the lateral and medial entorhinal subdivisions mediating parallel cortical pathways. Hippocampus. 29(12), 1238-1254(2019)

2019
[42]

Guillery, R. W. Brodmann's 'Localisation in the Cerebral Cortex' (transl. and ed. by L. J. Garey). J. Anat. 196, 493–496 (2000)

2000
[43]

K., Leutgeb, S., Moser, M

Leutgeb, J. K., Leutgeb, S., Moser, M. B., & Moser, E. I. Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science. 315(5814), 961-966(2007)

2007
[44]

& Kuijf, H

Berron, D., Schütze, H., Maass, A., Cardenas-Blanco, A. & Kuijf, H. J. et al. Strong evidence for pattern separation in human dentate gyrus. Journal of Neuroscience. 36(29), 7569-7579(2016)

2016
[45]

Gold, A. E. & Kesner, R. P. The role of the CA3 subregion of the dorsal hippocampus in spatial pattern completion in the rat. Hippocampus. 15(6), 808-814(2005)

2005
[46]

J., Schlögl, A., Frotscher, M., & Jonas, P

Guzman, S. J., Schlögl, A., Frotscher, M., & Jonas, P. Synaptic mechanisms of pattern completion in the hippocampal CA3 network. Science. 353(6304), 1117-1123(2016)

2016
[47]

& Bartos, M

Hainmueller, T. & Bartos, M. Dentate gyrus circuits for encoding, retrieval and discrimination of episodic memories. Nature Reviews Neuroscience. 21(3), 153-168 (2020)

2020
[48]

& Mäkisara, K

Kohonen, T. & Mäkisara, K. The self-organizing feature maps. Physica Scripta. 39(1), 168(1989)

1989
[49]

J., Bisby, J

Grande, X., Berron, D., Horner, A. J., Bisby, J. A. & Düzel, E. et al. Holistic recollection via pattern completion involves hippocampal subfield CA3. Journal of Neuroscience. 39(41), 81 00- 8111(2019)

2019
[50]

H., Wiskott, L

Azizi, A. H., Wiskott, L. & Cheng, S. A computational model for preplay in the hippocampus. Frontiers in computational neuroscience. 7, 161(2013)

2013
[51]

Hopfield Networks is All You Need

Ramsauer, H., Schäfl, B., Lehner, J., Seidl, P. & Widrich, M. et al. Hopfield networks is all you need. arXiv preprint arXiv:2008.02217(2020)

work page internal anchor Pith review Pith/arXiv arXiv 2008
[52]

& Losonczy, A

Soltesz, I. & Losonczy, A. CA1 pyramidal cell diversity enabling parallel information processing in the hippocampus. Nature neuroscience. 21(4), 484-493(2018)

2018
[53]

A., Witter, M

Naber, P. A., Witter, M. P. & Lopes da Silva, F. H. Networks of the Hippocampal Memory System of the Rat: The Pivotal Role of the Subiculum a. Annals of the New York Academy of Sciences. 911(1), 392-403(2000)

2000
[54]

& Mizuseki, K

Matsumoto, N., Kitanishi, T. & Mizuseki, K. The subiculum: Unique hi ppocampal hub and more. Neuroscience research. 143, 1-12(2019)

2019
[55]

Aggleton, J. P. & Christiansen, K. The subiculum: the heart of the extended hippocampal system. Progress in brain research. 219, 65-82(2015)

2015
[56]

Sanders, D. M. W. & Schacter, D. L. Adap tive Memory Distortions. Interdisciplinary Perspectives and Advances in Understanding Adaptive Memory. 31(2024)

2024
[57]

Microsoft COCO: Common objects in context,

T.-Y . Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan,P. Doll´ar, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” in Eur. Conf. Comput. Vis., 2014, pp. 740–755

2014
[58]

GoldYOLO: Efficient object detector via gather-and-distribute mechanism,

C. Wang, W. He, Y . Nie, J. Guo, C. Liu, Y . Wang, and K. Han, “GoldYOLO: Efficient object detector via gather-and-distribute mechanism,” Adv. Neural Inform. Process. Syst., pp. 51 094–51 112, 2023

2023
[59]

Chen, Q., Su, X., Zhang, X., Wang, J., Chen, J., Shen, Y ., Han, C., Chen, Z., Xu, W., Li, F., Zhang, S., Yao, K., Ding, E., Zhang, G., & Wang, J. (2024). LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection. arXiv. https://doi.org/10.48550/arXiv.2406.03459

work page doi:10.48550/arxiv.2406.03459 2024
[60]

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

Xiao, H., Rasul, K. & V ollgraf, R. Fashion -mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747(2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[61]

& Haffner, P

LeCun, Y ., Bottou, L., Bengio, Y . & Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE. 86(11), 2278-2324(2002)

2002
[62]

& Hinton, G

Krizhevsky, A., Sutskever, I. & Hinton, G. E. Learning multiple layers of features from tiny images. Technical Report (University of Toronto, 2009)

2009
[63]

& Sun, J

He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770-778(2016)

2016
[64]

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D. & Wang, W. et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861(2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017
[65]

& Weinberger, K

Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4700- 4708 (2017)

2017
[66]

Brock, A., De, S., Smith, S. L. & Simonyan, K. High -performance large -scale image recognition without normalization. Proc. Mach. Learn. Res. 139, 1059–1071 (2021)

2021
[67]

E., Kesner, R

Gilbert, P. E., Kesner, R. P. & Lee, I. Dissociating hippocampal subregions: A double dissociation between dentate gyrus and CA1. Hippocampus. 11(6), 626-636 (2001)

2001
[68]

distorted

Bartsch, T., Schönfeld, R., Müller, F. J., Alfke, K. & Leplow, B. et al. Focal lesions of human hippocampal CA1 neurons in transient global amnesia impair place memory. Science. 328(5984), 1412-1415(2010). Supplementary Information EC EC2 performs empty-feature removal and normalization on the features from the sensory input component. Among the features ...

2010

[1] [1]

End-to-end object detection with transformers

Carion N, Massa F, Synnaeve G, et al. End -to-End Object Detection with Transformers[J]. 2020.DOI:10.1007/978-3-030-58452-8_13

work page doi:10.1007/978-3-030-58452-8_13 2020

[2] [2]

Deformable DETR: Deformable Transformers for End-to-End Object Detection

Zhu X, Su W, Lu L, et al. Deformable DETR: Deformable Transformers for End-to-End Object Detection[J]. 2020.DOI:10.48550/arXiv.2010.04159

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2010.04159 2020

[3] [3]

Efficient DETR: Improving End -to-End Object Detector with Dense Prior[J]

Yao Z, Ai J, Li B, et al. Efficient DETR: Improving End -to-End Object Detector with Dense Prior[J]. 2021.DOI:10.48550/arXiv.2104.01318

work page doi:10.48550/arxiv.2104.01318 2021

[4]

Roh B, Shin J W, Shin W, et al. Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity[J]. arXiv e-prints, 2021. DOI:10.48550/arXiv.2111.14330

[5]

Lv W, Xu S, Zhao Y, et al. DETRs Beat YOLOs on Real-time Object Detection[J]. ArXiv, 2023, abs/2304.08069. DOI:10.48550/arXiv.2304.08069

[6]

Caron M, Touvron H, Misra I, et al. Emerging Properties in Self-Supervised Vision Transformers[J]. 2021. DOI:10.48550/arXiv.2104.14294

[7]

Redmon J, Divvala S, Girshick R, et al. You Only Look Once: Unified, Real-Time Object Detection[J]. IEEE, 2016. DOI:10.1109/CVPR.2016.91

[8]

Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4: Optimal Speed and Accuracy of Object Detection[J]. 2020. DOI:10.48550/arXiv.2004.10934

[9]

Zhu R, Zhang J, Yang D, Zhao D, Chen J, Zhu Z. Exploring Attention Placement in YOLOv5 for Ship Detection in Infrared Maritime Scenes. Technologies. 2025; 13(9):391. https://doi.org/10.3390/technologies13090391

[10]

Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., Nie, W., Li, Y., Zhang, B., Liang, Y., Zhou, L., Xu, X., Chu, X., Wei, X., & Wei, X. (2022). YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications. ArXiv, abs/2209.02976

[11]

Wang C Y, Bochkovskiy A, Liao H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[J]. arXiv e-prints, 2022. DOI:10.48550/arXiv.2207.02696

[12]

Ultralytics. (2023). Ultralytics YOLOv8 (Version 8.0.0) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.8046616

[13]

Wang, C. Y., & Liao, H. Y. M. (2024). YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv preprint arXiv:2402.13616

[14]

Wang, A., Chen, H., Liu, L., Chen, K., Lin, Z., Han, J., & Ding, G. (2024). YOLOv10: Real-Time End-to-End Object Detection. In Advances in Neural Information Processing Systems 37 (NeurIPS 2024) (pp. 107984–108011). https://doi.org/10.52202/079017-3429

[15]

Khanam, R., & Hussain, M. (2024). YOLOv11: An Overview of the Key Architectural Enhancements. arXiv preprint arXiv:2410.17725

[16]

Alif, M. A. R., & Hussain, M. (2025). YOLOv12: A Breakdown of the Key Architectural Features. arXiv preprint arXiv:2502.14740

[17]

Lei, M., Li, S., Wu, Y., Hu, H., Zhou, Y., Zheng, X., Ding, G., Du, S., Wu, Z., & Gao, Y. (2025). YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception. arXiv preprint arXiv:2506.17733

[18]

Liu, S., Li, F., Zhang, H., Yang, X., Qi, X., Su, H., Zhu, J., & Zhang, L. (2022). DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR. In International Conference on Learning Representations (ICLR)

[19]

Li, F., Zhang, H., Liu, S., Guo, J., Ni, L. M., & Zhang, L. (2022, June). DN-DETR: Accelerate DETR training by introducing query denoising. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 13619-13627)

[20]

Chen, Q., Chen, X., Wang, J., Zhang, S., Yao, K., Feng, H., Han, J., Ding, E., Zeng, G., & Wang, J. (2023, October). Group DETR: Fast DETR training with group-wise one-to-many assignment. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 6633-6642)

[21]

Li, F., Zeng, A., Liu, S., Zhang, H., Li, H., Zhang, L., & Ni, L. M. (2023, June). Lite DETR: An interleaved multi-scale encoder for efficient DETR. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 18558-18567)

[22]

Meng, D., Chen, X., Fan, Z., Zeng, G., Li, H., Yuan, Y., Sun, L., & Wang, J. (2021, October). Conditional DETR for fast training convergence. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (pp. 3651-3660)

[23]

Wang, Y., Zhang, X., Yang, T., & Sun, J. (2022). Anchor DETR: Query design for transformer-based detector. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 36, No. 3, pp. 2567-2575). https://doi.org/10.1609/aaai.v36i3.20158

[24]

Jian, Y., Yu, F., Zhang, Q., Levine, W., Dubbs, B., & Karianakis, N. (2024). Online Learning via Memory: Retrieval-Augmented Detector Adaptation. arXiv preprint arXiv:2409.10716

[25]

Agro, B., Casas, S., Wang, P., Gilles, T., & Urtasun, R. (2025). MAD: Memory-Augmented Detection of 3D Objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1449-1460)

[26]

De Monte, R., Dalle Pezze, D., & Susto, G. A. (2025). Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection. arXiv preprint arXiv:2503.04688

[27]

Behrouz, A., Zhong, P., & Mirrokni, V. (2024). Titans: Learning to Memorize at Test Time. arXiv preprint arXiv:2501.00663

[28]

Knowles, W. D. Normal anatomy and neurophysiology of the hippocampal formation. Journal of Clinical Neurophysiology. 9(2), 253-263 (1992)

[30]

Kesner, R. P. & Rolls, E. T. A computational theory of hippocampal function, and tests of the theory: new developments. Neuroscience & Biobehavioral Reviews. 48, 92-147 (2015)

[31]

Aggleton, J. P. & Christiansen, K. The subiculum: the heart of the extended hippocampal system. Progress in brain research. 219, 65-82 (2015)

[32]

Canto, C. B., Wouterlood, F. G. & Witter, M. P. What does the anatomical organization of the entorhinal cortex tell us? Neural plasticity. 2008(1), 381243 (2008)

[33]

Nilssen, E. S., Doan, T. P., Nigro, M. J., Ohara, S., & Witter, M. P. Neurons and networks in the entorhinal cortex: A reappraisal of the lateral and medial entorhinal subdivisions mediating parallel cortical pathways. Hippocampus. 29(12), 1238-1254 (2019)

[34]

Tahvildari, B. & Alonso, A. Morphological and electrophysiological properties of lateral entorhinal cortex layers II and III principal neurons. Journal of Comparative Neurology. 491(2), 123-140 (2005)

[35]

Sewards, T. V. & Sewards, M. A. Input and output stations of the entorhinal cortex: superficial vs. deep layers or lateral vs. medial divisions? Brain Research Reviews. 42(3), 243-251 (2003)

[36]

Qiu, S., Hu, Y., Huang, Y., Gao, T. & Wang, X. et al. Whole-brain spatial organization of hippocampal single-neuron projectomes. Science. 383(6682), eadj9198 (2024)

[38]

Xu, X., Sun, Y., Holmes, T. C., & López, A. J. Noncanonical connections between the subiculum and hippocampal CA1. Journal of Comparative Neurology. 524(17), 3666-3673 (2016)

[39]

Muñoz, M. & Insausti, R. Cortical efferents of the entorhinal cortex and the adjacent parahippocampal region in the monkey (Macaca fascicularis). European Journal of Neuroscience. 22(6), 1368-1388 (2005)

[40]

Scharfman, H. E. The CA3 “backprojection” to the dentate gyrus. Progress in brain research. 163, 627-637 (2007)

[41]

Nilssen, E. S., Doan, T. P., Nigro, M. J., Ohara, S., & Witter, M. P. Neurons and networks in the entorhinal cortex: A reappraisal of the lateral and medial entorhinal subdivisions mediating parallel cortical pathways. Hippocampus. 29(12), 1238-1254 (2019)

[42]

Guillery, R. W. Brodmann's 'Localisation in the Cerebral Cortex' (transl. and ed. by L. J. Garey). J. Anat. 196, 493–496 (2000)

[43]

Leutgeb, J. K., Leutgeb, S., Moser, M. B., & Moser, E. I. Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science. 315(5814), 961-966 (2007)

[44]

Berron, D., Schütze, H., Maass, A., Cardenas-Blanco, A. & Kuijf, H. J. et al. Strong evidence for pattern separation in human dentate gyrus. Journal of Neuroscience. 36(29), 7569-7579 (2016)

[45]

Gold, A. E. & Kesner, R. P. The role of the CA3 subregion of the dorsal hippocampus in spatial pattern completion in the rat. Hippocampus. 15(6), 808-814 (2005)

[46]

Guzman, S. J., Schlögl, A., Frotscher, M., & Jonas, P. Synaptic mechanisms of pattern completion in the hippocampal CA3 network. Science. 353(6304), 1117-1123 (2016)

[47]

Hainmueller, T. & Bartos, M. Dentate gyrus circuits for encoding, retrieval and discrimination of episodic memories. Nature Reviews Neuroscience. 21(3), 153-168 (2020)

[48]

Kohonen, T. & Mäkisara, K. The self-organizing feature maps. Physica Scripta. 39(1), 168 (1989)

[49]

Grande, X., Berron, D., Horner, A. J., Bisby, J. A. & Düzel, E. et al. Holistic recollection via pattern completion involves hippocampal subfield CA3. Journal of Neuroscience. 39(41), 8100-8111 (2019)

[50]

Azizi, A. H., Wiskott, L. & Cheng, S. A computational model for preplay in the hippocampus. Frontiers in computational neuroscience. 7, 161 (2013)

[51]

Ramsauer, H., Schäfl, B., Lehner, J., Seidl, P. & Widrich, M. et al. Hopfield networks is all you need. arXiv preprint arXiv:2008.02217 (2020)

[52]

Soltesz, I. & Losonczy, A. CA1 pyramidal cell diversity enabling parallel information processing in the hippocampus. Nature neuroscience. 21(4), 484-493 (2018)

[53]

Naber, P. A., Witter, M. P. & Lopes da Silva, F. H. Networks of the Hippocampal Memory System of the Rat: The Pivotal Role of the Subiculum. Annals of the New York Academy of Sciences. 911(1), 392-403 (2000)

[54]

Matsumoto, N., Kitanishi, T. & Mizuseki, K. The subiculum: Unique hippocampal hub and more. Neuroscience research. 143, 1-12 (2019)

[55]

Aggleton, J. P. & Christiansen, K. The subiculum: the heart of the extended hippocampal system. Progress in brain research. 219, 65-82 (2015)

[56]

Sanders, D. M. W. & Schacter, D. L. Adaptive Memory Distortions. Interdisciplinary Perspectives and Advances in Understanding Adaptive Memory. 31 (2024)

[57]

T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” in Eur. Conf. Comput. Vis., 2014, pp. 740–755

[58]

C. Wang, W. He, Y. Nie, J. Guo, C. Liu, Y. Wang, and K. Han, “GoldYOLO: Efficient object detector via gather-and-distribute mechanism,” Adv. Neural Inform. Process. Syst., pp. 51 094–51 112, 2023

[59]

Chen, Q., Su, X., Zhang, X., Wang, J., Chen, J., Shen, Y., Han, C., Chen, Z., Xu, W., Li, F., Zhang, S., Yao, K., Ding, E., Zhang, G., & Wang, J. (2024). LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection. arXiv. https://doi.org/10.48550/arXiv.2406.03459

[60]

Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017)

[61]

LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE. 86(11), 2278-2324 (1998)

[62]

Krizhevsky, A., Sutskever, I. & Hinton, G. E. Learning multiple layers of features from tiny images. Technical Report (University of Toronto, 2009)

[63]

He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770-778 (2016)

[64]

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D. & Wang, W. et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)

[65]

Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4700-4708 (2017)

[66]

Brock, A., De, S., Smith, S. L. & Simonyan, K. High-performance large-scale image recognition without normalization. Proc. Mach. Learn. Res. 139, 1059–1071 (2021)

[67]

Gilbert, P. E., Kesner, R. P. & Lee, I. Dissociating hippocampal subregions: A double dissociation between dentate gyrus and CA1. Hippocampus. 11(6), 626-636 (2001)

[68]

Bartsch, T., Schönfeld, R., Müller, F. J., Alfke, K. & Leplow, B. et al. Focal lesions of human hippocampal CA1 neurons in transient global amnesia impair place memory. Science. 328(5984), 1412-1415 (2010).

Supplementary Information EC EC2 performs empty-feature removal and normalization on the features from the sensory input component. Among the features ...