
arxiv: 2602.20845 · v1 · pith:RJZ4C27Gnew · submitted 2026-02-24 · 💻 cs.CV

FLIM Networks with Bag of Feature Points

Pith reviewed 2026-05-22 11:33 UTC · model grok-4.3

classification 💻 cs.CV
keywords FLIM · Bag of Feature Points · filter estimation · parasite detection · optical microscopy · convolutional networks · no backpropagation

The pith

FLIM-BoFP estimates network filters from a single clustering on input images, speeding up training while improving detection of parasites in microscopy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FLIM-BoFP, a new way to train convolutional networks without backpropagation for tasks such as finding parasites in microscope images. Instead of clustering patches at every block of the network, it clusters once at the input and maps the resulting points to create filters for all blocks. According to the paper, this makes filter estimation considerably faster and yields better performance and better generalization to new images than the previous FLIM-Cluster method and other baselines. A sympathetic reader would care because it reduces the need for large amounts of labeled data and heavy computation in medical image analysis.

Core claim

FLIM-BoFP performs a single clustering step at the input block to create a bag of feature points, from which filters are directly defined and mapped across all subsequent encoder blocks in the FLIM network, resulting in considerably faster filter estimation that improves efficiency, effectiveness, and generalization for parasite detection in optical microscopy images compared to FLIM-Cluster.

What carries the argument

The bag of feature points created by single clustering at the input, which allows mapping to define filters in all encoder blocks without repeated clustering.
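
The single-clustering idea can be illustrated with a minimal sketch. It is not the authors' implementation: the page does not say which clustering algorithm, patch size, or selection rule the paper uses, so plain k-means, 3x3 patches, and "pick the marker pixel closest to each cluster center" are all assumptions, and the function names `extract_patches`, `kmeans`, and `bag_of_feature_points` are hypothetical.

```python
import numpy as np

def extract_patches(image, points, size=3):
    """Flattened size x size patches centred at (row, col) marker pixels."""
    pad = size // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    return np.stack([padded[r:r + size, c:c + size, :].ravel() for r, c in points])

def _assign(data, centers):
    """Index of the nearest center for every row of data."""
    dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed here; the paper's algorithm is not stated)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = _assign(data, centers)
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, _assign(data, centers)

def bag_of_feature_points(image, marker_points, k, size=3):
    """Cluster input-level patches once; keep one marker pixel per cluster."""
    patches = extract_patches(image, marker_points, size)
    centers, labels = kmeans(patches, k)
    bag = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue  # an empty cluster contributes no feature point
        d = ((patches[idx] - centers[j]) ** 2).sum(axis=1)
        r, c = marker_points[idx[d.argmin()]]
        bag.append((int(r), int(c)))
    return bag

# Toy usage: a random image and six hypothetical marker pixels.
rng = np.random.default_rng(1)
image = rng.random((16, 16, 3))
markers = [(2, 3), (2, 4), (3, 3), (10, 11), (11, 11), (12, 10)]
print(bag_of_feature_points(image, markers, k=2))
```

The point of the sketch is only that clustering runs once, on input-level patches, and what is stored is a set of pixel locations rather than a set of filters for one block.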

If this is right

  • FLIM-BoFP reduces computational overhead by avoiding per-block clustering.
  • It provides more control over filter locations through direct mapping from input features.
  • The method achieves better generalization in detecting parasites in new microscopy images.
  • FLIM networks become lighter and fully trainable without backpropagation using this approach.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The single clustering approach might allow easier adaptation to other image analysis tasks beyond parasite detection.
  • If the mapping preserves discriminative power, it could reduce the number of required representative images even further.
  • Extensions could test the method on different types of medical imaging data to confirm broad applicability.

Load-bearing premise

Mapping feature points from a single input clustering produces high-quality filters for all later network blocks without needing adjustments or creating new problems.

What would settle it

A test showing that FLIM-BoFP performs worse than FLIM-Cluster on a new set of microscopy images with different parasites would disprove the claim of improved generalization and effectiveness.
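
Such a test could be scored with an F-measure, the metric named in the paper's Figure 4 caption. The sketch below is an assumption-laden illustration: the fixed threshold of 0.5 and the weighting beta^2 = 0.3 (a common convention in salient object detection) are not confirmed by this page, and `f_measure` and `mean_f_measure` are hypothetical helper names.

```python
import numpy as np

def f_measure(saliency, ground_truth, threshold=0.5, beta2=0.3):
    """Weighted F-measure of a thresholded saliency map against a binary mask."""
    pred = saliency >= threshold
    gt = ground_truth.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / pred.sum() if pred.sum() else 0.0
    recall = tp / gt.sum() if gt.sum() else 0.0
    denom = beta2 * precision + recall
    return float((1 + beta2) * precision * recall / denom) if denom else 0.0

def mean_f_measure(saliency_maps, masks, **kwargs):
    """Average F-measure over a held-out set of images."""
    return float(np.mean([f_measure(s, m, **kwargs) for s, m in zip(saliency_maps, masks)]))

# Hypothetical comparison of two models on the same unseen images:
# the generalization claim would be in doubt if score_bofp < score_cluster.
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
map_bofp = mask.astype(float)          # toy "perfect" prediction
map_cluster = np.zeros((4, 4))
map_cluster[:2, 1:3] = 1.0             # toy prediction shifted by one column
score_bofp = mean_f_measure([map_bofp], [mask])
score_cluster = mean_f_measure([map_cluster], [mask])
print(score_bofp, score_cluster)
```

A single comparison of this kind would not be conclusive on its own; repeating it over several splits, as the Figure 4 caption suggests the paper does, gives a more reliable picture.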

Figures

Figures reproduced from arXiv: 2602.20845 by Alexandre X. Falcão, Felipe Crispim da Rocha Salvagnini, Gilson Junior Soares, Jefersson A. dos Santos, João Deltregia Martinelli, Marcelo Luis Rodrigues Filho.

Figure 1: Training pipeline of a FLIM SOD Network from user-drawn markers on …
Figure 2: Progressive saliency maps by decoding consecutive blocks. Figures 2a …
Figure 3: Training of a FLIM encoder with BoFP begins with a single clustering …
Figure 4: Average F-measure curves across splits on test sets for each model. Figure …
Figure 5: Qualitative detection results on test images of the S. Mansoni, Entamoeba and Ancylostoma datasets. GT is the ground-truth segmentation mask.

7 Conclusion. We introduced FLIM-BoFP, a novel kernel estimation technique for FLIM methodology, designed as a more efficient alternative to FLIM-Cluster. FLIM-BoFP demonstrated superior efficiency and effectiveness, outperforming FLIM-Cluster and several heavier dee…
Original abstract

Convolutional networks require extensive image annotation, which can be costly and time-consuming. Feature Learning from Image Markers (FLIM) tackles this challenge by estimating encoder filters (i.e., kernel weights) from user-drawn markers on discriminative regions of a few representative images without traditional optimization. Such an encoder combined with an adaptive decoder comprises a FLIM network fully trained without backpropagation. Prior research has demonstrated their effectiveness in Salient Object Detection (SOD), being significantly lighter than existing lightweight models. This study revisits FLIM SOD and introduces FLIM-Bag of Feature Points (FLIM-BoFP), a considerably faster filter estimation method. The previous approach, FLIM-Cluster, derives filters through patch clustering at each encoder's block, leading to computational overhead and reduced control over filter locations. FLIM-BoFP streamlines this process by performing a single clustering at the input block, creating a bag of feature points, and defining filters directly from mapped feature points across all blocks. The paper evaluates the benefits in efficiency, effectiveness, and generalization of FLIM-BoFP compared to FLIM-Cluster and other state-of-the-art baselines for parasite detection in optical microscopy images.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes FLIM-BoFP as a faster filter estimation method for FLIM networks applied to parasite detection in optical microscopy images. It replaces per-block patch clustering from FLIM-Cluster with a single clustering at the input block to form a bag of feature points, which are then mapped to define filters across all encoder blocks. The work claims improvements in efficiency, effectiveness, and generalization over FLIM-Cluster and other state-of-the-art baselines, all without backpropagation.

Significance. If the empirical claims hold with proper validation, the approach could enable more computationally efficient training of lightweight convolutional networks on limited annotations for medical imaging tasks such as parasite detection, extending prior FLIM results from salient object detection.

major comments (2)
  1. Abstract: The abstract asserts efficiency and generalization benefits but supplies no quantitative results, error bars, ablation details, or dataset descriptions. Without these, it is impossible to assess whether the central claim is supported by the data or experiments.
  2. Method description (cross-block mapping): The claim that feature points clustered once at the input block produce filters of comparable or superior discriminative quality in every subsequent encoder block rests on an unexamined assumption. No analysis is provided of how the mapping is computed, whether it accounts for changing receptive fields after convolutions and pooling, or whether it introduces misalignment in later blocks where features are more abstract. This is load-bearing for the generalization and effectiveness claims.
minor comments (2)
  1. Add explicit pseudocode or equations for the feature-point mapping procedure between blocks.
  2. Clarify the clustering algorithm (e.g., k-means parameters) and how the number of feature points is chosen.
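
The receptive-field concern in major comment 2 can be made concrete with standard arithmetic: each layer with kernel size k and stride s grows the receptive field by (k - 1) times the current "jump" (the spacing, in input pixels, between adjacent units), and multiplies the jump by s. The sketch below applies that recurrence to a hypothetical three-block encoder; the layer configuration is an assumption for illustration, not the architecture used in the paper.

```python
def receptive_fields(layers):
    """Receptive field and jump after each layer.

    layers: list of (kernel_size, stride) pairs, in order.
    Returns a list of (receptive_field, jump) tuples measured in input pixels.
    """
    rf, jump = 1, 1
    out = []
    for kernel, stride in layers:
        rf = rf + (kernel - 1) * jump   # widen by the kernel extent at current spacing
        jump = jump * stride            # spacing between neighbouring units grows
        out.append((rf, jump))
    return out

# Hypothetical encoder: 3x3 conv, 2x2 pool (stride 2), repeated, then a final 3x3 conv.
layers = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]
for (kernel, stride), (rf, jump) in zip(layers, receptive_fields(layers)):
    print(f"kernel={kernel} stride={stride} -> receptive field {rf}, jump {jump}")
```

In this toy configuration a single unit in the last block already covers an 18x18 input window, which is why a point chosen from a 3x3 input patch may describe only a small part of what a deep filter sees.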

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We have carefully considered each point and provide detailed responses below, along with plans for revisions to address the concerns raised.

read point-by-point responses
  1. Referee: Abstract: The abstract asserts efficiency and generalization benefits but supplies no quantitative results, error bars, ablation details, or dataset descriptions. Without these, it is impossible to assess whether the central claim is supported by the data or experiments.

    Authors: We agree that the abstract would benefit from including quantitative results to support the claims. In the revised manuscript, we will update the abstract to incorporate specific metrics, such as the observed reduction in computational time for filter estimation, performance improvements in terms of accuracy or F1-score for parasite detection, and a description of the datasets used. We will also mention that results include error bars from multiple runs where applicable. This revision will make the abstract more informative and allow for better assessment of the central claims. revision: yes

  2. Referee: Method description (cross-block mapping): The claim that feature points clustered once at the input block produce filters of comparable or superior discriminative quality in every subsequent encoder block rests on an unexamined assumption. No analysis is provided of how the mapping is computed, whether it accounts for changing receptive fields after convolutions and pooling, or whether it introduces misalignment in later blocks where features are more abstract. This is load-bearing for the generalization and effectiveness claims.

    Authors: The cross-block mapping in FLIM-BoFP is designed to propagate the feature points from the input level to deeper layers by adjusting their positions according to the cumulative stride and pooling factors in the encoder blocks. This ensures that the filters are defined at locations that correspond to the original marked regions in the input image. However, we recognize that a more thorough analysis of potential misalignments and the impact of increasing receptive fields and feature abstraction is needed. We will expand the method section with a detailed explanation of the mapping procedure, including pseudocode or equations for the coordinate mapping, and add a new subsection discussing the assumptions and their validity. Furthermore, we plan to include additional experiments or visualizations showing the alignment of filters across blocks to strengthen the claims regarding generalization and effectiveness. revision: yes
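
    The coordinate adjustment described in this response can be sketched as follows. This is a guess at the simplest reading of "cumulative stride and pooling factors", not the paper's procedure: integer division by the cumulative downsampling factor, zero padding at borders, and unit-norm scaling of the extracted patch are all assumptions, and `map_point` and `patch_filter` are hypothetical names.

```python
import numpy as np

def map_point(point, block_strides):
    """Position of an input-level (row, col) on the input grid of each block.

    block_strides[b] is the downsampling factor applied by block b, so the
    input of block b has been reduced by the product of all earlier factors.
    """
    r, c = point
    mapped, cumulative = [], 1
    for stride in block_strides:
        mapped.append((r // cumulative, c // cumulative))
        cumulative *= stride
    return mapped

def patch_filter(feature_map, point, size=3):
    """Filter defined as the (unit-norm) patch around a mapped feature point."""
    pad = size // 2
    padded = np.pad(feature_map, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    r, c = point
    patch = padded[r:r + size, c:c + size, :]
    norm = np.linalg.norm(patch)
    return patch / norm if norm > 0 else patch

# Toy usage: one feature point, three blocks that each halve the resolution.
positions = map_point((13, 22), [2, 2, 2])
print(positions)

rng = np.random.default_rng(0)
block2_input = rng.random((8, 16, 4))      # hypothetical feature map entering block 2
kernel = patch_filter(block2_input, positions[1])
print(kernel.shape)
```

    Integer division discards sub-pixel offsets, so nearby input points can collapse onto the same location in deep blocks; that is one concrete form of the misalignment the referee asks about.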

Circularity Check

0 steps flagged

No circularity: procedural method change with empirical claims

full rationale

The paper describes FLIM-BoFP as a straightforward procedural simplification of FLIM-Cluster: single clustering at the input block followed by mapping of feature points to define filters in later blocks. No equations, fitted parameters, or self-referential derivations are presented that would reduce a claimed result to its own inputs by construction. Claims of improved efficiency, effectiveness, and generalization rest on comparative experiments against baselines rather than any mathematical loop or self-citation that serves as the sole justification. The mapping step is an explicit design choice whose validity is left to empirical validation, not assumed via prior self-citation or ansatz smuggling. This is a standard non-circular presentation of an algorithmic variant.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents identification of specific fitted parameters, background axioms, or new entities. The central claim implicitly rests on the unstated premise that input-level feature points remain sufficiently informative after propagation through multiple encoder blocks.



