FLIM Networks with Bag of Feature Points
Pith reviewed 2026-05-22 11:33 UTC · model grok-4.3
The pith
FLIM-BoFP estimates network filters from a single clustering on input images, speeding up training while improving detection of parasites in microscopy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FLIM-BoFP performs a single clustering step at the input block to create a bag of feature points. Filters for every subsequent encoder block of the FLIM network are then defined directly from those points after mapping them across blocks. The paper claims this makes filter estimation considerably faster than FLIM-Cluster and improves efficiency, effectiveness, and generalization for parasite detection in optical microscopy images.
What carries the argument
The bag of feature points, created by a single clustering at the input block, is mapped across blocks to define the filters of every encoder block without repeated clustering.
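To make the single-clustering idea concrete, here is a minimal sketch of how a bag of feature points might be built. It is an illustration under stated assumptions, not the paper's procedure: the paper summary does not specify the clustering algorithm, so plain k-means is assumed, and the names `extract_patch`, `kmeans`, and `bag_of_feature_points` are hypothetical. The sketch clusters patches centred on marker pixels once and keeps, for each cluster centre, the marker pixel whose patch lies closest to it.

```python
import random

def extract_patch(image, row, col, half=1):
    """Flattened (2*half+1)^2 patch centred at (row, col), zero-padded at borders."""
    h, w = len(image), len(image[0])
    patch = []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            r, c = row + dr, col + dc
            patch.append(image[r][c] if 0 <= r < h and 0 <= c < w else 0.0)
    return patch

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumption; the source does not name the algorithm)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centre if a cluster becomes empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def bag_of_feature_points(image, marker_pixels, k, half=1):
    """Cluster marker patches once; keep the marker pixel nearest each centre."""
    patches = [extract_patch(image, r, c, half) for r, c in marker_pixels]
    centers = kmeans(patches, k)
    bag = []
    for center in centers:
        best = min(range(len(patches)), key=lambda i: sq_dist(patches[i], center))
        bag.append(marker_pixels[best])
    return bag

# Toy input: dark background with a bright 3x3 square in the lower right corner.
image = [[1.0 if r >= 3 and c >= 3 else 0.0 for c in range(6)] for r in range(6)]
markers = [(1, 1), (1, 2), (4, 4), (4, 3)]
bag = bag_of_feature_points(image, markers, k=2)
```

The returned bag holds pixel coordinates rather than cluster centres, which is what would let later blocks reuse the same spatial locations.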
If this is right
- FLIM-BoFP reduces computational overhead by avoiding per-block clustering.
- It provides more control over filter locations through direct mapping from input features.
- The method achieves better generalization in detecting parasites in new microscopy images.
- FLIM networks stay lightweight and fully trainable without backpropagation, while filter estimation becomes cheaper.
Where Pith is reading between the lines
- The single clustering approach might allow easier adaptation to other image analysis tasks beyond parasite detection.
- If the mapping preserves discriminative power, it could reduce the number of required representative images even further.
- Extensions could test the method on different types of medical imaging data to confirm broad applicability.
Load-bearing premise
Mapping feature points from a single input clustering produces high-quality filters for all later network blocks without needing adjustments or creating new problems.
What would settle it
A test showing that FLIM-BoFP performs worse than FLIM-Cluster on a new set of microscopy images with different parasites would disprove the claim of improved generalization and effectiveness.
Original abstract
Convolutional networks require extensive image annotation, which can be costly and time-consuming. Feature Learning from Image Markers (FLIM) tackles this challenge by estimating encoder filters (i.e., kernel weights) from user-drawn markers on discriminative regions of a few representative images without traditional optimization. Such an encoder combined with an adaptive decoder comprises a FLIM network fully trained without backpropagation. Prior research has demonstrated their effectiveness in Salient Object Detection (SOD), being significantly lighter than existing lightweight models. This study revisits FLIM SOD and introduces FLIM-Bag of Feature Points (FLIM-BoFP), a considerably faster filter estimation method. The previous approach, FLIM-Cluster, derives filters through patch clustering at each encoder's block, leading to computational overhead and reduced control over filter locations. FLIM-BoFP streamlines this process by performing a single clustering at the input block, creating a bag of feature points, and defining filters directly from mapped feature points across all blocks. The paper evaluates the benefits in efficiency, effectiveness, and generalization of FLIM-BoFP compared to FLIM-Cluster and other state-of-the-art baselines for parasite detection in optical microscopy images.
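The abstract says filters are defined directly from mapped feature points but does not say how a point becomes a kernel. The sketch below shows one plausible reading, assuming the patch around a feature point is centred and scaled to unit norm and then used as the kernel weights. The normalization and the helper names `patch_to_filter` and `response` are assumptions for illustration only.

```python
import math

def patch_to_filter(patch):
    """Turn a flattened patch into a zero-mean, unit-norm kernel.

    The normalization is an assumed choice, not taken from the paper.
    """
    mean = sum(patch) / len(patch)
    centered = [v - mean for v in patch]
    norm = math.sqrt(sum(v * v for v in centered))
    if norm == 0.0:
        # A constant patch carries no pattern, so return an all-zero kernel.
        return [0.0] * len(patch)
    return [v / norm for v in centered]

def response(kernel, patch):
    """Correlation of a kernel with one patch (a single convolution output)."""
    return sum(k * p for k, p in zip(kernel, patch))

# A 3x3 patch with a bright right-hand column, flattened row by row.
edge_patch = [0.0, 0.0, 1.0,
              0.0, 0.0, 1.0,
              0.0, 0.0, 1.0]
flat_patch = [1.0] * 9

kernel = patch_to_filter(edge_patch)
edge_score = response(kernel, edge_patch)
flat_score = response(kernel, flat_patch)
```

Under this reading no gradient step is involved: the kernel responds strongly to patches resembling the one it was built from and gives a zero response to constant regions.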
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes FLIM-BoFP as a faster filter estimation method for FLIM networks applied to parasite detection in optical microscopy images. It replaces per-block patch clustering from FLIM-Cluster with a single clustering at the input block to form a bag of feature points, which are then mapped to define filters across all encoder blocks. The work claims improvements in efficiency, effectiveness, and generalization over FLIM-Cluster and other state-of-the-art baselines, all without backpropagation.
Significance. If the empirical claims hold with proper validation, the approach could enable more computationally efficient training of lightweight convolutional networks on limited annotations for medical imaging tasks such as parasite detection, extending prior FLIM results from salient object detection.
Major comments (2)
- Abstract: The abstract asserts efficiency and generalization benefits but supplies no quantitative results, error bars, ablation details, or dataset descriptions. Without these, it is impossible to assess whether the central claim is supported by the data or experiments.
- Method description (cross-block mapping): The claim that feature points clustered once at the input block produce filters of comparable or superior discriminative quality in every subsequent encoder block rests on an unexamined assumption. No analysis is provided of how the mapping is computed, whether it accounts for changing receptive fields after convolutions and pooling, or whether it introduces misalignment in later blocks where features are more abstract. This is load-bearing for the generalization and effectiveness claims.
Minor comments (2)
- Add explicit pseudocode or equations for the feature-point mapping procedure between blocks.
- Clarify the clustering algorithm (e.g., k-means parameters) and how the number of feature points is chosen.
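The referee's concern about changing receptive fields can be quantified with the standard recurrence for stacked convolution and pooling layers: each layer grows the receptive field by (kernel size - 1) times the current cumulative stride, then multiplies the cumulative stride by its own stride. The sketch below applies it to a hypothetical two-block encoder; the layer sizes are illustrative and not taken from the paper, and dilation is ignored.

```python
def receptive_fields(layers):
    """Return (cumulative_stride, receptive_field) after each layer.

    layers: list of (kernel_size, stride) pairs, in forward order.
    """
    jump = 1  # spacing, in input pixels, between neighbouring outputs
    rf = 1    # receptive field size in input pixels
    result = []
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
        result.append((jump, rf))
    return result

# Hypothetical encoder: (3x3 conv, stride 1) then (2x2 pool, stride 2), twice.
layers = [(3, 1), (2, 2), (3, 1), (2, 2)]
fields = receptive_fields(layers)
# fields == [(1, 3), (2, 4), (2, 8), (4, 10)]
```

In this toy configuration one location after the second block already summarizes a 10x10 input region, which is why a point chosen from a small input patch may not describe the same pattern in deeper blocks.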
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comments on our manuscript. We have carefully considered each point and provide detailed responses below, along with plans for revisions to address the concerns raised.
- Referee: Abstract: The abstract asserts efficiency and generalization benefits but supplies no quantitative results, error bars, ablation details, or dataset descriptions. Without these, it is impossible to assess whether the central claim is supported by the data or experiments.
  Authors: We agree that the abstract would benefit from including quantitative results to support the claims. In the revised manuscript, we will update the abstract to incorporate specific metrics, such as the observed reduction in computational time for filter estimation, performance improvements in terms of accuracy or F1-score for parasite detection, and a description of the datasets used. We will also mention that results include error bars from multiple runs where applicable. This revision will make the abstract more informative and allow for better assessment of the central claims.
  Revision: yes
- Referee: Method description (cross-block mapping): The claim that feature points clustered once at the input block produce filters of comparable or superior discriminative quality in every subsequent encoder block rests on an unexamined assumption. No analysis is provided of how the mapping is computed, whether it accounts for changing receptive fields after convolutions and pooling, or whether it introduces misalignment in later blocks where features are more abstract. This is load-bearing for the generalization and effectiveness claims.
  Authors: The cross-block mapping in FLIM-BoFP is designed to propagate the feature points from the input level to deeper layers by adjusting their positions according to the cumulative stride and pooling factors in the encoder blocks. This ensures that the filters are defined at locations that correspond to the original marked regions in the input image. However, we recognize that a more thorough analysis of potential misalignments and the impact of increasing receptive fields and feature abstraction is needed. We will expand the method section with a detailed explanation of the mapping procedure, including pseudocode or equations for the coordinate mapping, and add a new subsection discussing the assumptions and their validity. Furthermore, we plan to include additional experiments or visualizations showing the alignment of filters across blocks to strengthen the claims regarding generalization and effectiveness.
  Revision: yes
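The coordinate mapping described in the rebuttal, adjusting positions by the cumulative stride and pooling factors, could look roughly like the sketch below. It is not the authors' pseudocode: integer division by the cumulative stride, a ceiling rule for the feature-map size, and the name `map_bag_across_blocks` are all assumptions made for illustration.

```python
def map_bag_across_blocks(bag, pool_strides, input_shape):
    """Map input-level feature points onto the input grid of every block.

    bag:          list of (row, col) points at input resolution
    pool_strides: downsampling stride applied at the end of each block
    input_shape:  (height, width) of the input image
    Returns one list of mapped points per block.
    """
    height, width = input_shape
    cumulative = 1  # total downsampling seen by the current block's input
    per_block = []
    for stride in pool_strides:
        # Assumed feature-map size: ceil(size / cumulative stride).
        grid_h = -(-height // cumulative)
        grid_w = -(-width // cumulative)
        mapped = [
            (min(r // cumulative, grid_h - 1), min(c // cumulative, grid_w - 1))
            for r, c in bag
        ]
        per_block.append(mapped)
        cumulative *= stride
    return per_block

bag = [(4, 4), (1, 2), (63, 63)]
mapped = map_bag_across_blocks(bag, pool_strides=[2, 2, 2], input_shape=(64, 64))
# Block 1 sees the original coordinates; deeper blocks see coarser ones.
```

Note that distinct input points can land on the same cell in deeper blocks (for example, (1, 2) and (0, 3) both map to (0, 0) at a cumulative stride of 4), which is one form the misalignment raised by the referee could take.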
Circularity Check
No circularity: procedural method change with empirical claims
The paper describes FLIM-BoFP as a straightforward procedural simplification of FLIM-Cluster: single clustering at the input block followed by mapping of feature points to define filters in later blocks. No equations, fitted parameters, or self-referential derivations are presented that would reduce a claimed result to its own inputs by construction. Claims of improved efficiency, effectiveness, and generalization rest on comparative experiments against baselines rather than any mathematical loop or self-citation that serves as the sole justification. The mapping step is an explicit design choice whose validity is left to empirical validation, not assumed via prior self-citation or ansatz smuggling. This is a standard non-circular presentation of an algorithmic variant.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "FLIM-BoFP streamlines this process by performing a single clustering at the input block, creating a bag of feature points, and defining filters directly from mapped feature points across all blocks."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "The key difference from FLIM-Cluster lies in the control for filter estimation. As set B is defined only once, all layers derive convolutional filters from the same spatial locations mapped across blocks."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.