An image is worth 16x16 words: Transformers for image recognition at scale
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
citing papers explorer
- netFound: Principled Design for Network Foundation Models
  netFound is a pretrained network foundation model using protocol-aware tokenization, context embedding, hierarchical attention, and privacy design that reaches F1 0.95 on exogenous context discrimination versus under 0.62 for prior models.
- Caption-Matching: A Multimodal Approach for Cross-Domain Image Retrieval
  Caption-Matching generates image captions via pre-trained VLMs and matches them across domains to achieve SOTA CDIR performance on Office-Home and DomainNet without labeled data or fine-tuning.
- Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction
  An efficient transformer architecture for BEV instance prediction reduces parameter counts and inference times versus SOTA by relying on a simplified paradigm of only instance segmentation and flow prediction.