ViperGPT generates executable Python code to compose pre-trained vision-and-language modules into programs that answer visual queries, reaching state-of-the-art results with no additional training.
9 Pith papers cite this work.
representative citing papers
Adding register tokens to Vision Transformers eliminates high-norm background artifacts and raises state-of-the-art performance on dense visual prediction tasks.
VP2O maps PPO to SVGD in a MoE architecture using functional kernels and expert orthogonalization, claiming a gain of 179 Elo points on Codeforces and a 32% token reduction on AIME for a 33B/4B model.
OPTIMUS generates minimal and sufficient concept-based visual explanations for deep classifiers using prime implicant theory to enforce logical sufficiency and minimality.
An auxiliary inpainting task improves the clustering of embeddings used to identify individual zebrafish from their skin patterns.
Shapley value and variational importance switch methods produce consistent rankings of filter importance in CNNs, enabling compression and interpretability.
Transfer learning with a Zoobot CNN on SDSS DR18 data identifies 3,679 lopsided spiral galaxies at 87% test accuracy; the lopsided systems show higher star formation, bluer colors, and lower mass and concentration.
A decoupled watershed-plus-EfficientNet pipeline recovers 75.95% of cells without annotations and reaches 98.36% stage classification accuracy with instance-level explainability on the NIH BBBC041 dataset.