PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.
5 Pith papers cite this work.
citing papers explorer
- PluRule: A Benchmark for Moderating Pluralistic Communities on Social Media
  PluRule is a new multimodal multilingual benchmark showing that state-of-the-art vision-language models perform only marginally better than a trivial baseline at detecting specific rule violations in pluralistic online communities.
- Positive-Unlabelled Active Learning to Curate a Dataset for Orca Resident Interpretation
  Curates over 900 hours of SRKW acoustic data plus other marine mammal recordings via positive-unlabeled active learning, releasing transformer classifiers that report AUROC 0.58-0.77 and species top-1 accuracy of 53.2% on held-out benchmarks.
- Automated Detection of Mutual Gaze and Joint Attention in Dual-Camera Settings via Dual-Stream Transformers
  A dual-stream Transformer using frozen GazeLLE backbones and custom token fusion detects mutual gaze and joint attention from dual-camera recordings, outperforming CNN baselines and a multimodal LLM on caregiver-infant data.
- IncepDeHazeGAN: Novel Satellite Image Dehazing
  IncepDeHazeGAN is a GAN with Inception blocks and multi-layer feature fusion that claims state-of-the-art single-image dehazing performance on satellite datasets.
- Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions
  A literature review that categorizes deep learning approaches for visual hand gesture recognition, summarizes state-of-the-art methods across tasks, reviews datasets and metrics, and identifies challenges and future directions.