Vision-language models for vision tasks: A survey

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

Citing papers by year: 2026 (1), 2025 (2)

representative citing papers

Re:Verse -- Can Your VLM Read a Manga?

cs.CV · 2025-08-11 · unverdicted · novelty 6.0

Current VLMs excel at individual manga panel interpretation but systematically fail at temporal causality and cross-panel cohesion in long-form narratives.

Mobile GUI Agents under Real-world Threats: Are We There Yet?

cs.CR · 2025-07-06 · conditional · novelty 6.0

Introduces an app-content instrumentation framework and benchmark showing that examined GUI agents suffer 42.0% and 36.1% average misleading rates from third-party content in dynamic and static tests respectively.

citing papers explorer

Showing 3 of 3 citing papers.

  • Re:Verse -- Can Your VLM Read a Manga? cs.CV · 2025-08-11 · unverdicted · none · ref 38

    Current VLMs excel at individual manga panel interpretation but systematically fail at temporal causality and cross-panel cohesion in long-form narratives.

  • Mobile GUI Agents under Real-world Threats: Are We There Yet? cs.CR · 2025-07-06 · conditional · none · ref 35

    Introduces an app-content instrumentation framework and benchmark showing that examined GUI agents suffer 42.0% and 36.1% average misleading rates from third-party content in dynamic and static tests respectively.

  • Jointly Learning Predicates and Actions Enables Zero-Shot Skill Composition cs.RO · 2026-05-20 · unverdicted · none · ref 9

    PACTS jointly model action trajectories and predicate belief trajectories in a single generative policy, enabling zero-shot skill composition via symbolic planning without retraining.