Sana 1.5: Efficient scaling of training-time and inference-time compute in linear diffusion transformer

14 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2 · baseline 2

fields

cs.CV 13 · cs.LG 1

years

2026 8 · 2025 6

representative citing papers

Flow-GRPO: Training Flow Matching Models via Online RL

cs.CV · 2025-05-08 · unverdicted · novelty 8.0

Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.

The Silent Brush: Evaluating Artistic Style Leakage in AI Art Generation

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

Art Arena evaluates how artistic styles from training data leak into AI-generated images without explicit prompts, revealing asymmetric blending due to differences in representational strength and interaction dynamics across models like Stable Diffusion.

What Makes Synthetic Data Effective in Image Segmentation

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

Dense scene composition and instance fidelity in synthetic diffusion images drive better segmentation performance; the SENSE framework exploits this to improve models on Cityscapes, COCO, and ADE20K.

Nucleus-Image: Sparse MoE for Image Generation

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

A 17B-parameter sparse MoE diffusion transformer activates 2B parameters per pass and reaches competitive quality on image generation benchmarks without post-training.

YOLOv12: Attention-Centric Real-Time Object Detectors

cs.CV · 2025-02-18 · unverdicted · novelty 6.0

YOLOv12 is a new attention-based real-time object detector that reports higher accuracy than YOLOv10, YOLOv11, and RT-DETR variants at comparable or better speed and efficiency.

Qwen-Image Technical Report

cs.CV · 2025-08-04 · unverdicted · novelty 5.0

Qwen-Image is a foundation model that reaches state-of-the-art results in image generation and editing by combining a large-scale text-focused data pipeline with curriculum learning and dual semantic-reconstructive encoding for editing consistency.
