dMLLM-TTS delivers up to 6x more efficient test-time scaling for diffusion MLLMs via O(N+T) hierarchical search and self-verified feedback, improving generation quality on GenEval across three models.
Flux.https://github.com/ black-forest-labs/flux, 2024
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CV 2representative citing papers
A scalable pipeline generates an intra-consistent, inter-diverse 1.4M style image dataset from text-to-image models and uses it to train a style encoder and generalizable style transfer model.
citing papers explorer
-
dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models
dMLLM-TTS delivers up to 6x more efficient test-time scaling for diffusion MLLMs via O(N+T) hierarchical search and self-verified feedback, improving generation quality on GenEval across three models.
-
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping
A scalable pipeline generates an intra-consistent, inter-diverse 1.4M style image dataset from text-to-image models and uses it to train a style encoder and generalizable style transfer model.