Pith Number

pith:PADQJG66

pith:2026:PADQJG66RQ73J2J5GHB5OIIE7W
not attested · not anchored · not stored · refs resolved

StyleTextGen: Style-Conditioned Multilingual Scene Text Generation

Fangmin Zhao, Liu Yu, Yan Shu, Yichao Liu, Yu Zhou, Zeyu Chen

StyleTextGen generates scene text that matches reference visual styles across languages using a dedicated dual-branch encoder.

arxiv:2605.14708 v1 · 2026-05-14 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{PADQJG66RQ73J2J5GHB5OIIE7W}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
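
The page does not spell out the merge algorithm, so the following is only an illustrative sketch of what a deterministic merge can look like, not Pith's actual procedure. It assumes a hypothetical event shape (`event_id`, `timestamp`, `field`, `value`) and skips signature checks entirely. The point it illustrates: if events are deduplicated and applied in one fixed total order, every mirror ends up with the same state regardless of the order in which events arrived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # All four fields are hypothetical; the real bundle schema is not shown on this page.
    event_id: str   # unique identifier of the event
    timestamp: str  # ISO 8601 UTC timestamp, all in the same format
    field: str      # which part of the record state the event updates
    value: str      # new value for that field


def merge(record: dict, events: list) -> dict:
    """Apply events to a copy of the record in a fixed total order."""
    state = dict(record)
    # Drop duplicate deliveries of the same event.
    unique = {e.event_id: e for e in events}
    # Sort by (timestamp, event_id) so ties are broken the same way everywhere.
    for event in sorted(unique.values(), key=lambda e: (e.timestamp, e.event_id)):
        state[event.field] = event.value
    return state


base = {"citations": "open", "replications": "open"}
first = Event("a1", "2026-05-18T00:00:00Z", "citations", "1")
second = Event("b2", "2026-05-19T00:00:00Z", "citations", "2")

# Two mirrors that received the events in different orders agree on the result.
mirror_a = merge(base, [first, second])
mirror_b = merge(base, [second, first, first])
print(mirror_a == mirror_b)
```

A real implementation would also verify each event's signature before applying it; that step is omitted here because the signing format is not described in this record.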

Claims

C1 strongest claim

StyleTextGen significantly outperforms existing methods in style consistency and cross-lingual generalization, establishing new state-of-the-art performance in multilingual style-conditioned text generation.

C2 weakest assumption

The dual-branch style encoder and consistency loss successfully extract and maintain precise, fine-grained text styles from complex real-world backgrounds across languages without post-hoc tuning or dataset-specific adjustments.

C3 one-line summary

StyleTextGen proposes a dual-branch style encoder, text style consistency loss, and mask-guided inference to achieve superior style consistency and cross-lingual performance in multilingual scene text generation on a new bilingual benchmark.

References

61 extracted · 61 resolved · 3 Pith anchors

[1] Instructpix2pix: Learning to follow image editing instructions. 2023
[2] The devil is in fine-tuning and long-tailed problems: a new benchmark for scene text detection. arXiv preprint arXiv:2505.15649, 2025
[3] Posta: A go-to framework for customized artistic poster generation. 2025
[4] Textdiffuser-2: Unleashing the power of language models for text rendering. arXiv preprint arXiv:2311.16465, 2023
[5] arXiv preprint arXiv:2305.10855, 2023

Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-17T23:38:59.246059Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

7807049bde8c3fb4e93d31c3d72104fdbb4253a1b27fec316ee732cc27f285e8

Aliases

arxiv: 2605.14708 · arxiv_version: 2605.14708v1 · doi: 10.48550/arxiv.2605.14708 · pith_short_12: PADQJG66RQ73 · pith_short_16: PADQJG66RQ73J2J5 · pith_short_8: PADQJG66
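
The identifiers on this page appear to be related in a simple way, which the sketch below checks. The short aliases are prefixes of the full Pith Number, and the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. This is an observation drawn from the values shown here, not a documented rule; the function names are made up for illustration.

```python
import base64

PITH_NUMBER = "PADQJG66RQ73J2J5GHB5OIIE7W"
CANONICAL_HASH = "7807049bde8c3fb4e93d31c3d72104fdbb4253a1b27fec316ee732cc27f285e8"


def short_aliases(pith_number: str, lengths=(8, 12, 16)) -> dict:
    """Build the pith_short_N aliases as plain prefixes of the full number."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the leading characters.

    Assumption: the truncation length of 26 is inferred from this page only.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


aliases = short_aliases(PITH_NUMBER)
derived = pith_from_hash(CANONICAL_HASH)

print(aliases["pith_short_8"])   # PADQJG66
print(derived == PITH_NUMBER)    # True
```

If the relation holds in general, a client could recompute the canonical hash and confirm that it encodes to the Pith Number it was given, without trusting the displayed aliases.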
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/PADQJG66RQ73J2J5GHB5OIIE7W \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7807049bde8c3fb4e93d31c3d72104fdbb4253a1b27fec316ee732cc27f285e8
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "0667db19a66b31850f0dd9d8cdeb6d60eb029b8f0216dd0ab767a44e95764e7a",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-14T11:24:44Z",
    "title_canon_sha256": "7e9d295f264f56563720097f63d96edad649716d38bfc6814486f3553f853f77"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14708",
    "kind": "arxiv",
    "version": 1
  }
}
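
The shell check under "Verify this Pith Number yourself" can also be written as a small Python function, which makes the canonicalization step explicit: keys sorted, compact separators, non-ASCII characters kept as-is, UTF-8 bytes hashed with SHA-256. The sketch below applies it to the record printed above; `canonical_sha256` is an illustrative name, and whether the result equals the canonical hash depends on the served record matching this printed copy byte for byte.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Copied from the "Canonical record JSON" shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "0667db19a66b31850f0dd9d8cdeb6d60eb029b8f0216dd0ab767a44e95764e7a",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-14T11:24:44Z",
        "title_canon_sha256": "7e9d295f264f56563720097f63d96edad649716d38bfc6814486f3553f853f77",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14708", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on this page
```

Because keys are sorted before hashing, the digest does not depend on the order in which a server or client happens to emit the JSON fields.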