Pith Number

pith:2Y5C4DL6

pith:2026:2Y5C4DL6PAZY4EYIRJ3M436ME4
not attested · not anchored · not stored · refs resolved

MiVE: Multiscale Vision-language features for reference-guided video Editing

Chengjing Wu, Luoqi Liu, Meng Zou, Ting Liu, Tong Wang, Xiaochao Qu, Xiaolin Hu

MiVE pulls multiscale features from a single vision-language model to guide accurate reference-based video edits.

arxiv:2605.14664 v1 · 2026-05-14 · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{2Y5C4DL6PAZY4EYIRJ3M436ME4}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Experiments demonstrate that MiVE achieves state-of-the-art performance by ranking highest in human preference, outperforming both academic methods and commercial systems.

C2 · weakest assumption

VLM layers encode complementary information hierarchically: early layers capture localized spatial details essential for precise editing, while deeper layers encode global semantics for instruction comprehension.

C3 · one-line summary

MiVE repurposes VLMs as multiscale feature extractors integrated into a unified self-attention Diffusion Transformer, achieving top human preference in reference-guided video editing.

References

34 extracted · 34 resolved · 9 Pith anchors

[1] VACE: All-in-One Video Creation and Editing 2025 · doi:10.48550/arxiv.2503.07598
[2] VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control 2025 · doi:10.1145/3721238.3730673
[3] (title not extracted) 2025
[4] (title not extracted) CoRR 2025 · doi:10.48550/arxiv.2512.02933
[5] VideoCoF: Unified Video Editing with Temporal Reasoner 2025 · doi:10.48550/arxiv.2512.07469

Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-17T23:39:02.164730Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

d63a2e0d7e78338e13088a76ce6fcc27059ec6f3051ee4ab8722391e4dc5b822

Aliases

arxiv: 2605.14664 · arxiv_version: 2605.14664v1 · doi: 10.48550/arxiv.2605.14664 · pith_short_12: 2Y5C4DL6PAZY · pith_short_16: 2Y5C4DL6PAZY4EYI · pith_short_8: 2Y5C4DL6
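
On this record, the short aliases are plain prefixes of the full Pith Number, and the full number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. The page does not state this as the derivation rule, so the sketch below is an observation checked against this one record, not a specification. The function name `pith_number_from_hash` and the default length of 26 are assumptions made for illustration.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "d63a2e0d7e78338e13088a76ce6fcc27059ec6f3051ee4ab8722391e4dc5b822"
PITH_NUMBER = "2Y5C4DL6PAZY4EYIRJ3M436ME4"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex SHA-256 digest and keep the first `length` characters.

    Assumption: this relation is inferred from the values on this page,
    not taken from a published Pith specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_number_from_hash(CANONICAL_HASH)

# The short aliases listed above are prefixes of the full number.
short_aliases = {n: derived[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)  # True for this record
print(short_aliases)
```

If the relation holds in general, anyone holding the canonical hash could recompute the identifier and its short forms offline. That would need confirming against the schema documentation.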
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/2Y5C4DL6PAZY4EYIRJ3M436ME4 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d63a2e0d7e78338e13088a76ce6fcc27059ec6f3051ee4ab8722391e4dc5b822
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "5999aef1ee86eaf70f9cf0035c14dbdc6f1970b9529e05624f4cf96364068314",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-14T10:19:19Z",
    "title_canon_sha256": "327da6f86814c6f71f62bd764413f8dc9a6036e19e7f1918754756fa3cd60b41"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14664",
    "kind": "arxiv",
    "version": 1
  }
}
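
The verification one-liner above can also be written as a small offline script. The following is a minimal sketch that applies the same canonicalization (sorted keys, compact separators, non-ASCII preserved, UTF-8 bytes) to the record shown on this page. The helper name `canonical_sha256` is hypothetical. Whether the printed digest equals the published canonical hash depends on the served record being byte-for-byte what is shown here, so no expected value is asserted.

```python
import hashlib
import json

# Canonical record as displayed on this page.
canonical_record = {
    "metadata": {
        "abstract_canon_sha256": "5999aef1ee86eaf70f9cf0035c14dbdc6f1970b9529e05624f4cf96364068314",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-05-14T10:19:19Z",
        "title_canon_sha256": "327da6f86814c6f71f62bd764413f8dc9a6036e19e7f1918754756fa3cd60b41",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14664", "kind": "arxiv", "version": 1},
}


def canonical_sha256(record: dict) -> str:
    """Serialize with the same json.dumps options as the one-liner, then hash."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(canonical_record)
print(digest)  # compare by eye with the canonical hash on this page
```

Because the keys are sorted before hashing, two mirrors that store the record with different key order should still arrive at the same digest.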