Pith Number

pith:345QV4PT

pith:2026:345QV4PTXZL7TF33V6CPMWQTLM
not attested · not anchored · not stored · refs resolved

MERVIN: A Unified Framework for Multimodal Event Retrieval in Vietnamese News Videos

Anh-Duy Le, Anh-Tai Pham-Nguyen, Trung-Hieu Truong-Le, Tung-Duong Le-Duc

A framework unifies visual frames, enhanced transcripts, and summaries for retrieving events in Vietnamese news videos.

arxiv:2605.16120 v1 · 2026-05-15 · cs.IR

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{345QV4PTXZL7TF33V6CPMWQTLM}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
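
The merge algorithm itself is not described on this page. As a rough illustration of how a mirror could reach the same state regardless of the order in which it received events, the sketch below deduplicates events by content hash and applies them in a fixed sort order. The event fields (`ts`, `kind`, `value`), the sample events, and the last-writer-wins rule are all assumptions made for illustration, not the Pith bundle format.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content hash of an event; identical events get identical ids."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def merge(*event_lists: list) -> dict:
    """Fold events into a state dict, independent of arrival order.

    Hypothetical event shape: {"ts": ISO-8601 str, "kind": str, "value": ...}.
    """
    # Deduplicate by content hash so overlapping replicas are harmless.
    unique = {event_id(e): e for events in event_lists for e in events}
    # Total order: timestamp first, content hash as the tie-breaker.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["ts"], kv[0]))
    state = {}
    for _, e in ordered:
        state[e["kind"]] = e["value"]  # last writer wins per kind
    return state


# Made-up events, only to show that input order does not matter.
a = [{"ts": "2026-05-20T00:01:53Z", "kind": "citations", "value": "open"}]
b = [{"ts": "2026-05-21T09:00:00Z", "kind": "citations", "value": "closed"}]
assert merge(a, b) == merge(b, a) == {"citations": "closed"}
```

Sorting on a key derived only from event content is what makes the result reproducible: two mirrors holding the same set of events always fold them in the same sequence.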

Claims

C1 strongest claim

MERVIN achieved 79 out of 88 points in the qualification phase of AI Challenge HCMC 2025 and successfully retrieved all results for every query in the final round.

C2 weakest assumption

That combining keyframes, Gemini-enhanced transcripts, and video summaries via separate visual and textual embeddings will produce meaningfully better semantic retrieval than simpler single-modality baselines for Vietnamese news content.

C3 one-line summary

MERVIN is a multimodal retrieval system for Vietnamese news videos that integrates visual and textual features with LLM-enhanced transcripts and reports strong results on a 2025 AI challenge.

References

16 extracted · 16 resolved · 2 Pith anchors

[1] In: ICCV (2021)
[2] In: Proceedings of NAACL-HLT (2019)
[3] In: Proceedings of the 14th International Symposium on Information and Communication Technology (SOICT 2025)
[4] https://github.com/mlfoundations/open_clip (2021)
[5] In: NeurIPS (2022)

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:01:53.715193Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

df3b0af1f3be57f9977baf84f65a135b24c9c488d058cfb96228afeed66ff06d

Aliases

arxiv: 2605.16120 · arxiv_version: 2605.16120v1 · doi: 10.48550/arxiv.2605.16120 · pith_short_12: 345QV4PTXZL7 · pith_short_16: 345QV4PTXZL7TF33 · pith_short_8: 345QV4PT
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/345QV4PTXZL7TF33V6CPMWQTLM \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: df3b0af1f3be57f9977baf84f65a135b24c9c488d058cfb96228afeed66ff06d
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "15b8f30bb571cdd6c6e6cc90edfa601e3ce04f54f935a5145a6de89e396b69cc",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.IR",
    "submitted_at": "2026-05-15T16:02:48Z",
    "title_canon_sha256": "cf4c6d90698f5cb344f26f51bcc7e188c0c6986665f14ab2c73a4596dbc0506c"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16120",
    "kind": "arxiv",
    "version": 1
  }
}
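
The shell pipeline under "Verify this Pith Number yourself" can also be written as a standalone script. The sketch below applies the same canonicalization (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes, SHA-256) to the record shown above. It assumes the JSON displayed here is exactly what the endpoint returns under `.canonical_record`; the names `RECORD`, `EXPECTED`, and `canonical_hash` are introduced only for this example.

```python
import hashlib
import json

# Canonical record copied from this page (assumed identical to the API payload).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "15b8f30bb571cdd6c6e6cc90edfa601e3ce04f54f935a5145a6de89e396b69cc",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.IR",
        "submitted_at": "2026-05-15T16:02:48Z",
        "title_canon_sha256": "cf4c6d90698f5cb344f26f51bcc7e188c0c6986665f14ab2c73a4596dbc0506c",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16120", "kind": "arxiv", "version": 1},
}

# Canonical hash listed on this page.
EXPECTED = "df3b0af1f3be57f9977baf84f65a135b24c9c488d058cfb96228afeed66ff06d"


def canonical_hash(record: dict) -> str:
    """SHA-256 over the compact, key-sorted JSON serialization of the record."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()  # UTF-8 by default
    return hashlib.sha256(blob).hexdigest()


digest = canonical_hash(RECORD)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the key order of the JSON that was fetched, only on its content.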