Pith Number

pith:RNKXLNDJ

pith:2026:RNKXLNDJ2QDKTSDATG4OXGQADK

Status: not attested · not anchored · not stored · refs resolved

IndicMedDialog: A Parallel Multi-Turn Medical Dialogue Dataset for Accessible Healthcare in Indic Languages

Piyush Patel, Shubham Kumar Nigam, Suparnojit Sarkar

IndicMedDialog supplies parallel multi-turn medical dialogues in English and nine Indic languages to support personalized symptom-elicitation models.

arxiv:2605.13292 v1 · 2026-05-13 · cs.CL · cs.AI · cs.IR · cs.LG


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{RNKXLNDJ2QDKTSDATG4OXGQADK}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
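The merge algorithm itself is not specified on this page. As a generic illustration only, the sketch below shows one way a merge can be made deterministic: deduplicate events by identifier, then apply them in a fixed total order so that every mirror reaches the same state regardless of arrival order. The event shape (`id`, `payload`) and the name `merge_state` are hypothetical, and signature checking is omitted.

```python
from typing import Iterable


def merge_state(record: dict, events: Iterable[dict]) -> dict:
    """Fold events into a state in an order that does not depend on input order.

    Hypothetical event shape: {"id": str, "payload": dict}.
    Assumes two events sharing an id are byte-for-byte identical.
    """
    # Deduplicate by id so that receiving an event twice changes nothing.
    unique = {event["id"]: event for event in events}
    state = {"record": record, "applied": []}
    # Sorting by id gives every mirror the same application order.
    for event_id in sorted(unique):
        state["applied"].append(unique[event_id]["payload"])
    return state


record = {"schema_version": "1.0"}
events = [
    {"id": "b", "payload": {"kind": "citation"}},
    {"id": "a", "payload": {"kind": "claim"}},
]
# Reordered and duplicated input yields the same merged state.
same = merge_state(record, events) == merge_state(record, events[::-1] + events)
```

Any real mirror would also need to verify each event's signature before folding it in; that step depends on details not given here.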

Claims

C1 · strongest claim

We introduce IndicMedDialog, a parallel multi-turn medical dialogue dataset spanning English and nine Indic languages... Building on this dataset, we fine-tune IndicMedLM via parameter-efficient adaptation... and validate clinical plausibility through medical expert evaluation.

C2 · weakest assumption

The assumption that LLM-generated synthetic consultations, after translation with TranslateGemma and native-speaker verification, produce clinically plausible multi-turn dialogues that faithfully represent real patient-provider interactions without introducing systematic biases or factual errors.

C3 · one-line summary

A parallel multi-turn medical dialogue dataset spanning English and nine Indic languages is created from synthetic consultations to enable personalized AI healthcare interactions.

References

44 extracted · 44 resolved · 8 Pith anchors


Receipt and verification

First computed	2026-05-18T02:44:49.111874Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

8b5575b469d406a9c86099b8eb9a001a98f95bafcb88901c0008de9639d7f4d2

Aliases

arxiv: 2605.13292 · arxiv_version: 2605.13292v1 · doi: 10.48550/arxiv.2605.13292 · pith_short_12: RNKXLNDJ2QDK · pith_short_16: RNKXLNDJ2QDKTSDA · pith_short_8: RNKXLNDJ
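The short aliases above are prefixes of the full 26-character identifier. For this record, the identifier also matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. That relationship is an observation about the values shown on this page, not a documented rule, and the function names below are hypothetical.

```python
import base64

CANONICAL_HASH = "8b5575b469d406a9c86099b8eb9a001a98f95bafcb88901c0008de9639d7f4d2"
PITH_NUMBER = "RNKXLNDJ2QDKTSDATG4OXGQADK"


def pith_id_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest and keep the first `length` characters.

    Assumption: inferred from this one record, not from a specification.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


def short_aliases(pith_id: str, sizes=(8, 12, 16)) -> dict:
    """Build the pith_short_N aliases as plain prefixes of the identifier."""
    return {f"pith_short_{n}": pith_id[:n] for n in sizes}


derived = pith_id_from_hash(CANONICAL_HASH)
aliases = short_aliases(derived)
```

Shorter aliases trade collision resistance for convenience, so the full identifier is the safer choice wherever ambiguity matters.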

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/RNKXLNDJ2QDKTSDATG4OXGQADK \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8b5575b469d406a9c86099b8eb9a001a98f95bafcb88901c0008de9639d7f4d2
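The hashing step of the pipeline above can also be written as a small standalone Python function, which may be easier to reuse or test than the one-liner. This is a sketch that mirrors the same `json.dumps` settings (sorted keys, compact separators, no ASCII escaping); the name `canonical_hash` is an assumption, and fetching the record is left to the caller.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of the compact, key-sorted JSON form."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Same content, different key order: the digests are identical.
a = {"schema_version": "1.0", "source": {"id": "2605.13292", "kind": "arxiv", "version": 1}}
b = {"source": {"version": 1, "kind": "arxiv", "id": "2605.13292"}, "schema_version": "1.0"}
digest_a = canonical_hash(a)
digest_b = canonical_hash(b)
```

Passing the full `canonical_record` object returned by the resolver should reproduce the hash listed above, provided the record is byte-identical after canonicalization.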

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "c6d9e7e83fdcada655b8c37716b8a9ee5232d129d28c8bbb00fe95ea4891bc80",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.IR",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T10:06:38Z",
    "title_canon_sha256": "9897394a88de9f227fdf8de1a9f4fe89e4f681206ce22204e0a58a01c6eadc41"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13292",
    "kind": "arxiv",
    "version": 1
  }
}
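The arXiv-related aliases listed under Aliases can be rebuilt from the `source` object of this record. The sketch below shows that mapping; the function name `arxiv_aliases` is hypothetical, and the lowercase DOI form is copied from the alias shown on this page.

```python
def arxiv_aliases(source: dict) -> dict:
    """Derive arXiv aliases from a canonical record's `source` object."""
    if source.get("kind") != "arxiv":
        raise ValueError("this sketch only handles arXiv sources")
    arxiv_id = source["id"]
    return {
        "arxiv": arxiv_id,
        "arxiv_version": f"{arxiv_id}v{source['version']}",
        # arXiv registers DOIs under the 10.48550 prefix.
        "doi": f"10.48550/arxiv.{arxiv_id}",
    }


source = {"id": "2605.13292", "kind": "arxiv", "version": 1}
result = arxiv_aliases(source)
```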