The JDDC 2.0 Corpus: A Large-Scale Multimodal Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service

Bowen Zhou; Haoran Li; Nan Zhao; Xiaodong He; Youzheng Wu

arxiv: 2109.12913 · v1 · pith:X7LD6RFFnew · submitted 2021-09-27 · 💻 cs.CL

Nan Zhao, Haoran Li, Youzheng Wu, Xiaodong He, Bowen Zhou

classification 💻 cs.CL

keywords dialogue, multimodal, customer, images, service, dataset, e-commerce, jddc

With the development of the Internet, more and more people have become accustomed to online shopping. When communicating with customer service, users may express their requirements through text, images, and videos, which creates the need for automatic customer service systems to understand this multimodal information. Images usually act as discriminators for product models or as indicators of product failures, and thus play important roles in the E-commerce scenario. On the other hand, the detailed information provided by images is limited, and customer service systems typically cannot understand users' intents without the accompanying text. Thus, bridging the gap between image and text is crucial for the multimodal dialogue task. To address this problem, we construct JDDC 2.0, a large-scale multimodal multi-turn dialogue dataset collected from a mainstream Chinese E-commerce platform (JD.com), containing about 246 thousand dialogue sessions, 3 million utterances, and 507 thousand images, along with product knowledge bases and image category annotations. We present the solutions of the top-5 teams participating in the JDDC multimodal dialogue challenge based on this dataset, which provide valuable insights for further research on the multimodal dialogue task.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces
cs.CL 2026-04 unverdicted novelty 7.0

OmniBehavior benchmark demonstrates that LLMs simulating real human behavior converge on hyper-active positive average personas, losing long-tail individual differences.
Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces
cs.CL 2026-04 unverdicted novelty 7.0

Introduces OmniBehavior benchmark from real-world data and shows LLMs exhibit hyper-activity, persona homogenization, and utopian bias in behavior simulation.
SKG-VLA: Scene Knowledge Graph Priors for Structured Scene Semantics and Multimodal Reasoning for Decision Making
cs.AI 2026-05 unverdicted novelty 5.0

SKG-VLA models each complaint as a structured scene via a Scene Knowledge Graph to improve policy-grounded multimodal reasoning and decision accuracy.