BDIViz in Action: Interactive Curation and Benchmarking for Schema Matching Methods

Christos Koutras; Cláudio T. Silva; Eden Wu; Juliana Freire

arxiv: 2604.10763 · v1 · submitted 2026-04-12 · 💻 cs.IR · cs.HC

Pith reviewed 2026-05-10 15:04 UTC · model grok-4.3

keywords: schema matching, interactive visualization, benchmarking, data integration, LLM assistance, human-in-the-loop, ground truth curation, visual analytics

The pith

BDIViz lets users validate schema matches interactively with LLM help, turning those validations into ground truth for benchmarking new matching algorithms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

BDIViz is an interactive visualization system that takes source and target datasets, runs automatic matchers, and displays candidate matches in a navigable heatmap. Users can validate or reject matches on the spot while viewing attribute descriptions, example values, and distributions in linked panels, with an LLM assistant supplying structured explanations for difficult cases. The system records these validations as evolving ground truth and immediately computes performance metrics for any matcher added through its standardized interface. Demonstrations show the approach in data-harmonization tasks and in developer loops where custom matchers are refined after seeing live results across different schemas.

Core claim

BDIViz applies automatic matching methods to source and target datasets and visualizes the candidates in an interactive heatmap with hierarchical navigation, zoom, and filtering. Users validate matches directly while inspecting ambiguous cases through coordinated views and LLM-generated explanations; these validations become ground truth that supports real-time benchmarking and iterative improvement of matchers integrated via a standard interface.
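The candidate heatmap described above can be thought of as a source-by-target score matrix that is filtered before display. The following is a minimal sketch under stated assumptions, not BDIViz's implementation: the function names, the column names, and the use of `difflib.SequenceMatcher` as a stand-in matcher are all hypothetical.

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Stand-in matcher (assumption): case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def candidate_matrix(source_cols, target_cols):
    """Rows are source attributes, columns are target attributes, cells are scores."""
    return [[name_similarity(s, t) for t in target_cols] for s in source_cols]

def filter_candidates(matrix, source_cols, target_cols, threshold=0.5):
    """Keep cells at or above the threshold, sorted by descending score."""
    kept = [
        (s, t, matrix[i][j])
        for i, s in enumerate(source_cols)
        for j, t in enumerate(target_cols)
        if matrix[i][j] >= threshold
    ]
    return sorted(kept, key=lambda c: c[2], reverse=True)

# Hypothetical attribute names, chosen only for illustration.
source_cols = ["patient_age", "gender", "tumor_stage"]
target_cols = ["age_at_diagnosis", "sex", "ajcc_pathologic_stage"]

matrix = candidate_matrix(source_cols, target_cols)
candidates = filter_candidates(matrix, source_cols, target_cols, threshold=0.3)
for s, t, score in candidates:
    print(f"{s} -> {t}: {score:.2f}")
```

Each cell of `matrix` corresponds to one heatmap cell, and the threshold plays the role of the filtering control. A real matcher would typically also use values and descriptions rather than names alone.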

What carries the argument

Interactive heatmap visualization paired with coordinated detail views and LLM explanations that convert user validations into live ground truth for matcher evaluation.

If this is right

New matchers plug in through a single interface and receive immediate performance scores based on live user validations.
Ground-truth datasets grow incrementally with expert input rather than remaining fixed and limited.
Matcher behavior can be compared across multiple domains and schema types within the same session.
Developers observe concrete metrics and refine algorithms in a closed loop before final evaluation.
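One way to picture the single plug-in interface is as an abstract base class that every matcher implements, plus a registry that runs all registered matchers on the same inputs. The summary does not give the actual API, so everything below (the `Matcher` class, the `match` signature, the registry, and the two toy matchers) is a hypothetical sketch.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Match = Tuple[str, str]          # (source attribute, target attribute)
Table = Dict[str, List[str]]     # attribute name -> example values

class Matcher(ABC):
    """Hypothetical plug-in contract; not BDIViz's actual interface."""
    name: str

    @abstractmethod
    def match(self, source: Table, target: Table) -> Dict[Match, float]:
        """Return a score in [0, 1] for each candidate pair the matcher proposes."""

class ExactNameMatcher(Matcher):
    name = "exact_name"

    def match(self, source, target):
        # Propose a pair only when names are equal ignoring case.
        return {(s, t): 1.0 for s in source for t in target if s.lower() == t.lower()}

class ValueOverlapMatcher(Matcher):
    name = "value_overlap"

    def match(self, source, target):
        # Score pairs by Jaccard overlap of their distinct example values.
        scores = {}
        for s, s_vals in source.items():
            for t, t_vals in target.items():
                a, b = set(s_vals), set(t_vals)
                union = a | b
                if union and a & b:
                    scores[(s, t)] = len(a & b) / len(union)
        return scores

REGISTRY: Dict[str, Matcher] = {}

def register(matcher: Matcher) -> None:
    REGISTRY[matcher.name] = matcher

def run_all(source: Table, target: Table) -> Dict[str, Dict[Match, float]]:
    """Run every registered matcher on the same inputs for side-by-side comparison."""
    return {name: m.match(source, target) for name, m in REGISTRY.items()}

register(ExactNameMatcher())
register(ValueOverlapMatcher())

source = {"Gender": ["F", "M", "F"], "stage": ["I", "II"]}
target = {"gender": ["female", "male"], "sex": ["F", "M"], "tumor_stage": ["I", "II", "III"]}
results = run_all(source, target)
print(results)
```

Because both matchers return the same shape of output, their proposals can be scored against the same set of user validations without matcher-specific code.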

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could reduce dependence on static, small-scale benchmarks that often fail to reflect real integration tasks.
Wider adoption might encourage matching research to prioritize human-validated, domain-specific ground truth over purely synthetic test sets.
Similar interactive loops could be adapted to other data-integration steps such as entity resolution or schema mapping.
Multi-user versions might allow distributed expert communities to curate large, shared ground-truth collections.

Load-bearing premise

That interactive validations combined with LLM explanations will reliably create higher-quality ground truth and enable effective iterative improvement of matching methods.

What would settle it

A controlled comparison in which independent experts annotate the same schema pairs with and without BDIViz, then measure agreement rates and downstream integration accuracy of matchers trained on each resulting ground-truth set.
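The agreement rates in such a comparison could be reported as raw agreement together with Cohen's kappa, which corrects for agreement expected by chance. The choice of kappa is an assumption of this sketch (the page does not name a statistic), and the annotator labels are made-up illustration data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical accept/reject decisions on eight candidate matches.
expert_1 = ["accept", "accept", "reject", "reject", "accept", "reject", "accept", "reject"]
expert_2 = ["accept", "accept", "reject", "accept", "accept", "reject", "reject", "reject"]

kappa = cohens_kappa(expert_1, expert_2)
print(f"kappa = {kappa:.2f}")  # 6 of 8 agree, chance agreement 0.5, so kappa = 0.50
```

Computing the statistic once for annotations made with the tool and once without would give the two agreement figures the comparison calls for.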

Figures

Figures reproduced from arXiv: 2604.10763 by Christos Koutras, Cláudio T. Silva, Eden Wu, Juliana Freire.

**Figure 1.** BDIViz system overview.

**Figure 2.** Demonstration Scenario 1: BDIViz offers a set of interactive and LLM-assisted visual tools to users towards data …

**Figure 3.** Demonstration Scenario 2: BDIViz enables develop…

Original abstract

Schema matching remains fundamental to data integration, yet evaluating and comparing matching methods is hindered by limited benchmark diversity and lack of interactive validation frameworks. BDIViz, recently published at IEEE VIS 2025, is an interactive visualization system for schema matching with LLM-assisted validation. Given source and target datasets, BDIViz applies automatic matching methods and visualizes candidates in an interactive heatmap with hierarchical navigation, zoom, and filtering. Users validate matches directly in the heatmap and inspect ambiguous cases using coordinated views that show attribute descriptions, example values, and distributions. An LLM assistant generates structured explanations for selected candidates to support decision-making. This demonstration showcases a new extension to BDIViz that addresses a critical need in data integration research: human-in-the-loop benchmarking and iterative matcher development. New matchers can be integrated through a standardized interface, while user validations become evolving ground truth for real-time performance evaluation. This enables benchmarking new algorithms, constructing high-quality ground-truth datasets through expert validation, and comparing matcher behavior across diverse schemas and domains. We demonstrate two complementary scenarios: (i) data harmonization, where users map a large tabular dataset to a target schema with value-level inspection and LLM-generated explanations; and (ii) developer-in-the-loop benchmarking, where developers integrate custom matchers, observe performance metrics, and refine their algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This is a clean system demo extending BDIViz for interactive schema matching benchmarking, but it adds no new results or evidence that the features deliver better ground truth or matcher improvements.

The core of this paper is a demonstration of updates to the BDIViz tool, originally shown at IEEE VIS 2025. The extension lets users validate matches in an interactive heatmap, pull in LLM explanations for tricky cases, plug in new matchers through a standard interface, and watch live performance numbers based on those validations. It walks through two scenarios: one for mapping a big table to a target schema with value checks, and one for developers testing and tweaking their own algorithms on the fly.

That setup addresses a real pain point in schema matching work, where benchmarks are often static and hard to iterate on with human input. The UI details, like hierarchical navigation, coordinated attribute views, and filtering, read as straightforward and relevant for the task. The standardized matcher integration is a practical touch that could let others try their methods without heavy lifting.

The main limitation is that everything stays at the level of feature description. There are no user studies, no quality metrics on the resulting ground truth, and no before-after comparisons showing that the LLM assists or real-time feedback actually lead to better datasets or stronger matchers. The paper treats these outcomes as enabled by the design rather than demonstrated. That fits a demo paper, but it leaves the central promise untested.

This is worth a look for researchers in data integration or visualization who build or evaluate matching tools and want a hands-on way to curate benchmarks. It is not a source of new algorithms or findings, so I would not cite it for technical advances. A serious editor should send it to peer review as a systems demonstration, with the expectation that reviewers will focus on usability and integration details rather than empirical claims.

Referee Report

0 major / 3 minor

Summary. The manuscript describes BDIViz, an interactive visualization system for schema matching that applies automatic matchers, renders candidates in a hierarchical heatmap with zoom/filtering, supports direct user validation, and provides coordinated views plus LLM-generated explanations for ambiguous cases. It presents a standardized interface for integrating new matchers and using accumulated user validations as evolving ground truth for real-time performance feedback. Two demonstration scenarios are shown: data harmonization of a large tabular dataset with value-level inspection, and developer-in-the-loop benchmarking where custom matchers are integrated, observed, and refined.

Significance. If the described capabilities operate as presented, the work offers a practical contribution to the schema-matching and data-integration communities by supplying an interactive human-in-the-loop platform that can expand benchmark diversity and support iterative matcher development. The standardized integration interface and real-time feedback loop are concrete strengths that lower the barrier for researchers to test new algorithms against expert-validated ground truth.

minor comments (3)

[Abstract and §3, Demonstration Scenarios] The claim that user validations 'become evolving ground truth for real-time performance evaluation' would benefit from an explicit description of the performance metrics computed and how they are updated when new validations arrive.
[§2, System Description] The standardized matcher interface is introduced but lacks a concrete example of the API signature or data format required for integration; adding a short code snippet or table would improve reproducibility for developers.
[Figure captions and §4] Several figures show heatmaps and coordinated views but do not indicate the exact schema sizes or domains used in the live demonstrations, making it harder to assess the claimed diversity.
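To make the first comment concrete, one plausible reading of "evolving ground truth" is that each accept or reject decision updates a set of validated pairs, and precision, recall, and F1 are recomputed per matcher over the pairs validated so far. The manuscript summary does not state which metrics are used or how they are updated, so the class and scoring rule below are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Match = Tuple[str, str]  # (source attribute, target attribute)

@dataclass
class LiveBenchmark:
    """Hypothetical scorer: metrics are recomputed after every user validation."""
    predictions: Dict[str, Set[Match]]               # matcher name -> proposed pairs
    accepted: Set[Match] = field(default_factory=set)
    rejected: Set[Match] = field(default_factory=set)

    def validate(self, pair: Match, is_match: bool) -> Dict[str, Dict[str, float]]:
        # Record the decision; a later decision on the same pair overrides the earlier one.
        (self.accepted if is_match else self.rejected).add(pair)
        (self.rejected if is_match else self.accepted).discard(pair)
        return self.scores()

    def scores(self) -> Dict[str, Dict[str, float]]:
        out = {}
        for name, predicted in self.predictions.items():
            tp = len(predicted & self.accepted)      # proposed and accepted
            fp = len(predicted & self.rejected)      # proposed but rejected
            fn = len(self.accepted - predicted)      # accepted but not proposed
            p = tp / (tp + fp) if tp + fp else 0.0
            r = tp / (tp + fn) if tp + fn else 0.0
            f1 = 2 * p * r / (p + r) if p + r else 0.0
            out[name] = {"precision": p, "recall": r, "f1": f1}
        return out

# Hypothetical matcher outputs and validation sequence.
bench = LiveBenchmark(predictions={
    "m1": {("age", "age_at_dx"), ("sex", "gender"), ("stage", "grade")},
    "m2": {("age", "age_at_dx")},
})
bench.validate(("age", "age_at_dx"), True)
bench.validate(("stage", "grade"), False)
final = bench.validate(("sex", "gender"), True)
print(final)
```

Pairs that have not been validated are ignored in this sketch, so early scores rest on very few decisions and should be read with that in mind.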

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our manuscript on BDIViz. We appreciate the recognition of its significance in providing an interactive human-in-the-loop platform for schema matching and benchmarking. Given that no specific major comments were provided, we have no revisions to incorporate at this stage and look forward to any additional feedback from the editor.

Circularity Check

0 steps flagged

No significant circularity

The manuscript is a system demonstration paper describing BDIViz capabilities for interactive schema matching, LLM-assisted validation, and benchmarking. It presents design features, integration interfaces, and usage scenarios without any equations, derivations, predictions, fitted parameters, or first-principles claims. The single self-citation to the prior IEEE VIS 2025 BDIViz paper simply identifies the base system being extended; it does not serve as load-bearing justification for any result or forbid alternatives. No step reduces by construction to its inputs, and the central claims are descriptive of tool functionality rather than empirically derived quantities.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a software demonstration paper with no mathematical model, empirical claims, or theoretical constructs; therefore no free parameters, axioms, or invented entities are introduced.

Reference graph

Works this paper leans on

[1] David Aumueller, Hong-Hai Do, Sabine Massmann, and Erhard Rahm. 2005. Schema and Ontology Matching with COMA++. In SIGMOD ’05. 906–908. doi:10.1145/1066157.1066283

[2] Juliana Freire, Grace Fan, Benjamin Feuer, Christos Koutras, Yurong Liu, et al. 2025. Large Language Models for Data Discovery and Integration: Challenges and Opportunities. IEEE Data Eng. Bull. 49, 1 (2025), 3–31.

[3] Christos Koutras, George Siachamis, Andra Ionescu, Kyriakos Psarakis, Jerry Brons, et al. 2021. Valentine: Evaluating Matching Techniques for Dataset Discovery. In ICDE ’21. 468–479.

[4] Yurong Liu, Eduardo Peña, Aécio S. R. Santos, Eden Wu, and Juliana Freire. 2025. Magneto: Combining Small and Large Language Models for Schema Matching. Proc. VLDB Endow. 18, 8 (2025), 2681–2694. doi:10.14778/3742728.3742757

[5] Roque Lopez, Aécio Santos, Christos Koutras, and Juliana Freire. 2026. BDI-Kit: An AI-Powered Toolkit for Biomedical Data Harmonization. Patterns 7 (2026).

[6] Aécio Santos, Eden Wu, Roque Lopez, Sarah Keegan, Eduardo Pena, et al. 2025. GDC-SM: The GDC Schema Matching Benchmark. doi:10.5281/zenodo.14963588

[7] Shiqiang Tao, Ningzhou Zeng, Isaac Hands, Joseph Hurt-Mueller, Eric B. Durbin, et al. 2020. Web-based interactive mapping from data dictionaries to ontologies, with an application to cancer registry. BMC Med. Inform. Decis. Mak. 20, 10 (2020). doi:10.1186/s12911-020-01288-7

[8] Eden Wu, Dishita G. Turakhia, Guande Wu, Christos Koutras, Sarah Keegan, et al. 2025. BDIViz: An Interactive Visualization System for Biomedical Schema Matching with LLM-Powered Validation. IEEE Trans. Vis. Comput. Graph. (2025), 1–11. doi:10.1109/TVCG.2025.3634843