
arxiv: 1907.01183 · v1 · pith:ED3G5PXPnew · submitted 2019-07-02 · 💻 cs.IR · cs.DB

A Framework for Evaluating Snippet Generation for Dataset Search

Pith reviewed 2026-05-25 11:11 UTC · model grok-4.3

classification 💻 cs.IR cs.DB
keywords: dataset search, snippet generation, evaluation framework, query intent, content coverage, information retrieval, user study

The pith

A framework with two metrics evaluates how well dataset search snippets match query intent and cover dataset content.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a quantitative evaluation framework for snippets shown in dataset search results. It defines metrics that check whether a snippet captures the user's query goals and represents the dataset's core information. Researchers and developers reuse datasets often, so better snippets could reduce time spent reviewing irrelevant results. The work adapts existing methods from other search domains as starting points, tests them on real datasets and queries, and includes a user study to check the metrics.

Core claim

The paper claims that snippet quality for dataset search can be measured by the degree of query intent match and the extent of main content coverage, and that an evaluation framework built on these two metrics supplies a reproducible basis for comparing generation methods, as shown by baseline adaptations and empirical results on real-world data.

What carries the argument

The evaluation framework consisting of a query-intent-matching metric and a content-coverage metric.
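To make the two-metric idea concrete, here is a minimal sketch of what a query-intent score and a content-coverage score could look like. The paper's actual definitions are not reproduced on this page, so the token-overlap formulation and the names `tokenize`, `keyword_coverage`, and `content_coverage` are assumptions chosen only for illustration.

```python
from typing import Iterable, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def keyword_coverage(query: str, snippet: str) -> float:
    """Fraction of distinct query keywords that appear in the snippet.

    Assumption: query intent is approximated by plain keyword overlap.
    """
    keywords = tokenize(query)
    if not keywords:
        return 0.0
    return len(keywords & tokenize(snippet)) / len(keywords)


def content_coverage(dataset_terms: Iterable[str], snippet: str) -> float:
    """Fraction of the dataset's key terms that appear in the snippet.

    Assumption: `dataset_terms` is a hypothetical list of single-token
    terms (for example column or property names) standing in for the
    dataset's main content.
    """
    terms = {t.lower() for t in dataset_terms}
    if not terms:
        return 0.0
    return len(terms & tokenize(snippet)) / len(terms)
```

Both scores fall in the range 0 to 1, so a snippet can be high on one axis and low on the other, which is the separation the framework relies on.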

If this is right

  • Different snippet generation approaches can be compared quantitatively using the two metrics.
  • Methods adapted from document summarization and other retrieval tasks provide usable initial baselines for dataset snippets.
  • Empirical results on real datasets and queries demonstrate measurable differences among the adapted baselines.
  • User study outcomes can confirm or adjust the metrics before wider adoption.
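The first consequence above, quantitative comparison of generation approaches, comes down to aggregating per-snippet scores by method and metric. The sketch below assumes a hypothetical record layout of `(method, metric, score)` tuples; it is not the paper's evaluation code.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_scores(
    records: Iterable[Tuple[str, str, float]]
) -> Dict[str, Dict[str, float]]:
    """Average scores per (method, metric) pair.

    Each record is (method name, metric name, score for one query-dataset pair).
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for method, metric, score in records:
        grouped[method][metric].append(score)
    return {
        method: {metric: mean(scores) for metric, scores in metrics.items()}
        for method, metrics in grouped.items()
    }
```

Calling `mean_scores` on the scores for several methods yields one small table per method, which can then be compared metric by metric.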

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the metrics prove stable across domains, search engines could incorporate them directly into ranking or snippet selection.
  • The same two-aspect structure might transfer to evaluating snippets for other structured data such as knowledge graphs.
  • Further work could test whether adding dataset-specific features like schema elements improves the coverage metric.

Load-bearing premise

Metrics defined to measure query intent match and content coverage will reliably indicate snippet quality for dataset search.

What would settle it

A controlled user study in which snippets that score highest on the proposed metrics receive lower preference ratings than lower-scoring snippets.
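One way to run such a check is a rank correlation between metric scores and user ratings: a clearly negative value would be evidence against the metrics. The sketch below uses Spearman's coefficient as an assumption; this page does not state which correlation measure the paper reports.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

Here `xs` would hold one metric's scores and `ys` the matching user ratings for the same snippets.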

Figures

Figures reproduced from arXiv: 1907.01183 by Evgeny Kharlamov, Gong Cheng, Jeff Z. Pan, Jinchi Chen, Shuxin Li, Xiaxia Wang, Yuzhong Qu.

Figure 1: (a) An example dataset and (b)(c)(d) three of its snippets generated by different ...
Figure 2: Runtime on each query-dataset pair, in ascending order.
Plot labels recovered from the figures: panels (a) data.gov.uk, (b) DMOZ-1, (c) DMOZ-2, (d) DMOZ-3, (e) DMOZ-4; methods IlluSnip, TA+C, PrunedDP++, CES; metrics coKyw, coCnx, coSkm, coDat; axis range 0 to 1.
Figure 3: Average scores of evaluation metrics on each group of query-dataset pairs.
Figure 4: Correlation between evaluation metrics and user ratings.
Original abstract

Reusing existing datasets is of considerable significance to researchers and developers. Dataset search engines help a user find relevant datasets for reuse. They can present a snippet for each retrieved dataset to explain its relevance to the user's data needs. This emerging problem of snippet generation for dataset search has not received much research attention. To provide a basis for future research, we introduce a framework for quantitatively evaluating the quality of a dataset snippet. The proposed metrics assess the extent to which a snippet matches the query intent and covers the main content of the dataset. To establish a baseline, we adapt four state-of-the-art methods from related fields to our problem, and perform an empirical evaluation based on real-world datasets and queries. We also conduct a user study to verify our findings. The results demonstrate the effectiveness of our evaluation framework, and suggest directions for future research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces a framework for quantitatively evaluating snippet quality in dataset search. The framework defines metrics to assess how well a snippet matches query intent and covers the dataset's main content. It adapts four state-of-the-art methods from related fields as baselines, conducts an empirical evaluation on real-world datasets and queries, and validates results via a user study. The abstract reports that the results demonstrate the framework's effectiveness and suggest future research directions.

Significance. If the metrics prove well-defined and the evaluation rigorous, this could establish a useful foundation for an emerging IR task with little prior work. The combination of quantitative baselines and user validation is a strength, as is the focus on dataset reuse. However, the absence of metric definitions, equations, baseline adaptation details, or quantitative results in the provided abstract limits assessment of whether the claims hold.

major comments (2)
  1. [Abstract] The central claim that the proposed metrics assess query intent match and content coverage, and that the evaluation demonstrates effectiveness, cannot be verified because no metric definitions, equations, or quantitative outcomes (e.g., scores, statistical significance) are provided. This is load-bearing for the framework's contribution.
  2. [Abstract / evaluation section] No details are given on how the four baselines from related fields were adapted to dataset snippets (e.g., what modifications were made for dataset-specific features like metadata or schema). Without this, it is unclear whether the baselines are valid or if domain adjustments were needed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review. The full manuscript provides detailed metric definitions, equations, quantitative results, and baseline adaptation information in the body text (Sections 3-6). We respond to each major comment below.

Point-by-point responses
  1. Referee: [Abstract] The central claim that the proposed metrics assess query intent match and content coverage, and that the evaluation demonstrates effectiveness, cannot be verified because no metric definitions, equations, or quantitative outcomes (e.g., scores, statistical significance) are provided. This is load-bearing for the framework's contribution.

    Authors: The abstract is intentionally concise and summarizes the contribution at a high level. The metrics for assessing query intent match and content coverage are formally defined with equations in Section 3. The empirical evaluation on real-world datasets and queries, including specific quantitative scores and statistical significance, is reported in Section 5, with user study validation in Section 6. These sections allow full verification of the claims.

    Revision: no

  2. Referee: [Abstract / evaluation section] No details are given on how the four baselines from related fields were adapted to dataset snippets (e.g., what modifications were made for dataset-specific features like metadata or schema). Without this, it is unclear whether the baselines are valid or if domain adjustments were needed.

    Authors: Details on adapting the four state-of-the-art methods from related fields (e.g., web search snippet generation and document summarization) to dataset search are provided in Section 4. This includes the specific modifications made to incorporate dataset metadata, schema, and other domain features, along with justification for why the adaptations preserve validity.

    Revision: no

Circularity Check

0 steps flagged

No significant circularity

Full rationale

The paper defines an evaluation framework by introducing new metrics for query-intent match and content coverage, adapts four external baselines from related fields, runs an empirical study on real datasets/queries, and validates via user study. No equations, derivations, fitted parameters presented as predictions, or load-bearing self-citations appear in the provided text. The central claims rest on independent metric definitions and external validation rather than reducing to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical derivations, free parameters, axioms, or invented entities are described in the abstract; the contribution is an applied evaluation framework rather than a formal model.


