Recommending Related Tables

Krisztian Balog; Shuo Zhang

arxiv: 1907.03595 · v2 · pith:JQ5MABQZ · submitted 2019-07-08 · 💻 cs.IR

Pith reviewed 2026-05-25 00:57 UTC · model grok-4.3

classification 💻 cs.IR

keywords related table recommendation · table matching · semantic spaces · discriminative learning · Wikipedia tables · information retrieval

The pith

Related tables are recommended by embedding table elements in multiple semantic spaces and learning how to combine the resulting element-level similarities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper addresses the task of recommending related tables to a given input table. The approach represents table elements in multiple semantic spaces and uses a discriminative learning model to combine element-level similarities for computing table similarity. It is evaluated on a test collection of Wikipedia tables where it achieves state-of-the-art performance. If correct, this enables applications like providing web-based related content recommendations within spreadsheet programs.
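As a rough illustration of what "representing table elements in a semantic space" can look like, the sketch below averages term vectors into a centroid and compares two elements by cosine similarity. This is a minimal sketch under stated assumptions, not the paper's implementation: the toy embedding values, the centroid aggregation, and the names `TOY_EMBEDDINGS`, `centroid`, and `element_similarity` are all hypothetical.

```python
import math

# Toy 3-dimensional "embeddings"; the values are made up for illustration only.
TOY_EMBEDDINGS = {
    "city": [0.9, 0.1, 0.0],
    "country": [0.8, 0.2, 0.1],
    "population": [0.1, 0.9, 0.2],
    "goals": [0.0, 0.1, 0.9],
}


def centroid(terms, embeddings):
    """Average the vectors of all known terms; return None if no term is known."""
    vectors = [embeddings[t] for t in terms if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def element_similarity(terms_a, terms_b, embeddings):
    """Similarity of two table elements (e.g. heading term lists) in one space."""
    ca = centroid(terms_a, embeddings)
    cb = centroid(terms_b, embeddings)
    if ca is None or cb is None:
        return 0.0
    return cosine(ca, cb)
```

Repeating this for each element type (for example headings, entities, captions) and for each semantic space would yield one similarity score per combination; those scores are the kind of input a learned combiner could consume.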

Core claim

The paper establishes a theoretically sound framework for table matching based on multi-space element representations combined via discriminative learning, which outperforms prior methods on Wikipedia table data.

What carries the argument

Representation of table elements in multiple semantic spaces combined using a discriminative learning model to compute table similarity.
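To make the "discriminative combination" step concrete, here is a minimal sketch that learns weights over per-element similarity scores and ranks candidate tables by the learned score. It is an assumption-laden stand-in: this page does not say which learner the paper uses, so a plain logistic regression trained by stochastic gradient descent is swapped in, and the feature layout, training pairs, and table names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, weights, bias):
    """Predicted relevance of one (input table, candidate table) pair."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights with per-example gradient updates."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = score(x, weights, bias) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Hypothetical training pairs. Each feature vector holds similarity scores,
# here assumed to be [heading similarity, entity similarity, caption similarity].
TRAIN_X = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.2, 0.1, 0.3], [0.1, 0.3, 0.2]]
TRAIN_Y = [1, 1, 0, 0]  # 1 = related, 0 = not related

weights, bias = train_logistic(TRAIN_X, TRAIN_Y)

# Rank two hypothetical candidate tables for one input table.
candidates = {"table_a": [0.85, 0.7, 0.8], "table_b": [0.15, 0.2, 0.1]}
ranking = sorted(
    candidates, key=lambda t: score(candidates[t], weights, bias), reverse=True
)
```

The design point this illustrates is that the similarity measures are not hand-weighted; the relative importance of each element and space is learned from judged table pairs.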

If this is right

Proactive recommendations of related structured content can be provided to spreadsheet users.
Table similarity computation becomes more accurate by leveraging multiple semantic views.
Ranked lists of relevant tables can be generated effectively from large collections like Wikipedia.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This approach might generalize to matching other structured data formats beyond tables.
Deployment in enterprise environments would require validating the method on non-Wikipedia data.
Future work could explore additional semantic spaces or different learning models for combination.

Load-bearing premise

The purpose-built test collection from Wikipedia tables is representative of real-world table recommendation scenarios.

What would settle it

Demonstrating that the method does not outperform baselines on a collection of enterprise spreadsheets would falsify the claim of state-of-the-art performance in practical settings.

Figures

Figures reproduced from arXiv: 1907.03595 by Krisztian Balog, Shuo Zhang.

**Figure 2.** Representation of a table element Tx in the term and in a given semantic space y.

3.1 Element-Level Table Matching Framework: We combine multiple table quality indicators and table similarity measures in a discriminative learning framework. Input and candidate table pairs are described as a feature vector, shown in Eq. (1). The main novelty lies in how table similarity is estimated. Instead of relying on ha…

**Figure 3.** Illustration of element-level similarity methods.

**Figure 4.** Performance difference between InfoGather (base…

**Figure 5.** Performance in terms of NDCG with different…

**Figure 8.** Performance of CRAB-2 with respect to (relative)…

**Figure 7.** Performance analysis using only a portion of the…

Original abstract

Tables are an extremely powerful visual and interactive tool for structuring and manipulating data, making spreadsheet programs one of the most popular computer applications. In this paper we introduce and address the task of recommending related tables: given an input table, identifying and returning a ranked list of relevant tables. One of the many possible application scenarios for this task is to provide users of a spreadsheet program proactively with recommendations for related structured content on the Web. At its core, the related table recommendation task boils down to computing the similarity between a pair of tables. We develop a theoretically sound framework for performing table matching. Our approach hinges on the idea of representing table elements in multiple semantic spaces, and then combining element-level similarities using a discriminative learning model. Using a purpose-built test collection from Wikipedia tables, we demonstrate that the proposed approach delivers state-of-the-art performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper defines a new related-table recommendation task and shows a multi-space representation plus discriminative combiner beats baselines on their Wikipedia collection, but external validity to real spreadsheets remains untested.

The main takeaway is that this work carves out the task of recommending related tables to a given input table and offers a method that puts table elements into multiple semantic spaces before feeding element-wise similarities into a discriminative learning model. On a purpose-built Wikipedia collection it reports state-of-the-art numbers for the new task.

That framing and the multi-space idea are the clearest contributions; the scenario of proactive recommendations inside a spreadsheet program is a reasonable motivation, and the framework is presented as theoretically grounded. The approach itself looks like a sensible engineering combination rather than a radical departure, but it is executed cleanly enough to be usable by others working on table matching.

The soft spot is the evaluation. Everything rests on one Wikipedia-derived test collection whose construction details, relevance criteria, and schema variability are not cross-checked against enterprise spreadsheets or noisier user-generated tables. Without that, the SOTA claim stays tied to the particular data distribution and does not yet speak to the broader applications mentioned.

The paper is aimed at information-retrieval researchers who deal with structured web content or table similarity. Someone in that niche would find the task definition and the method worth reading and extending. It is coherent on its own terms and shows honest engagement with the problem, so it deserves a serious referee even if the experiments will need extra scrutiny on generalizability.

Referee Report

1 major / 1 minor

Summary. The paper introduces the task of recommending related tables to a given input table, motivated by applications such as proactive recommendations in spreadsheet programs. It develops a framework for table matching that represents table elements in multiple semantic spaces and combines element-level similarities via a discriminative learning model. Using a purpose-built test collection derived from Wikipedia tables, the approach is shown to achieve state-of-the-art performance.

Significance. If the results hold, the multi-semantic-space representation offers a principled way to capture different facets of table similarity, which could benefit structured data recommendation systems. The construction of a purpose-built test collection from Wikipedia tables is a positive contribution that enables future work on this task.

major comments (1)

[Experiments] Experiments section: The SOTA claim and applicability to the scenarios in the introduction (proactive spreadsheet recommendations, web structured content) rest on results from the purpose-built Wikipedia test collection, but no details are given on collection construction, relevance judgment protocol, inter-annotator agreement, or any cross-domain validation. This is load-bearing because the collection's element distributions, schema variability, and relevance criteria may not match enterprise spreadsheets or user-generated content.
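For readers unfamiliar with how inter-annotator agreement is typically quantified, the sketch below computes Fleiss' kappa, one common measure for multiple raters assigning categorical labels. This is only an illustration: this review does not state which agreement measure, if any, the paper reports, and the function name and input layout are assumptions made for the example.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a matrix of rating counts.

    ratings[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Share of all assignments that went to each category.
    p_j = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)
```

With three raters and two relevance levels, `fleiss_kappa([[3, 0], [0, 3]])` corresponds to perfect agreement, while split votes on every item push the value toward or below zero.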

minor comments (1)

The abstract would benefit from specifying the evaluation metrics (e.g., MAP or NDCG) used to establish state-of-the-art performance.
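For reference, NDCG at a rank cutoff can be computed as in the sketch below. It uses the linear-gain formulation with a log2 rank discount and builds the ideal ranking from the same list of graded judgments; the paper's exact variant and cutoffs are not given on this page, so treat these choices and the function names as assumptions.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k graded relevance values."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG normalised by the DCG of the ideal (descending) ordering.

    `gains` lists the relevance grades of the ranked tables, best rank first,
    and is assumed to contain every judged table for the query.
    """
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    if ideal == 0.0:
        return 0.0
    return dcg_at_k(gains, k) / ideal
```

A ranking that already lists tables in descending order of relevance scores 1.0; placing the most relevant table last among three lowers the score noticeably.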

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. The primary concern raised is the level of detail provided on the test collection and its implications for the SOTA claims and applicability. We address this point below and will revise the manuscript accordingly.

Referee: Experiments section: The SOTA claim and applicability to the scenarios in the introduction (proactive spreadsheet recommendations, web structured content) rest on results from the purpose-built Wikipedia test collection, but no details are given on collection construction, relevance judgment protocol, inter-annotator agreement, or any cross-domain validation. This is load-bearing because the collection's element distributions, schema variability, and relevance criteria may not match enterprise spreadsheets or user-generated content.

Authors: We agree that expanded details on the test collection are warranted to support the claims. Section 4 describes the Wikipedia table sampling process and pairing strategy, but the relevance judgment protocol, inter-annotator agreement statistics, and explicit discussion of schema variability were not elaborated sufficiently. In the revised version we will add a dedicated subsection detailing the judgment guidelines, report agreement measures, and include a limitations paragraph addressing differences from enterprise spreadsheets and user-generated content. We maintain that the collection serves as a valid proxy for web structured data (consistent with prior table corpora), but acknowledge the absence of cross-domain experiments and will frame the results accordingly without overgeneralizing applicability.

Revision: yes

Circularity Check

0 steps flagged

No circularity; derivation is self-contained empirical framework

The paper introduces a table matching framework based on multi-semantic-space element representations combined via a discriminative learning model, then reports empirical SOTA results on a purpose-built Wikipedia test collection. No equations, parameter-fitting procedures, or self-citations are visible that would reduce the claimed similarity computation or performance result to the inputs by construction. The central claim rests on standard representation and supervised combination techniques evaluated externally on held-out data rather than any self-definitional, fitted-input-renamed-as-prediction, or self-citation-load-bearing step.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is provided; no free parameters, axioms, or invented entities can be identified from the given text.

