pith. machine review for the scientific record.

arxiv: 2604.18939 · v1 · submitted 2026-04-21 · 💻 cs.LG

Recognition: unknown

TabEmb: Joint Semantic-Structure Embedding for Table Annotation

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 03:15 UTC · model grok-4.3

classification 💻 cs.LG
keywords table annotation · semantic embeddings · graph-based modeling · large language models · joint representations · column relationships · structural injection

The pith

TabEmb creates joint semantic-structural table embeddings by using an LLM for column semantics and a graph for relationships.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Tables need representations that capture both column meanings and their interconnections for effective annotation. Existing methods linearize tables into text sequences for language models, but this linearization limits the semantic richness that modern LLMs could otherwise provide and weakens structural understanding. TabEmb addresses this by first generating independent semantic embeddings for each column with a large language model, then applying a graph-based module across columns to incorporate the relationships between them. This yields improved representations that outperform baselines on table annotation tasks, showing the value of explicitly separating semantics from structure.

Core claim

The central claim is that joint semantic-structural representations for tables can be obtained by decoupling the process: an LLM produces semantically rich embeddings for each column, after which a graph-based module injects the inter-column relationships. The resulting embeddings are suited to table annotation and surpass those from linearized sequence models.

What carries the argument

The TabEmb two-stage encoder: an LLM produces per-column semantic embeddings, and a graph module then injects inter-column relationships (a minimal sketch follows).
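
To make the decoupling concrete, here is a minimal sketch of what such a two-stage encoder could look like, assuming a frozen Hugging Face encoder for stage one and a PyTorch Geometric GAT for stage two (the rebuttal below and Figure 3 mention a GAT). The column serialization, mean pooling, fully connected column graph, and backbone choice are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a decoupled two-stage table encoder. Assumptions not
# taken from the paper: mean-pooled last hidden state as the column
# embedding, a fully connected column graph, and a single GAT layer.
import torch
from transformers import AutoTokenizer, AutoModel
from torch_geometric.nn import GATConv

@torch.no_grad()
def embed_column(name, values, tokenizer, encoder, max_values=32):
    """Stage 1: encode one column independently with a frozen LLM,
    mean-pooling the last hidden state into a single vector."""
    text = f"{name}: " + ", ".join(map(str, values[:max_values]))
    toks = tokenizer(text, return_tensors="pt", truncation=True)
    hidden = encoder(**toks).last_hidden_state      # (1, seq_len, d)
    return hidden.mean(dim=1).squeeze(0)            # (d,)

class ColumnGraphRefiner(torch.nn.Module):
    """Stage 2: trainable GAT that injects inter-column relationships
    into the frozen per-column embeddings (assumes d divisible by heads)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.gat = GATConv(dim, dim // heads, heads=heads)  # concat -> dim

    def forward(self, col_embs, edge_index):
        # col_embs: (num_columns, d); edge_index: (2, num_edges)
        return self.gat(col_embs, edge_index)

# Usage on one toy table: embed columns, connect every pair, refine.
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
enc = AutoModel.from_pretrained("mistralai/Mistral-7B-v0.1").eval()
cols = [("city", ["Berlin", "Lyon"]), ("population", [3645000, 513275])]
x = torch.stack([embed_column(n, v, tok, enc) for n, v in cols])
pairs = torch.combinations(torch.arange(len(cols)), r=2).t()
edge_index = torch.cat([pairs, pairs.flip(0)], dim=1)    # both directions
refined = ColumnGraphRefiner(x.size(-1))(x, edge_index)  # (2, 4096)
```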

Load-bearing premise

That separate modules, an LLM for semantics and a graph for structure, capture joint information more effectively than an integrated end-to-end linearization approach, without losing key inter-column relationships in the handoff.

What would settle it

If an end-to-end model that linearizes the table and processes it with a sufficiently long-context LLM achieves comparable or better results on the same annotation tasks, the benefit of decoupling would be called into question.
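
For concreteness, that falsifying baseline amounts to flattening the whole table into one sequence and embedding it in a single long-context pass, rather than encoding columns separately. A minimal sketch, with the serialization format and row cap as illustrative assumptions:

```python
# Hypothetical linearization baseline: flatten the 2D table into one text
# sequence, the approach TabEmb argues against. Format choices here
# (pipe-separated rows, 32-row cap) are illustrative, not from the paper.
def linearize_table(columns, max_rows=32):
    """columns: list of (name, values) pairs. Returns one flat string
    suitable for a single long-context encoder pass."""
    header = " | ".join(name for name, _ in columns)
    n_rows = min(max_rows, max(len(vals) for _, vals in columns))
    rows = [
        " | ".join(str(vals[r]) if r < len(vals) else "" for _, vals in columns)
        for r in range(n_rows)
    ]
    return "\n".join([header, *rows])

# Embedding this string with a sufficiently long-context LLM and reusing
# the same annotation head is the settling experiment described above.
print(linearize_table([("city", ["Berlin", "Lyon"]), ("population", [3645000, 513275])]))
```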

Figures

Figures reproduced from arXiv: 2604.18939 by Anandharaju Durai Raju, Ehsan Hoseinzade, Ke Wang.

Figure 1: The impact of joint semantic-structure modeling. t-SNE visualization of column embeddings from three paradigms: (left) PLM (BERT), (middle) LLM (Mistral), and (right) TabEmb (Mistral + trainable GNN) on the SOTABdbp dataset. For readability, only a subset of classes is visualized. While BERT produces weakly separated clusters and the LLM improves semantic coherence, the LLM still exhibits significant overlap…
Figure 2: Overview of TabEmb. A frozen LLM encodes each column into a semantic embedding; a GNN refines…
Figure 3: Class–class attention heatmap on SOTABdbp (CTA), aggregated from GAT attention weights across tables. …Mistral achieves the highest average score (92.1), but the improvement over other LLMs is limited, suggesting that there is not much difference between LLMs. In contrast, BERT performs worst with an overall average of 89.4, nearly 2.7 points lower than Mistral, indicating that older small PLMs like BERT capt…
Original abstract

Table annotation is crucial for making web and enterprise tables usable in downstream NLP applications. Unlike textual data where learning semantically rich token or sentence embeddings often suffice, tables are structured combinations of columns wherein useful representations must jointly capture column's semantics and the inter-column relationships. Existing models learn by linearizing the 2D table into a 1D token sequence and encoding it with pretrained language models (PLMs) such as BERT. However, this leads to limited semantic quality and weaker generalization to unseen or rare values compared to modern LLMs, and degraded structural modeling due to 2D-to-1D flattening and context-length constraints. We propose TabEmb, which directly targets these limitations by decoupling semantic encoding from structural modeling. An LLM first produces semantically rich embeddings for each column, and a graph-based module over columns then injects relationships into the embeddings, yielding joint semantic-tructural representations for table annotation. Experiments show that TabEmb consistently outperforms strong baselines on different table annotation tasks. Source code and datasets are available at https://github.com/hoseinzadeehsan/TabEmb

Editorial analysis

A structured set of objections, weighed in public.

Referee report, simulated authors' rebuttal, a circularity audit, and an axiom & free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes TabEmb, a method for table annotation that decouples semantic encoding from structural modeling: an LLM generates per-column embeddings, and a subsequent graph-based module over columns injects inter-column relationships to produce joint representations. It contrasts this with existing approaches that linearize tables into 1D sequences for PLMs like BERT, which the authors argue suffer from limited semantic quality, poor generalization to rare values, and degraded structural modeling due to flattening and context limits. Experiments are reported to show consistent outperformance over strong baselines on multiple table annotation tasks, with code and datasets released publicly.

Significance. If the central claim holds, TabEmb could advance table representation learning by leveraging LLMs' semantic strengths while explicitly addressing structure via graphs, improving usability of web and enterprise tables in downstream NLP. The public release of source code and datasets is a clear strength that supports reproducibility and community follow-up work.

major comments (3)
  1. [§3] §3 (Method, Semantic Encoding): The core assumption that independent per-column LLM embeddings already encode sufficient semantics for the graph module to inject accurate relationships is load-bearing but under-supported. Because columns are encoded separately, joint signals such as value co-occurrence, type compatibility, or distributional alignment across columns are absent from the inputs; the manuscript provides no analysis or ablation demonstrating that these can be recovered downstream.
  2. [§3.2] §3.2 (Graph-based module): Edge construction, message-passing details, and whether the graph is learned end-to-end (with gradients flowing back to the LLM) are not specified at a level that allows assessment of whether the module can reconstruct missing relational signals. Without such mechanisms, the joint representation may be weaker than the end-to-end attention in linearized PLMs, directly undermining the decoupling claim.
  3. [Experiments] Experiments (results tables): While consistent outperformance is claimed, the absence of error analysis, cases highlighting joint-context requirements, or ablations isolating the graph module's contribution versus LLM strength alone makes it difficult to attribute gains to the proposed architecture rather than other factors such as model scale.
minor comments (2)
  1. [Abstract] Abstract: Typo in 'joint semantic-tructural representations' (should be 'structural').
  2. [§3] Notation: Column embeddings and graph node features use overlapping symbols without explicit disambiguation in early sections.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We have carefully considered each comment and revised the manuscript to address the concerns regarding methodological details, support for the core assumptions, and experimental analyses. Our responses are provided point by point below.

Point-by-point responses
  1. Referee: [§3] §3 (Method, Semantic Encoding): The core assumption that independent per-column LLM embeddings already encode sufficient semantics for the graph module to inject accurate relationships is load-bearing but under-supported. Because columns are encoded separately, joint signals such as value co-occurrence, type compatibility, or distributional alignment across columns are absent from the inputs; the manuscript provides no analysis or ablation demonstrating that these can be recovered downstream.

    Authors: We appreciate this observation on the load-bearing assumption. The design choice of independent per-column encoding leverages the LLM's pretraining on massive corpora to produce high-quality semantic representations for individual columns, including implicit knowledge of common co-occurrences and types from training data. The graph module is then responsible for explicitly injecting relational structure. To strengthen support for recoverability, we have added a new ablation study (now in §4.3) comparing TabEmb against a no-graph baseline that uses only the LLM column embeddings, along with qualitative examples illustrating recovery of joint signals such as type compatibility in the graph-augmented representations. These additions directly address the request for analysis. revision: yes

  2. Referee: [§3.2] §3.2 (Graph-based module): Edge construction, message-passing details, and whether the graph is learned end-to-end (with gradients flowing back to the LLM) are not specified at a level that allows assessment of whether the module can reconstruct missing relational signals. Without such mechanisms, the joint representation may be weaker than the end-to-end attention in linearized PLMs, directly undermining the decoupling claim.

    Authors: We agree that the original description in §3.2 was insufficiently detailed. In the revised manuscript, we have expanded this section to specify: (i) edge construction via a combination of schema-based type matching and value-overlap heuristics between columns; (ii) the use of a graph attention network (GAT) for message passing with multi-head attention; and (iii) end-to-end training where gradients flow through the graph module back to the LLM embeddings (with the LLM kept frozen only during initial embedding generation for efficiency, but fine-tunable). This setup enables the graph to adaptively reconstruct relational signals, and we have added a diagram and pseudocode for clarity. We believe this strengthens rather than undermines the decoupling claim by making the structural component explicit and trainable (a sketch of such edge heuristics follows this list). revision: yes

  3. Referee: [Experiments] Experiments (results tables): While consistent outperformance is claimed, the absence of error analysis, cases highlighting joint-context requirements, or ablations isolating the graph module's contribution versus LLM strength alone makes it difficult to attribute gains to the proposed architecture rather than other factors such as model scale.

    Authors: We acknowledge the value of these additional analyses for attributing performance gains. The revised manuscript now includes: (i) an error analysis section (§4.4) breaking down failure cases by task and comparing error patterns between TabEmb and baselines; (ii) specific case studies highlighting scenarios where joint-context (e.g., column co-occurrence for entity linking) is required and how the graph module improves over pure LLM embeddings; and (iii) an ablation isolating the graph module by reporting results for LLM-only column embeddings versus the full model. Regarding model scale, all comparisons use LLMs of matched parameter counts to the baselines (e.g., Llama-7B variants), and gains hold across different backbone scales, suggesting the architecture contributes beyond scale alone. revision: yes
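
To ground response 2 above, here is a minimal sketch of what the described edge-construction heuristics could look like: connect two columns when their inferred schema types match or when their value sets overlap past a threshold. The crude type-inference rule and the 0.1 threshold are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of edge construction via schema-based type matching plus
# value-overlap heuristics (response 2 above). The type-inference rule and
# the overlap threshold are illustrative assumptions.
from itertools import combinations

def infer_type(values):
    """Crude schema type tag: 'numeric' if most cells parse as floats."""
    def is_num(v):
        try:
            float(v)
            return True
        except (TypeError, ValueError):
            return False
    return "numeric" if sum(map(is_num, values)) > len(values) / 2 else "text"

def build_edges(columns, overlap_threshold=0.1):
    """columns: list of (name, values) pairs. Returns undirected edges as
    (i, j) column-index pairs, ready to convert into a GNN edge_index."""
    types = [infer_type(vals) for _, vals in columns]
    value_sets = [set(map(str, vals)) for _, vals in columns]
    edges = []
    for i, j in combinations(range(len(columns)), 2):
        smaller = min(len(value_sets[i]), len(value_sets[j])) or 1
        overlap = len(value_sets[i] & value_sets[j]) / smaller
        if types[i] == types[j] or overlap >= overlap_threshold:
            edges.append((i, j))
    return edges
```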

Circularity Check

0 steps flagged

No circularity: new architecture with empirical validation

Full rationale

The paper introduces TabEmb as a proposed model architecture that decouples LLM column embeddings from a subsequent graph module for inter-column relationships. No derivation chain, first-principles result, or prediction reduces to its own inputs by construction. There are no fitted parameters renamed as predictions, no self-definitional loops in equations, and no load-bearing self-citations or uniqueness theorems invoked. The central claims rest on experimental outperformance versus baselines rather than any closed mathematical reduction. This is a standard empirical proposal and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical axioms, free parameters, or invented entities are described in the abstract; the contribution is an empirical architecture combining existing LLM and graph components.

pith-pipeline@v0.9.0 · 5491 in / 1032 out tokens · 26347 ms · 2026-05-10T03:15:25.529191+00:00 · methodology

