An Empirical Study on Information Extraction using Large Language Models

Benyou Wang; Chaohao Yang; Lu Liu; Prayag Tiwari; Ridong Han; Tao Peng; Xiang Wan

arxiv: 2409.00369 · v3 · pith:62INGKW6new · submitted 2024-08-31 · 💻 cs.CL

Ridong Han, Chaohao Yang, Tao Peng, Prayag Tiwari, Xiang Wan, Lu Liu, Benyou Wang

classification 💻 cs.CL

keywords information, extraction, llms, ability, gpt-4, language, methods, human-like

Human-like large language models (LLMs), especially the most powerful and popular ones in OpenAI's GPT family, have proven to be very helpful for many natural language processing (NLP) related tasks. Therefore, various attempts have been made to apply LLMs to information extraction (IE), which is a fundamental NLP task that involves extracting information from unstructured plain text. To demonstrate the latest representative progress in LLMs' information extraction ability, we assess the information extraction ability of GPT-4 (the latest version of GPT at the time of writing this paper) from four perspectives: Performance, Evaluation Criteria, Robustness, and Error Types. Our results suggest a visible performance gap between GPT-4 and state-of-the-art (SOTA) IE methods. To alleviate this problem, considering the LLMs' human-like characteristics, we propose and analyze the effects of a series of simple prompt-based methods, which can be generalized to other LLMs and NLP tasks. Rich experiments show our methods' effectiveness and some of their remaining issues in improving GPT-4's information extraction ability.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SMADE-IE: Sparse Multi-Agent Framework with Evidence-Driven Debate for Zero-Shot Information Extraction
cs.CL · 2026-06 · unverdicted · novelty 5.0

SMADE-IE introduces an adaptive mode selector and Toulmin-style evidence-driven debate to outperform prior zero-shot IE methods on NER, RE, and JERE tasks while reducing token use.