LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model

Jiang-Xin Shi; Lan-Zhe Guo; Peng-Xiao Song; Xiao-Wen Yang; Yi-Xuan Jin; Yu-Feng Li; Zhi Zhou

arxiv: 2406.04614 · v1 · pith:55NMDJ4Znew · submitted 2024-06-07 · 💻 cs.CL · cs.AI

Zhi Zhou, Jiang-Xin Shi, Peng-Xiao Song, Xiao-Wen Yang, Yi-Xuan Jin, Lan-Zhe Guo, Yu-Feng Li

keywords legal · lawgpt · models · chinese · model · open-source · tasks · demonstrate

Large language models (LLMs), including both proprietary and open-source models, have showcased remarkable capabilities in addressing a wide range of downstream tasks. Nonetheless, when it comes to practical Chinese legal tasks, these models fail to meet the actual requirements. Proprietary models do not ensure data privacy for sensitive legal cases, while open-source models demonstrate unsatisfactory performance due to their lack of legal knowledge. To address this problem, we introduce LawGPT, the first open-source model specifically designed for Chinese legal applications. LawGPT comprises two key components: legal-oriented pre-training and legal supervised fine-tuning. Specifically, we employ large-scale Chinese legal documents for legal-oriented pre-training to incorporate legal domain knowledge. To further improve the model's performance on downstream legal tasks, we create a knowledge-driven instruction dataset for legal supervised fine-tuning. Our experimental results demonstrate that LawGPT outperforms the open-source LLaMA 7B model. Our code and resources are publicly available at https://github.com/pengxiao-song/LaWGPT and have received 5.7K stars on GitHub.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

From Query to Counsel: Structured Reasoning with a Multi-Agent Framework and Dataset for Legal Consultation
cs.CL 2026-04 unverdicted novelty 7.0

A new dataset and multi-agent framework for legal consultation question answering that outperforms standard LLMs.

TaxPraBen: A Scalable Benchmark for Structured Evaluation of LLMs in Chinese Real-World Tax Practice
cs.CL 2026-04 unverdicted novelty 7.0

TaxPraBen is a new benchmark with 14 datasets and a structured evaluation method for measuring LLM performance on Chinese real-world tax tasks and scenarios.

Can LLMs Time Travel? Enhancing Temporal Consistency in Legal Agentic Search through Reinforcement Learning
cs.CL 2026-05 unverdicted novelty 6.0

LegalSearch-R1 trains a 7B agent via RL on multi-period legal data with hybrid RAG/web search to improve temporal consistency, reporting 12.9-29.8% gains over SOTA and 57.7-80.3% on consistency metrics across 13 tasks.

LLM Evolution as an Industry-Scale Ecosystem: A Lifecycle Perspective on Continual Learning
cs.LG 2026-06 unverdicted novelty 5.0

The paper reformulates industrial continual learning for LLMs as a closed-loop ecosystem problem, identifies three core challenges, and organizes solutions around five lifecycle design principles.