Paulheim, Knowledge graph refinement: A survey of approaches and evaluation methods, Semantic Web 8 (3) (2017) 489–508
4 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background 1
Citation polarities: unclear 1
citing papers explorer
- DeepRefine: Agent-Compiled Knowledge Refinement via Reinforcement Learning
  DeepRefine refines agent-compiled knowledge bases via multi-turn abductive diagnosis and RL training with a GBD reward, yielding consistent downstream task gains.
- Evaluation of Pipelines for Data Integration into Knowledge Graphs
  KGI-Bench evaluates data integration pipelines into knowledge graphs using coverage, correctness, and consistency metrics on movie domain datasets with 12 pipelines tested.
- EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge
  EMERGE is a benchmark dataset of 233K Wikipedia passages paired with 1.45 million Wikidata edit operations across seven yearly snapshots from 2019 to 2025 for evaluating knowledge graph updates from emerging text.
- Bridging Dual Knowledge Graphs for Multi-Hop Question Answering in Construction Safety
  BifrostRAG combines dual knowledge graphs with hybrid retrieval to improve multi-hop question answering on construction safety regulations, reporting 87.3% F1 on a custom dataset.