arxiv: 2207.10080 · v3 · pith:ZWQBEXIPnew · submitted 2022-05-27 · 🧬 q-bio.QM · cs.AI · cs.CL · cs.IR · cs.LG

Multi-modal Protein Knowledge Graph Construction and Applications

classification: q-bio.QM, cs.AI, cs.CL, cs.IR, cs.LG
keywords: protein, knowledge, proteinkg65, graph, applications, gene, ontology, science
Existing data-centric methods for protein science generally cannot sufficiently capture and leverage biological knowledge, which may be crucial for many protein tasks. To facilitate research in this field, we create ProteinKG65, a knowledge graph for protein science. Using the Gene Ontology and the UniProt knowledge base as a basis, we transform and integrate various kinds of knowledge, aligning descriptions with GO terms and protein sequences with protein entities. ProteinKG65 is mainly dedicated to providing a specialized protein knowledge graph that brings the knowledge of the Gene Ontology to protein function and structure prediction. We also illustrate the potential applications of ProteinKG65 with a prototype. Our dataset can be downloaded at https://w3id.org/proteinkg65.
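
The alignment described in the abstract (descriptions attached to GO terms, sequences attached to protein entities, and relations stored as triples) can be pictured with a small sketch. This is not the ProteinKG65 schema or file format: the class `ToyProteinKG`, its methods, the identifiers, the sequence, the descriptions, and the relation names are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """One (head, relation, tail) edge of the toy graph."""
    head: str
    relation: str
    tail: str


@dataclass
class ToyProteinKG:
    """Hypothetical container, NOT the actual ProteinKG65 data model."""
    sequences: dict[str, str] = field(default_factory=dict)     # protein id -> sequence
    descriptions: dict[str, str] = field(default_factory=dict)  # GO term id -> text
    triples: set[Triple] = field(default_factory=set)

    def add_protein(self, protein_id: str, sequence: str) -> None:
        # A protein entity is aligned with its amino-acid sequence.
        self.sequences[protein_id] = sequence

    def add_go_term(self, go_id: str, description: str) -> None:
        # A GO term is aligned with its textual description.
        self.descriptions[go_id] = description

    def add_triple(self, head: str, relation: str, tail: str) -> None:
        # Only allow edges between entities that were registered first.
        known = self.sequences.keys() | self.descriptions.keys()
        for entity in (head, tail):
            if entity not in known:
                raise KeyError(f"unknown entity: {entity}")
        self.triples.add(Triple(head, relation, tail))

    def go_terms_of(self, protein_id: str) -> list[str]:
        # GO terms directly linked from the given protein, sorted by id.
        return sorted(
            t.tail
            for t in self.triples
            if t.head == protein_id and t.tail in self.descriptions
        )


# Placeholder data: none of these ids, sequences, or relations come from the dataset.
kg = ToyProteinKG()
kg.add_protein("PROT_A", "MKTAYIAK")
kg.add_go_term("GO:TOY_1", "placeholder description of a specific function")
kg.add_go_term("GO:TOY_2", "placeholder description of a broader function")
kg.add_triple("PROT_A", "enables", "GO:TOY_1")   # protein-to-GO edge
kg.add_triple("GO:TOY_1", "is_a", "GO:TOY_2")    # GO-to-GO edge
print(kg.go_terms_of("PROT_A"))
```

Keeping sequences and descriptions in side tables keyed by entity id means a model can look up the modality it needs (sequence for proteins, text for GO terms) while the triples carry only the graph structure.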

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

    cs.AI 2026-06 unverdicted novelty 5.0

    AgentPLM equips pre-trained protein language models with reasoning-augmented decoding using tools such as ESMFold and FoldX plus contrastive agent policy optimization, reporting state-of-the-art gains on de novo enzym...

  2. OptimusKG: Unifying biomedical knowledge in a modern multimodal graph

    cs.AI 2026-04 accept novelty 5.0

    OptimusKG is a labeled property graph unifying biomedical knowledge from structured sources into 190,531 nodes of 10 types and 21.8 million edges of 26 types, with 70% of sampled edges supported by literature evidence...