arXiv preprint arXiv:1905.12265
21 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- ConTact: Contact-First Antibody CDR Design via Explicit Interface Reasoning
ConTact decomposes CDR design into surface fingerprint learning, contact prediction, and contact-gated sequence generation using distance-biased attention and weighted loss, reporting 7% RMSD and 10% F1 gains on CHIMERA-Bench.
- scShapeBench: Discovering geometry from high dimensional scRNAseq data
scShapeBench supplies synthetic and real annotated single-cell datasets across four shape categories, with scReebTower outperforming PAGA and Mapper on topology-aware metrics.
- Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning
SoftBlobGIN combines ESM-2 representations with protein contact graphs via a lightweight GNN and differentiable substructure pooling to achieve 92.8% accuracy on enzyme classification, raise binding-site AUROC to 0.983, and generate auditable structural explanations without retraining the language model
- AgForce Enables Antigen-conditioned Generative Antibody Design
AgForce improves antigen-conditioned antibody design by using framework dropout, gated bottlenecks, hyperbolic cross attention, MDN sequence head with Potts-like coupling, annealed MCL, and antigen cycle consistency to achieve 8% better amino acid recovery and superior binding metrics on CHIMERA-BEN
- EvoStruct: Bridging Evolutionary and Structural Priors for Antibody CDR Design via Protein Language Model Adaptation
EvoStruct integrates evolutionary priors from a protein language model with structural priors from an E(3)-equivariant GNN to raise amino acid recovery by 16% and diversity by 2.3x on CHIMERA-Bench while cutting perplexity 43%.
- Multi-level Self-supervised Pretraining on Compositional Hierarchical Graph for Molecular Property Prediction
MolCHG uses a multi-level compositional hierarchical graph with atom-bond cross-view contrastive learning, functional group prediction, and structure tasks to achieve top results on seven of nine MoleculeNet benchmarks.
- Rethinking Molecular OOD Generalization via Target-Aware Source Selection
SCOPE-BENCH shows state-of-the-art molecular models suffer up to 8x higher errors under extreme OOD, while POMA reduces mean absolute error by up to 11.2% via target-aware source selection and dual-scale adaptation.
- A Graph Foundation Model for Wireless Resource Allocation
A pre-trained interference-aware graph Transformer model for wireless resource allocation that achieves strong few-shot adaptation to new tasks and scenarios.
- BiScale-GTR: Fragment-Aware Graph Transformers for Multi-Scale Molecular Representation Learning
BiScale-GTR achieves claimed state-of-the-art results on MoleculeNet, PharmaBench and LRGB by combining improved fragment tokenization with a parallel GNN-Transformer architecture that operates at both atom and fragment scales.
- MolDA: Molecular Understanding and Generation via Large Language Diffusion Model
MolDA is a multimodal molecular model that uses a discrete large language diffusion backbone plus a hybrid graph encoder to achieve better global coherence and validity than autoregressive approaches.
- Energy-Guided Generative Modeling for Low-Energy Molecular Structure Discovery
EnFlow integrates flow-based conformer generation with energy landscape modeling to enable joint ensemble generation and ground-state identification using only 1-2 ODE steps.
- SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning
SSL4RL reformulates self-supervised learning objectives into dense, verifiable reward signals for RL-based fine-tuning of vision-language models, yielding performance gains on reasoning benchmarks.
- GraphPINE: Graph Importance Propagation for Interpretable Drug Response Prediction
GraphPINE is a GNN architecture that initializes node importance from prior knowledge graphs and propagates updates via an importance propagation layer for interpretable drug response prediction on over 5,000 genes and 952 drugs.
- Pretraining a Foundation Model for Small-Molecule Natural Products
NaFM is a pretrained foundation model for natural products using scaffold-focused contrastive learning and masked graph objectives that achieves SOTA on taxonomy classification, gene/microbial analysis, and virtual screening tasks.
- FARM: Enhancing Molecular Representations with Functional Group Awareness
FARM adds atomic-level functional group annotations to create FG-enhanced SMILES and FG graphs, trains them with masked language modeling and GNNs plus contrastive alignment, and reports state-of-the-art results on 8 of 13 MoleculeNet tasks.
- Mesh Based Simulations with Spatial and Temporal awareness
A unified training framework for mesh-based ML surrogates in CFD improves accuracy and long-horizon stability by enforcing spatial derivative consistency via multi-node prediction, using temporal cross-attention correction, and adding 3D rotary positional embeddings.
- Synergistic Benefits of Joint Molecule Generation and Property Prediction
Hyformer jointly models molecule generation and property prediction via alternating attention and joint pre-training, showing synergistic gains in conditional sampling, OOD prediction, and a drug design case for antimicrobial peptides.
- Disentangled Generative Graph Representation Learning
DiGGR introduces a self-supervised graph representation learning framework that disentangles latent factors to guide mask modeling and improve representation quality on graph tasks.
- On Improving Graph Neural Networks for QSAR by Pre-training on Extended-Connectivity Fingerprints
Pre-training GNNs on ECFP prediction produces statistically significant QSAR gains on five of six Biogen benchmarks with OOD splits, but underperforms on heterogeneous datasets and complex endpoints like binding affinity.
- Financial Anomaly Detection for the Canadian Market
Neural network and TDA methods outperform PCA at detecting financial anomalies in the Canadian TSX-60 market.
- Property Enhanced Instruction Tuning for Multi-task Molecule Generation with Large Language Models