Attention is all you need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, Illia Polosukhin · 2017

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method: 2

citation-polarity summary

use method: 2

representative citing papers

Can LLMs Predict Polymer Physics Just by Reading Synthesis and Processing Prose?

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

PolyLM fine-tunes a 9B-parameter LLM on 185k papers to predict polymer properties from text alone, achieving median R² of 0.74 on 68k held-out samples.

Patch-Effect Graph Kernels for LLM Interpretability

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

Patch-effect graphs built from causal mediation, partial correlation, and co-influence, when analyzed with graph kernels, preserve task-discriminative signals from activation patching that outperform global shape descriptors and raw baselines on GPT-2 Small.

From Single-Step Edit Response to Multi-Step Molecular Optimization

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

A new method decomposes property differences between weakly related molecules into minimal chemical edits to train a directional evaluator that guides multi-step optimization with less oracle querying.

Comparative Analysis of Large Language Models in Healthcare

cs.CL · 2026-04-11 · unverdicted · novelty 3.0

Domain-specific models like ChatDoctor excel at medically accurate and contextually reliable text while general-purpose models like Grok and LLaMA perform better on structured medical question-answering tasks.

citing papers explorer

Can LLMs Predict Polymer Physics Just by Reading Synthesis and Processing Prose? cs.LG · 2026-05-07 · unverdicted · none · ref 23
Patch-Effect Graph Kernels for LLM Interpretability cs.AI · 2026-05-07 · unverdicted · none · ref 15
From Single-Step Edit Response to Multi-Step Molecular Optimization cs.AI · 2026-05-11 · unverdicted · none · ref 42
Comparative Analysis of Large Language Models in Healthcare cs.CL · 2026-04-11 · unverdicted · none · ref 42
