ProGen: Language modeling for protein generation. arXiv preprint arXiv:2004.03497, 2020

8 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

representative citing papers

Rethinking Attention with Performers

cs.LG · 2020-09-30 · unverdicted · novelty 7.0

Performers approximate full-rank softmax attention in Transformers via FAVOR+ random features for linear complexity, with theoretical guarantees of unbiased estimation and competitive results on pixel, text, and protein tasks.

Scaling Data-Constrained Language Models

cs.CL · 2023-05-25 · conditional · novelty 6.0

Repeating training data up to 4 epochs yields negligible loss increase versus unique data for fixed compute, and a new scaling law accounts for the decaying value of repeated tokens and excess parameters.

Fundamental Trade-Offs in Multi-Bit Watermarking of Stochastic Processes

cs.IT · 2026-05-09 · unverdicted · novelty 5.0

Derives matched converse and achievability bounds that characterize optimal trade-offs among false-alarm probability, detection error probability, distortion, and information rate for multi-bit watermarking of stationary ergodic stochastic processes.

Co-Generative De Novo Functional Protein Design

q-bio.QM · 2026-05-01 · unverdicted · novelty 5.0

CodeFP jointly generates protein sequences and structures using functional local structures and auxiliary supervision, yielding 6.1% better functional consistency and 3.2% better foldability than prior baselines.

citing papers explorer

Showing 8 of 8 citing papers.

  • Rethinking Attention with Performers · cs.LG · 2020-09-30 · unverdicted · none · ref 139

    Performers approximate full-rank softmax attention in Transformers via FAVOR+ random features for linear complexity, with theoretical guarantees of unbiased estimation and competitive results on pixel, text, and protein tasks.

  • Demystifying Multimodal Biomolecular Co-design With Intrinsic Geodesic Coupling · q-bio.BM · 2026-06-01 · unverdicted · none · ref 156

    GeoCoupling optimizes temporal couplings between modalities in biomolecular generative models and outperforms synchronous baselines on drug design and protein design tasks.

  • AMix-2: Establishing Protein as a Native Modality in Large Language Models · q-bio.BM · 2026-05-29 · unverdicted · none · ref 55

    AMix-2 unifies protein sequences and text in one LLM via shared tokens and block-wise diffusion modeling, introduces the ProteinArena benchmark, and reports competitive performance against task-specific protein models and frontier LLMs.

  • ProteinOPD: Towards Effective and Efficient Preference Alignment for Protein Design · cs.LG · 2026-05-11 · unverdicted · none · ref 24

    ProteinOPD uses token-level on-policy distillation from multiple preference-specific teacher models into a shared student to balance competing objectives in protein design, delivering gains on targets without losing designability and an 8x speedup over RL baselines.

  • From Words to Amino Acids: Does the Curse of Depth Persist? · cs.LG · 2026-02-25 · unverdicted · none · ref 23

    Protein language models exhibit consistent depth inefficiency where most task-relevant computation occurs in a subset of layers, mirroring patterns in large language models.

  • Scaling Data-Constrained Language Models · cs.CL · 2023-05-25 · conditional · none · ref 71

    Repeating training data up to 4 epochs yields negligible loss increase versus unique data for fixed compute, and a new scaling law accounts for the decaying value of repeated tokens and excess parameters.

  • Fundamental Trade-Offs in Multi-Bit Watermarking of Stochastic Processes · cs.IT · 2026-05-09 · unverdicted · none · ref 4

    Derives matched converse and achievability bounds that characterize optimal trade-offs among false-alarm probability, detection error probability, distortion, and information rate for multi-bit watermarking of stationary ergodic stochastic processes.

  • Co-Generative De Novo Functional Protein Design · q-bio.QM · 2026-05-01 · unverdicted · none · ref 2

    CodeFP jointly generates protein sequences and structures using functional local structures and auxiliary supervision, yielding 6.1% better functional consistency and 3.2% better foldability than prior baselines.