super hub Mixed citations

Advances in Neural Information Processing Systems

Mixed citation behavior. Most common role is background (67%).

130 Pith papers citing it
Background 67% of classified citations

citation-role summary

background: 10 · method: 1 · other: 1

claims ledger

  • background phenomenon might further improve the performance of the co-scientist as a tool for scientific discovery. Improved multimodal reasoning and tool-use capabilities. Some of the most interesting data in scientific publications is not written in text but may be encoded visually in figures and charts. However, even state-of-the-art frontier models may not comprehensively utilize such data with optimal reasoning [89] and the AI co-scientist system is unlikely to be an exception. Stronger benchmarks and
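  • other Bernoulli random variables with success probability $\alpha(q)$. According to Hoeffding's inequality (Hoeffding, 1963), we have $P(|m - \alpha(q)N| \ge t) \le 2\exp(-2t^2/N)$ (8), where $t \ge 0$. It implies $P(|m - \alpha(q)N| \le t) \ge 1 - 2\exp(-2t^2/N) \Leftrightarrow P(-t \le m - \alpha(q)N \le t) \ge 1 - 2\exp(-2t^2/N) \Rightarrow P(-t \le m - \alpha(q)N) \ge 1 - 2\exp(-2t^2/N)$. By setting $t = \sqrt{\tfrac{N}{2}\log\tfrac{2}{\epsilon}}$ for any $\epsilon > 0$, one may check $P\big(m \ge \alpha(q)N - \sqrt{\tfrac{N}{2}\log\tfrac{1}{\epsilon}}\big) \ge 1 - \epsilon$ (9). In order to ensure $m \ge N/2$, we need $\alpha(q)N - \sqrt{\tfrac{N}{2}\log\tfrac{1}{\epsilon}} \ge \tfrac{1}{2}N \Rightarrow \alpha(q) \ge \tfrac{1}{2} + \sqrt{\tfrac{1}{2N}\log\tfrac{1}{\epsilon}}$ (10). B Discussion T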
  • background performance would not be significantly degraded if $\mathrm{proj}_B x$ were replaced with a nonzero constant value more representative of the training distribution. To correct for this, in our causality experiments we ablate directions by replacing them with their mean values computed across a dataset, instead of zeroing them out. Specifically, to ablate a direction $u$, we use the formula: $x' = x + P_u(\bar{x} - x)$ (21), where $P_u$ is the projection matrix for $u$ and $\bar{x}$ is the mean representation. E. Static interpretability analysi
  • background an All-Reduce operation so that they can use the identical gradient to update the model parameters. The All-Reduce operation accumulates distributed gradients (say $X_i$ at the $i$-th worker) from all workers (say $P$ workers) using a reduction operation (typically sum or mean in training), which can be formally represented as $X = \mathrm{AllReduce}(X_1, X_2, \ldots, X_P) = \sum_{i=1}^{P} X_i$ (5). The gradients have the same dimensionality as the model weights, which means additional memory is required to store them for communication an
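
The threshold in Eq. (10) of the second excerpt above can be evaluated numerically. The following is a minimal sketch, not code from the cited paper: the function name `min_success_prob` is hypothetical, and it assumes the logarithm in Eq. (10) is the natural logarithm.

```python
import math


def min_success_prob(n: int, eps: float) -> float:
    """Right-hand side of Eq. (10): 1/2 + sqrt(log(1/eps) / (2 * n)).

    n   -- number of Bernoulli trials (N in the excerpt)
    eps -- allowed failure probability (epsilon in the excerpt)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    # Natural log is an assumption; the excerpt does not state the base.
    return 0.5 + math.sqrt(math.log(1.0 / eps) / (2.0 * n))


if __name__ == "__main__":
    # Print the required success probability for a few trial counts.
    for n in (10, 100, 1000):
        print(n, round(min_success_prob(n, 0.05), 4))
```

The returned value shrinks toward 1/2 as `n` grows. If it exceeds 1 for a given `n` and `eps`, no success probability can satisfy the condition at that confidence level.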

representative citing papers

When Does Model Collapse Occur in Structured Interactive Learning?

cs.LG · 2026-05-19 · unverdicted · novelty 7.0

Model collapse occurs in structured interactive learning if and only if the directed interaction graph satisfies a specific topological condition, with finite-sample guarantees for linear regression and asymptotic results for M-estimators.

Pointwise Generalization in Deep Neural Networks

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

Proposes pointwise Riemannian Dimension from feature eigenvalues to derive tighter, representation-aware generalization bounds for deep networks in the nonlinear regime.

Language-Induced Priors for Domain Adaptation

cs.LG · 2026-05-14 · conditional · novelty 7.0

Language-Induced Priors from LLMs guide source selection in cold-start domain adaptation through an EM algorithm, matching oracle MSE under a correct prior and remaining asymptotically consistent.

BOOKMARKS: Efficient Active Storyline Memory for Role-playing

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

BOOKMARKS introduces searchable bookmarks as reusable answers to storyline questions, enabling active initialization and passive synchronization for more consistent role-playing agent memory than recurrent summarization.

Variance-aware Reward Modeling with Anchor Guidance

stat.ML · 2026-05-12 · unverdicted · novelty 7.0

Anchor-guided variance-aware reward modeling uses two response-level anchors to resolve non-identifiability in Gaussian models of pluralistic preferences, yielding provable identification, a joint training objective, and improved RLHF performance.

citing papers explorer

Showing 50 of 130 citing papers.