PyTorch: An imperative style, high-performance deep learning library

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al · 2019

16 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Progress measures for grokking via mechanistic interpretability

cs.LG · 2023-01-12 · accept · novelty 8.0

Grokking arises from gradual amplification of a Fourier-based circuit in the weights followed by removal of memorizing components.

LeapTS: Rethinking Time Series Forecasting as Adaptive Multi-Horizon Scheduling

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

LeapTS reformulates forecasting as adaptive multi-horizon scheduling via hierarchical control and NCDEs, delivering at least 7.4% better performance and 2.6-5.3x faster inference than Transformer baselines while adapting to non-stationary dynamics.

Revisiting Mixture Policies in Entropy-Regularized Actor-Critic

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

A new marginalized reparameterization estimator allows low-variance training of mixture policies in entropy-regularized actor-critic algorithms, matching or exceeding Gaussian policy performance in several continuous control benchmarks.

FieryGS: In-the-Wild Fire Synthesis with Physics-Integrated Gaussian Splatting

cs.GR · 2026-04-30 · unverdicted · novelty 7.0

FieryGS integrates LLM-based material reasoning, volumetric combustion simulation, and a unified renderer with 3D Gaussian Splatting to generate physically plausible and user-controllable fire in in-the-wild scenes.

Modeling the Quantum Photon Statistics in Hybrid Light-Matter Integrated Circuits

quant-ph · 2026-05-22 · unverdicted · novelty 6.0

A new modeling framework represents pulsed polariton waveguide dynamics as a dissipative bosonic quantum circuit to predict antibunching and sub-Poissonian statistics in single and multimode integrated circuit configurations.

MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction

cs.IR · 2025-09-22 · unverdicted · novelty 6.0

MetaEmbed trains fixed learnable Meta Tokens to produce granularity-organized multi-vector embeddings that support test-time scaling in multimodal retrieval.

AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation

cs.CV · 2025-07-17 · unverdicted · novelty 6.0

AnyPos automates task-agnostic action collection and inverse-dynamics modeling with arm/end-effector decoupling plus a direction-aware decoder, delivering 51% higher test accuracy and 30-40% better success rates on bimanual tasks.

A Two-Phase Deep Learning Framework for Adaptive Time-Stepping in High-Speed Flow Modeling

cs.LG · 2025-06-09 · unverdicted · novelty 6.0

ShockCast is a two-phase ML method that predicts adaptive timestep sizes to model high-speed flows with shocks more efficiently than fixed-step approaches.

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction

cs.CR · 2025-04-29 · unverdicted · novelty 6.0

The method prompts LLMs to output both answers and references to the executed instructions, then filters out any answers not linked to the original input instructions, reducing attack success rates to zero in tested scenarios while preserving utility.

Siamese Foundation Models for Crystal Structure Prediction

cond-mat.mtrl-sci · 2025-03-13 · unverdicted · novelty 6.0

DAO pretrains Siamese diffusion-based models on stable/unstable crystal data to achieve 100% experimental match on Cr6Os2 and 2000x speedup over DFT on real superconductors.

Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis

cs.LG · 2025-02-06 · unverdicted · novelty 6.0

An analytical post-training method restructures FFNs into MoE by partitioning neurons based on activation patterns and building a router from statistics, achieving 1.17x speedup with minimal resources.

ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection

cs.LG · 2024-02-27 · unverdicted · novelty 6.0

ConjNorm reframes OOD detection score design as optimizing norm p in an exponential family density model via a Bregman divergence theorem, with a tractable Monte Carlo estimator, claiming SOTA gains on CIFAR-100 and ImageNet-1K.

SGLang: Efficient Execution of Structured Language Model Programs

cs.AI · 2023-12-12 · conditional · novelty 6.0

SGLang is a new system that speeds up structured LLM programs by up to 6.4x using RadixAttention for KV cache reuse and compressed finite state machines for output decoding.

M$^2$FedAQI: Multimodal Federated Learning for Air Quality Prediction on Heterogeneous Edge Devices

cs.LG · 2026-05-10 · unverdicted · novelty 5.0

M²FedAQI is a lightweight multimodal federated framework that fuses visual and tabular data via feature modulation for improved AQI prediction and regression on heterogeneous edge devices.

CraftGraffiti: Exploring Human Identity with Custom Graffiti Art via Facial-Preserving Diffusion Models

cs.CV · 2025-08-28 · unverdicted · novelty 5.0

CraftGraffiti applies LoRA-tuned diffusion transformers followed by identity-augmented self-attention and CLIP-guided pose extension to generate graffiti while preserving facial features.

Zero-Shot Function Encoder-Based Differentiable Predictive Control

eess.SY · 2025-11-07 · unverdicted · novelty 4.0

A differentiable framework integrates function encoder-based neural ODEs with predictive control to enable zero-shot adaptation of explicit policies across families of nonlinear systems.

citing papers explorer

Showing 16 of 16 citing papers.

Progress measures for grokking via mechanistic interpretability cs.LG · 2023-01-12 · accept · none · ref 44
LeapTS: Rethinking Time Series Forecasting as Adaptive Multi-Horizon Scheduling cs.LG · 2026-05-11 · unverdicted · none · ref 119
Revisiting Mixture Policies in Entropy-Regularized Actor-Critic cs.LG · 2026-05-09 · unverdicted · none · ref 38
FieryGS: In-the-Wild Fire Synthesis with Physics-Integrated Gaussian Splatting cs.GR · 2026-04-30 · unverdicted · none · ref 125
Modeling the Quantum Photon Statistics in Hybrid Light-Matter Integrated Circuits quant-ph · 2026-05-22 · unverdicted · none · ref 35
MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction cs.IR · 2025-09-22 · unverdicted · none · ref 48
AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation cs.CV · 2025-07-17 · unverdicted · none · ref 32
A Two-Phase Deep Learning Framework for Adaptive Time-Stepping in High-Speed Flow Modeling cs.LG · 2025-06-09 · unverdicted · none · ref 111
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction cs.CR · 2025-04-29 · unverdicted · none · ref 29
Siamese Foundation Models for Crystal Structure Prediction cond-mat.mtrl-sci · 2025-03-13 · unverdicted · none · ref 62
Analytical FFN-to-MoE Restructuring via Activation Pattern Analysis cs.LG · 2025-02-06 · unverdicted · none · ref 28
ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection cs.LG · 2024-02-27 · unverdicted · none · ref 55
SGLang: Efficient Execution of Structured Language Model Programs cs.AI · 2023-12-12 · conditional · none · ref 37
M$^2$FedAQI: Multimodal Federated Learning for Air Quality Prediction on Heterogeneous Edge Devices cs.LG · 2026-05-10 · unverdicted · none · ref 29
CraftGraffiti: Exploring Human Identity with Custom Graffiti Art via Facial-Preserving Diffusion Models cs.CV · 2025-08-28 · unverdicted · none · ref 38
Zero-Shot Function Encoder-Based Differentiable Predictive Control eess.SY · 2025-11-07 · unverdicted · none · ref 46
