R-DMesh generates high-fidelity 4D meshes aligned to video by disentangling base mesh, motion, and a learned rectification jump offset inside a VAE, then using Triflow Attention and rectified-flow diffusion.
hub
Proceedings of the AAAI conference on artificial intelligence , volume=
11 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
A hypernetwork maps style motion embeddings to LoRA updates that stylize text-driven motion diffusion models with improved generalization to unseen styles via contrastive structuring of the style space.
A graph-based neural operator trained on expert-validated race-car CFD data reaches accuracy levels usable for early-stage interactive aerodynamic design exploration.
HyperBones combines bone-driven neural dynamics with hypernetwork conditioning and a CNN wrinkle layer to deliver physically plausible real-time garment simulation across body shapes at 300+ FPS.
SwAIther-Precip uses lead-time-conditioned U-Net bias correction followed by diffusion-based super-resolution to downscale AIFS forecasts, achieving 48% CRPS reduction and ~4 km effective resolution up to 5 days lead time.
Observational and counterfactual distributions are linked by identical support and invariant features, enabling a flow-matching estimator with semiparametric efficiency correction to generate debiased counterfactuals from observations.
PixArt-α matches commercial text-to-image quality with a diffusion transformer trained in 675 A100 GPU days through decomposed training stages, cross-attention text injection, and vision-language model dense captions.
In a two-agent Almgren-Chriss liquidation game, deep RL agents given intra-episode history of prices and own actions achieve supra-competitive outcomes more frequently and persistently than agents without such memory.
DyGRO-VLA is a two-stage optimization framework for cross-task scaling of Vision-Language-Action models via dynamic grouped residual optimization in RL.
InfoGeo reformulates cross-view geo-localization as an information bottleneck that aligns object-centric structural relations while suppressing view-specific noise, outperforming prior methods on benchmarks.
SuperIgor uses iterative co-training of a language model planner and a goal-conditional RL agent to self-generate and refine plans, resulting in stricter instruction adherence and better generalization to unseen instructions.
citing papers explorer
-
R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow
R-DMesh generates high-fidelity 4D meshes aligned to video by disentangling base mesh, motion, and a learned rectification jump offset inside a VAE, then using Triflow Attention and rectified-flow diffusion.
-
Stylized Text-to-Motion Generation via Hypernetwork-Driven Low-Rank Adaptation
A hypernetwork maps style motion embeddings to LoRA updates that stylize text-driven motion diffusion models with improved generalization to unseen styles via contrastive structuring of the style space.
-
Faster by Design: Interactive Aerodynamics via Neural Surrogates Trained on Expert-Validated CFD
A graph-based neural operator trained on expert-validated race-car CFD data reaches accuracy levels usable for early-stage interactive aerodynamic design exploration.
-
HyperBones: Realtime Bone-driven Neural Garment Simulation with Hypernetwork Conditioning
HyperBones combines bone-driven neural dynamics with hypernetwork conditioning and a CNN wrinkle layer to deliver physically plausible real-time garment simulation across body shapes at 300+ FPS.
-
SwAIther-Precip: Lead-Time-Aware Bias Correction Enables Kilometer-Scale Downscaling of Global AI Precipitation Forecasts over Switzerland
SwAIther-Precip uses lead-time-conditioned U-Net bias correction followed by diffusion-based super-resolution to downscale AIFS forecasts, achieving 48% CRPS reduction and ~4 km effective resolution up to 5 days lead time.
-
Debiased Counterfactual Generation via Flow Matching from Observations
Observational and counterfactual distributions are linked by identical support and invariant features, enabling a flow-matching estimator with semiparametric efficiency correction to generate debiased counterfactuals from observations.
-
PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
PixArt-α matches commercial text-to-image quality with a diffusion transformer trained in 675 A100 GPU days through decomposed training stages, cross-attention text injection, and vision-language model dense captions.
-
Memory-Induced Supra-Competitive Outcomes Between Deep Reinforcement Learning Agents in Optimal Trade Execution
In a two-agent Almgren-Chriss liquidation game, deep RL agents given intra-episode history of prices and own actions achieve supra-competitive outcomes more frequently and persistently than agents without such memory.
-
DyGRO-VLA: Cross-Task Scaling of Vision-Language-Action Models via Dynamic Grouped Residual Optimization
DyGRO-VLA is a two-stage optimization framework for cross-task scaling of Vision-Language-Action models via dynamic grouped residual optimization in RL.
-
InfoGeo: Information-Theoretic Object-Centric Learning for Cross-View Generalizable UAV Geo-Localization
InfoGeo reformulates cross-view geo-localization as an information bottleneck that aligns object-centric structural relations while suppressing view-specific noise, outperforming prior methods on benchmarks.
-
Self-Guided Plan Extraction for Instruction-Following Tasks with Goal-Conditional Reinforcement Learning
SuperIgor uses iterative co-training of a language model planner and a goal-conditional RL agent to self-generate and refine plans, resulting in stricter instruction adherence and better generalization to unseen instructions.