A hypernetwork maps style motion embeddings to LoRA updates that stylize text-driven motion diffusion models with improved generalization to unseen styles via contrastive structuring of the style space.
hub
Proceedings of the AAAI conference on artificial intelligence , volume=
12 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
A graph-based neural operator trained on expert-validated race-car CFD data reaches accuracy levels usable for early-stage interactive aerodynamic design exploration.
HyperBones trains a reduced-space neural dynamics model with bone-driven coarse simulation and CNN-based wrinkle recovery to produce plausible garment motion at 300+ FPS using physics supervision without an external simulator.
SwAIther-Precip uses lead-time-conditioned U-Net bias correction followed by diffusion-based generative downscaling to reduce CRPS by 48% and achieve ~4 km effective resolution from 0.25° AIFS forecasts.
R-DMesh proposes a VAE-based disentanglement of base mesh, motion trajectories, and rectification offset plus Triflow Attention and rectified-flow diffusion to produce 4D meshes aligned to video despite initial pose mismatch.
Observational and counterfactual distributions are linked by identical support and invariant features, enabling a flow-matching estimator with semiparametric efficiency correction to generate debiased counterfactuals from observations.
InfoGeo reformulates cross-view geo-localization as an information bottleneck that aligns object-centric structural relations across views while suppressing view-specific noise.
OGPO enables sample-efficient full-finetuning of generative control policies via off-policy critics and modified PPO, achieving SOTA on robot manipulation tasks while rescuing poorly initialized behavior cloning policies without expert data.
PixArt-α matches commercial text-to-image quality with a diffusion transformer trained in 675 A100 GPU days through decomposed training stages, cross-attention text injection, and vision-language model dense captions.
In a two-agent Almgren-Chriss liquidation game, deep RL agents given intra-episode history of prices and own actions achieve supra-competitive outcomes more frequently and persistently than agents without such memory.
DyGRO-VLA is a two-stage optimization framework for cross-task scaling of Vision-Language-Action models via dynamic grouped residual optimization in RL.
SuperIgor uses iterative co-training of a language model planner and a goal-conditional RL agent to self-generate and refine plans, resulting in stricter instruction adherence and better generalization to unseen instructions.
citing papers explorer
No citing papers match the current filters.