Jan Kautz

Identifiers

  • name variant Jan Kautz 0.60 · backfill

Papers (86)

  1. GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors cs.RO · 2026 · author #15
  2. Cosmos 3: Omnimodal World Models for Physical AI cs.CV · 2026 · author #120
  3. Scaling Parallel Sequence Models to Foundation-Scale Vision Encoders cs.CV · 2026 · author #17
  4. Grounded 3D-Aware Spatial Vision-Language Modeling cs.CV · 2026 · author #12
  5. LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding cs.CV · 2026 · author #11
  6. Polar: Agentic RL on Any Harness at Scale cs.DC · 2026 · author #11
  7. Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention cs.AI · 2026 · author #3
  8. D-Rex : Diffusion Rendering for Relightable Expressive Avatars cs.GR · 2026 · author #4
  9. Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence cs.LG · 2026 · author #209
  10. SpaCeFormer: Fast Proposal-Free Open-Vocabulary 3D Instance Segmentation cs.CV · 2026 · author #5
  11. Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning cs.LG · 2026 · author #212
  12. World Action Models are Zero-shot Policies cs.RO · 2026 · author #33
  13. Learning to Discover at Test Time cs.LG · 2026 · author #7
  14. GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization cs.CL · 2026 · author #12
  15. NVIDIA Nemotron 3: Efficient and Open Intelligence cs.CL · 2025 · author #135
  16. Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed cs.CL · 2025 · author #12
  17. SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control cs.RO · 2025 · author #24
  18. World Simulation with Video Foundation Models for Physical AI cs.CV · 2025 · author #36
  19. ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge cs.CL · 2025 · author #9
  20. NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model cs.CL · 2025 · author #77
  21. ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models cs.CL · 2025 · author #7
  22. FLARE: Robot Learning with Implicit World Modeling cs.RO · 2025 · author #18
  23. DreamGen: Unlocking Generalization in Robot Learning through Video World Models cs.RO · 2025 · author #25
  24. GR00T N1: An Open Foundation Model for Generalist Humanoid Robots cs.RO · 2025 · author #13
  25. Gated Delta Networks: Improving Mamba2 with Delta Rule cs.CL · 2024 · author #2
  26. NVILA: Efficient Frontier Visual Language Models cs.CV · 2024 · author #24
  27. LongVILA: Scaling Long-Context Visual Language Models for Long Videos cs.CV · 2024 · author #14
  28. An Empirical Study of Mamba-based Language Models cs.LG · 2024 · author #14
  29. Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation cs.CV · 2024 · author #9
  30. CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation cs.CV · 2024 · author #5
  31. Importance Estimation for Neural Network Pruning cs.LG · 2019 · author #5
  32. SCOPS: Self-Supervised Co-Part Segmentation cs.CV · 2019 · author #6
  33. STEP: Spatio-Temporal Progressive Learning for Video Action Detection cs.CV · 2019 · author #6
  34. Pixel-Adaptive Convolutional Neural Networks cs.CV · 2019 · author #6
  35. Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments cs.CV · 2019 · author #6
  36. NRMVS: Non-Rigid Multi-View Stereo cs.CV · 2019 · author #7
  37. Neural RGB->D Sensing: Depth and Uncertainty from a Video Camera cs.CV · 2019 · author #5
  38. PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image cs.CV · 2018 · author #5
  39. Context-Aware Synthesis and Placement of Object Instances cs.CV · 2018 · author #6
  40. A Fusion Approach for Multi-Frame Optical Flow Estimation cs.CV · 2018 · author #6
  41. Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation cs.CV · 2018 · author #4
  42. Video-to-Video Synthesis cs.CV · 2018 · author #6
  43. Learning Linear Transformations for Fast Arbitrary Style Transfer cs.CV · 2018 · author #3
  44. EOE: Expected Overlap Estimation over Unstructured Point Cloud Data cs.CV · 2018 · author #3
  45. Simultaneous Edge Alignment and Learning cs.CV · 2018 · author #7
  46. Tackling 3D ToF Artifacts Through Learning and the FLAT Dataset cs.CV · 2018 · author #5
  47. Superpixel Sampling Networks cs.CV · 2018 · author #5
  48. Domain Stylization: A Strong, Simple Baseline for Synthetic to Real Image Domain Adaptation cs.CV · 2018 · author #5
  49. Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures cs.CV · 2018 · author #3
  50. Synthetically Trained Neural Networks for Learning Human-Readable Plans from Real-World Demonstrations cs.RO · 2018 · author #5
  51. IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification cs.CV · 2018 · author #6
  52. Hand Pose Estimation via Latent 2.5D Heatmap Regression cs.CV · 2018 · author #5
  53. Switchable Temporal Propagation Network cs.CV · 2018 · author #7
  54. Light-weight Head Pose Invariant Gaze Tracking cs.CV · 2018 · author #3
  55. Multimodal Unsupervised Image-to-Image Translation cs.CV · 2018 · author #4
  56. Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation cs.CV · 2018 · author #6
  57. Deep Semantic Face Deblurring cs.CV · 2018 · author #4
  58. SPLATNet: Sparse Lattice Networks for Point Cloud Processing cs.CV · 2018 · author #7
  59. A Closed-form Solution to Photorealistic Image Stylization cs.CV · 2018 · author #5
  60. Reblur2Deblur: Deblurring Videos via Self-Supervised Learning cs.CV · 2018 · author #6
  61. Learning Binary Residual Representations for Domain-specific Video Streaming cs.CV · 2017 · author #5
  62. Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals cs.CV · 2017 · author #8
  63. Geometry-Aware Learning of Maps for Camera Localization cs.CV · 2017 · author #5
  64. Sim-to-Real Transfer of Accurate Grasping with Eye-In-Hand Observations and Continuous Control cs.RO · 2017 · author #4
  65. Separating Reflection and Transmission Images in the Wild cs.CV · 2017 · author #4
  66. Budget-Aware Activity Detection with A Recurrent Policy Network cs.CV · 2017 · author #4
  67. Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation cs.CV · 2017 · author #6
  68. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs cs.CV · 2017 · author #5
  69. On Nearest Neighbors in Non Local Means Denoising cs.CV · 2017 · author #2
  70. Multiframe Scene Flow with Piecewise Rigid Motion cs.CV · 2017 · author #6
  71. Learning Affinity via Spatial Propagation Networks cs.CV · 2017 · author #6
  72. Learning to Segment Instances in Videos with Spatial Propagation Network cs.CV · 2017 · author #7
  73. PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume cs.CV · 2017 · author #4
  74. Improving Landmark Localization with Semi-Supervised Learning cs.CV · 2017 · author #6
  75. Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization with Spatially-Varying Lighting cs.CV · 2017 · author #4
  76. Cascaded Scene Flow Prediction using Semantic Segmentation cs.CV · 2017 · author #3
  77. MoCoGAN: Decomposing Motion and Content for Video Generation cs.CV · 2017 · author #4
  78. A Lightweight Approach for On-the-Fly Reflectance Estimation cs.CV · 2017 · author #6
  79. Unsupervised Image-to-Image Translation Networks cs.CV · 2017 · author #3
  80. Deep Learning with Energy-efficient Binary Gradient Cameras cs.CV · 2016 · author #4
  81. Pruning Convolutional Neural Networks for Resource Efficient Inference cs.LG · 2016 · author #5
  82. Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU cs.LG · 2016 · author #5
  83. Learning Adaptive Parameter Tuning for Image Processing cs.CV · 2016 · author #3
  84. Loss Functions for Neural Networks for Image Processing cs.CV · 2015 · author #4
  85. Hierarchical Subquery Evaluation for Active Learning on a Graph cs.CV · 2015 · author #3
  86. Speaker-following Video Subtitles cs.HC · 2014 · author #2

Mentions

  • 1511.08861 #4 · backfill · confidence 0.70 Jan Kautz
  • 2606.05160 #15 · arxiv_oai · confidence 0.70 Jan Kautz
  • 1504.08219 #3 · backfill · confidence 0.70 Jan Kautz
  • 2606.02800 #120 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2606.00746 #17 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2604.20395 #5 · arxiv_oai · confidence 0.70 Jan Kautz
  • 1407.5145 #2 · backfill · confidence 0.70 Jan Kautz
  • 2605.30307 #12 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2605.27365 #11 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2605.24220 #11 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2605.22791 #3 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2511.07820 #24 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2510.18941 #9 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2505.24864 #7 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2508.14444 #77 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2406.07887 #14 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2512.20856 #135 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2505.15659 #18 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2408.10188 #14 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2406.02509 #5 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2601.16175 #7 · arxiv_oai · confidence 0.70 Jan Kautz
  • 2505.12705 #25 · arxiv_oai · confidence 0.70 Jan Kautz

Frequent Coauthors