Recovering Physically Plausible Human-Object Interactions from Monocular Videos
Pith reviewed 2026-06-28 06:04 UTC · model grok-4.3
The pith
Reinforcement learning refines kinematic human-object tracks into physically consistent sequences from monocular videos.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RePHO starts from a kinematic estimate and refines it by optimizing a reinforcement learning policy inside a physics simulator to reproduce the observed interaction. Because the kinematic input is typically noisy, an adaptive sampling strategy equipped with a dual self-updating mechanism identifies the frames that carry the most informative and reliable kinematic data, allowing the reconstruction to improve progressively and produce physically consistent human-object sequences.
What carries the argument
The adaptive sampling strategy with dual self-updating mechanism that selects reliable frames from noisy kinematic estimates for policy training in the physics simulator.
If this is right
- The iterative process progressively improves reconstruction quality.
- The output sequences contain no interpenetration or floating artifacts.
- Physical plausibility metrics improve over state-of-the-art methods on the two evaluated HOI benchmarks.
Where Pith is reading between the lines
- The same sampling and refinement loop could be tested on kinematic estimates from other motion domains such as multi-person scenes.
- Replacing the initial kinematic estimator with a stronger one might reduce the number of self-updating iterations required.
- Physically consistent output could serve as training data for downstream robot control policies that must imitate video demonstrations.
Load-bearing premise
Kinematic estimates contain enough reliable information that an adaptive sampling strategy can identify the most informative frames without circular dependence on the final physical quality.
What would settle it
Running RePHO on the two standard HOI benchmarks and measuring no improvement in physical plausibility metrics relative to existing kinematic state-of-the-art methods.
Figures
read the original abstract
In this paper, we propose RePHO, a method to reconstruct physically plausible human-object interactions (HOI) from monocular videos. While existing kinematic-based approaches produce visually plausible motion, they often result in physically implausible artifacts such as interpenetration and object floating. To overcome these issues, we introduce a physics-guided reconstruction framework. We begin with a kinematic estimate and then refine it by training a policy with reinforcement learning (RL). This policy is optimized to reproduce the interaction in a physics simulator. Because kinematic estimates are typically noisy, naive RL training can fail. Therefore, we propose an adaptive sampling strategy with a dual self-updating mechanism that can identify the frames with the most informative and reliable kinematic reconstruction. Our process progressively improves reconstruction quality and yields physically consistent HOI sequences. We demonstrate our approach on two standard HOI benchmarks and achieve clear improvements in physical plausibility metrics over state-of-the-art methods. Project Page: https://dingbang777.github.io/RePHO/
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes RePHO, a physics-guided framework to recover physically plausible human-object interactions (HOI) from monocular videos. It begins with a kinematic estimate, refines it by training an RL policy to reproduce the interaction inside a physics simulator, and introduces an adaptive sampling strategy with dual self-updating to select the most informative and reliable frames when kinematics are noisy. The authors claim that this process progressively improves reconstruction quality, yields physically consistent sequences, and achieves clear gains in physical-plausibility metrics over prior methods on two standard HOI benchmarks.
Significance. If the non-circularity of the frame-selection mechanism can be verified and the reported metric gains hold under controlled ablations, the work would provide a practical route to enforce physical consistency on top of existing kinematic pipelines. The combination of RL with an adaptive, self-updating sampler addresses a recognized failure mode of direct policy optimization on noisy pose estimates and could be relevant to downstream tasks such as animation and robotic manipulation.
major comments (1)
- [Abstract / Method overview] The abstract states that the dual self-updating mechanism identifies reliable frames without circular dependence on the final physical quality, yet provides no indication of how the reliability score is computed (e.g., solely from the initial pose-estimator confidence versus iterative feedback from simulator states or reward signals). Because this mechanism is presented as the key enabler that allows the RL stage to succeed where naïve training fails, the absence of an explicit, non-circular definition is load-bearing for the central claim of progressive, grounded improvement.
minor comments (1)
- [Abstract] The abstract mentions “clear improvements in physical plausibility metrics” but does not name the metrics, report numerical deltas, or reference the specific tables or figures that contain the quantitative results.
Simulated Author's Rebuttal
We thank the referee for highlighting the need for greater clarity on the dual self-updating mechanism. The comment correctly identifies that the abstract is insufficiently explicit about the reliability score computation. We will revise the abstract to include a concise, non-circular definition and ensure the method section cross-references are unambiguous.
read point-by-point responses
-
Referee: [Abstract / Method overview] The abstract states that the dual self-updating mechanism identifies reliable frames without circular dependence on the final physical quality, yet provides no indication of how the reliability score is computed (e.g., solely from the initial pose-estimator confidence versus iterative feedback from simulator states or reward signals). Because this mechanism is presented as the key enabler that allows the RL stage to succeed where naïve training fails, the absence of an explicit, non-circular definition is load-bearing for the central claim of progressive, grounded improvement.
Authors: We agree that the abstract lacks an explicit statement on the reliability score. In the full method (Section 3.3), the base reliability score for each frame is computed exclusively from the initial kinematic pose estimator's per-frame confidence values; no simulator states, reward signals, or policy performance metrics are used in this initial score. The 'dual self-updating' component then iteratively adjusts sampling probabilities by combining these fixed initial scores with a secondary update based on the current policy's reconstruction error on the selected frames, but the primary reliability anchor remains the initial estimator confidence and is therefore non-circular with respect to the final physical quality. We will add one sentence to the abstract: 'Reliability scores are derived solely from the initial pose estimator's confidence values, with dual self-updating iteratively refining sampling weights without feedback from simulator rewards or final physical metrics.' This revision directly addresses the concern while preserving the original claim. revision: yes
Circularity Check
No significant circularity in derivation chain
full rationale
The paper describes a pipeline that starts from independent kinematic estimates, applies RL policy optimization inside a physics simulator, and introduces an adaptive sampling mechanism to handle noise. No equations, parameter-fitting procedures, or self-referential definitions are shown that would reduce any claimed output (e.g., physically consistent sequences) to the inputs by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatz smuggling appear in the abstract or described method. The process is presented as using external simulator dynamics and initial kinematic data, with evaluation on standard benchmarks, making the chain self-contained rather than circular.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
PMP: Learning to physically interact with environments using part-wise motion priors
Jinseok Bae, Jungdam Won, Donggeun Lim, Cheol-Hui Min, and Young Min Kim. PMP: Learning to physically interact with environments using part-wise motion priors. InACM SIGGRAPH 2023 Conference Proceedings, 2023. 2
2023
-
[2]
HOT3D: Hand and object tracking in 3D from ego- centric multi-view videos
Prithviraj Banerjee, Sindi Shkodrani, Pierre Moulon, Shreyas Hampali, Shangchen Han, Fan Zhang, Linguang Zhang, Jade Fountain, Edward Miller, Selen Basol, Richard Newcombe, Robert Wang, Jakob Julian Engel, and Hodan Tomas. HOT3D: Hand and object tracking in 3D from ego- centric multi-view videos. InCVPR, 2025. 2
2025
-
[3]
BEHA VE: Dataset and method for tracking human object in- teractions
Bharat Lal Bhatnagar, Xianghui Xie, Ilya A Petrov, Cristian Sminchisescu, Christian Theobalt, and Gerard Pons-Moll. BEHA VE: Dataset and method for tracking human object in- teractions. InCVPR, 2022. 2, 5
2022
-
[4]
Physically plausible full-body hand-object interaction synthesis
Jona Braun, Sammy Christen, Muhammed Kocabas, Emre Aksan, and Otmar Hilliges. Physically plausible full-body hand-object interaction synthesis. In3DV, 2024. 2
2024
-
[5]
PICO: Reconstructing 3D people in con- tact with objects
Alp ´ar Cseke, Shashank Tripathi, Sai Kumar Dwivedi, Ar- jun S Lakshmipathy, Agniv Chatterjee, Michael J Black, and Dimitrios Tzionas. PICO: Reconstructing 3D people in con- tact with objects. InCVPR, 2025. 2
2025
-
[6]
Interactive simulation of stylized human locomotion.ACM Transactions on Graphics, 27(3):1–10, 2008
Marco da Silva, Yeuhi Abe, and Jovan Popovi ´c. Interactive simulation of stylized human locomotion.ACM Transactions on Graphics, 27(3):1–10, 2008. 2
2008
-
[7]
InteractVLM: 3D interaction reasoning from 2D foundational models
Sai Kumar Dwivedi, Dimitrije Anti ´c, Shashank Tripathi, Omid Taheri, Cordelia Schmid, Michael J Black, and Dim- itrios Tzionas. InteractVLM: 3D interaction reasoning from 2D foundational models. InCVPR, 2025. 2
2025
-
[8]
PhysHMR: Learning humanoid control poli- cies from vision for physically plausible human motion re- construction
Qiao Feng, Yiming Huang, Yufu Wang, Jiatao Gu, and Lingjie Liu. PhysHMR: Learning humanoid control poli- cies from vision for physically plausible human motion re- construction. InProceedings of the SIGGRAPH Asia 2025 Conference Papers, 2025. 2
2025
-
[9]
CooHOI: Learning cooperative human-object interac- tion with manipulated object dynamics.NeurIPS, 2024
Jiawei Gao, Ziqin Wang, Zeqi Xiao, Jingbo Wang, Tai Wang, Jinkun Cao, Xiaolin Hu, Si Liu, Jifeng Dai, and Jiangmiao Pang. CooHOI: Learning cooperative human-object interac- tion with manipulated object dynamics.NeurIPS, 2024. 2
2024
-
[10]
Synthesizing phys- ical character-scene interactions
Mohamed Hassan, Yunrong Guo, Tingwu Wang, Michael Black, Sanja Fidler, and Xue Bin Peng. Synthesizing phys- ical character-scene interactions. InACM SIGGRAPH 2023 Conference Proceedings, 2023. 2
2023
-
[11]
Animating human athletics
Jessica K Hodgins, Wayne L Wooten, David C Brogan, and James F O’Brien. Animating human athletics. InProceed- ings of the 22nd annual conference on Computer graphics and interactive techniques, pages 71–78, 1995. 2
1995
-
[12]
Jhen Hsieh, Kuan-Hsun Tu, Kuo-Han Hung, and Tsung- Wei Ke. DexMan: Learning bimanual dexterous manip- ulation from human and generated videos.arXiv preprint arXiv:2510.08475, 2025. 6
arXiv 2025
-
[13]
Diffuse-CLoC: Guided diffusion for physics-based character look-ahead control.ACM Transac- tions on Graphics (TOG), 44(4):1–12, 2025
Xiaoyu Huang, Takara Truong, Yunbo Zhang, Fangzhou Yu, Jean Pierre Sleiman, Jessica Hodgins, Koushil Sreenath, and Farbod Farshidian. Diffuse-CLoC: Guided diffusion for physics-based character look-ahead control.ACM Transac- tions on Graphics (TOG), 44(4):1–12, 2025. 2
2025
-
[14]
InterCap: Joint markerless 3D tracking of humans and objects in interaction
Yinghao Huang, Omid Taheri, Michael J Black, and Dim- itrios Tzionas. InterCap: Joint markerless 3D tracking of humans and objects in interaction. InDAGM German Con- ference on Pattern Recognition, 2022. 2, 5
2022
-
[15]
Monocular human- object reconstruction in the wild
Chaofan Huo, Ye Shi, and Jingya Wang. Monocular human- object reconstruction in the wild. InProceedings of the 32nd ACM International Conference on Multimedia, pages 5547– 5555, 2024. 2
2024
-
[16]
Full-body articulated human-object interaction
Nan Jiang, Tengyu Liu, Zhexuan Cao, Jieming Cui, Zhiyuan Zhang, Yixin Chen, He Wang, Yixin Zhu, and Siyuan Huang. Full-body articulated human-object interaction. InICCV,
-
[17]
DA ViD: Modeling dynamic affordance of 3D objects using pre- trained video diffusion models
Hyeonwoo Kim, Sangwon Baik, and Hanbyul Joo. DA ViD: Modeling dynamic affordance of 3D objects using pre- trained video diffusion models. InICCV, 2025. 2
2025
-
[18]
ParaHome: Parameterizing everyday home activities to- wards 3D generative modeling of human-object interactions
Jeonghwan Kim, Jisoo Kim, Jeonghyeon Na, and Hanbyul Joo. ParaHome: Parameterizing everyday home activities to- wards 3D generative modeling of human-object interactions. InCVPR, 2025. 2
2025
-
[19]
Anylift: Scaling mo- tion reconstruction from internet videos via 2d diffusion
Hongjie Li, Heng Yu, Jiaman Li, Hong-Xing Yu, Ehsan Adeli, C Karen Liu, and Jiajun Wu. Anylift: Scaling mo- tion reconstruction from internet videos via 2d diffusion. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition, pages 13876–13886, 2026. 2
2026
-
[20]
Ze- roHSI: Zero-shot 4D human-scene interaction by video gen- eration
Hongjie Li, Hong-Xing Yu, Jiaman Li, and Jiajun Wu. Ze- roHSI: Zero-shot 4D human-scene interaction by video gen- eration. In3DV, 2026. 2
2026
-
[21]
Object motion guided human motion synthesis.ACM Transactions on Graphics (TOG), 42(6):1–11, 2023
Jiaman Li, Jiajun Wu, and C Karen Liu. Object motion guided human motion synthesis.ACM Transactions on Graphics (TOG), 42(6):1–11, 2023. 7
2023
-
[22]
ManipTrans: Efficient dexterous bimanual manip- ulation transfer via residual learning
Kailin Li, Puhao Li, Tengyu Liu, Yuyang Li, and Siyuan Huang. ManipTrans: Efficient dexterous bimanual manip- ulation transfer via residual learning. InCVPR, 2025. 6
2025
-
[23]
Lei Li and Angela Dai. HOI-PAGE: Zero-shot human-object interaction generation with part affordance guidance.arXiv preprint arXiv:2506.07209, 2025. 2
Pith/arXiv arXiv 2025
-
[24]
Qiayuan Liao, Takara E Truong, Xiaoyu Huang, Guy Tevet, Koushil Sreenath, and C Karen Liu. BeyondMimic: From motion tracking to versatile humanoid control via guided dif- fusion.arXiv preprint arXiv:2508.08241, 2025. 2
Pith/arXiv arXiv 2025
-
[25]
HOI4D: A 4D egocentric dataset for category-level human- object interaction
Yunze Liu, Yun Liu, Che Jiang, Kangbo Lyu, Weikang Wan, Hao Shen, Boqiang Liang, Zhoujie Fu, He Wang, and Li Yi. HOI4D: A 4D egocentric dataset for category-level human- object interaction. InCVPR, 2022. 2
2022
-
[26]
Yuke Lou, Yiming Wang, Zhen Wu, Rui Zhao, Wenjia Wang, Mingyi Shi, and Taku Komura. Zero-shot human-object in- teraction synthesis with multimodal priors.arXiv preprint arXiv:2503.20118, 2025. 2
arXiv 2025
-
[27]
HUMOTO: A 4D dataset of mocap human object interactions
Jiaxin Lu, Chun-Hao Paul Huang, Uttaran Bhattacharya, Qixing Huang, and Yi Zhou. HUMOTO: A 4D dataset of mocap human object interactions. InICCV, 2025. 2
2025
-
[28]
Em- bodied scene-aware human pose estimation.NeurIPS, 2022
Zhengyi Luo, Shun Iwase, Ye Yuan, and Kris Kitani. Em- bodied scene-aware human pose estimation.NeurIPS, 2022. 2
2022
-
[29]
Perpetual humanoid control for real-time simulated avatars
Zhengyi Luo, Jinkun Cao, Alexander Winkler, Kris Kitani, and Weipeng Xu. Perpetual humanoid control for real-time simulated avatars. InICCV, 2023. 2
2023
-
[30]
Zhengyi Luo, Ye Yuan, Tingwu Wang, Chenran Li, Sirui Chen, Fernando Casta ˜neda, Zi-Ang Cao, Jiefeng Li, David 9 Minor, Qingwei Ben, et al. SONIC: Supersizing motion tracking for natural humanoid whole-body control.arXiv preprint arXiv:2511.07820, 2025. 2
Pith/arXiv arXiv 2025
-
[31]
To- kenHSI: Unified synthesis of physical human-scene interac- tions through task tokenization
Liang Pan, Zeshi Yang, Zhiyang Dou, Wenjia Wang, Buzhen Huang, Bo Dai, Taku Komura, and Jingbo Wang. To- kenHSI: Unified synthesis of physical human-scene interac- tions through task tokenization. InCVPR, 2025. 2
2025
-
[32]
DeepMimic: Example-guided deep reinforce- ment learning of physics-based character skills.ACM Trans- actions On Graphics (TOG), 37(4):1–14, 2018
Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel Van de Panne. DeepMimic: Example-guided deep reinforce- ment learning of physics-based character skills.ACM Trans- actions On Graphics (TOG), 37(4):1–14, 2018. 2, 4
2018
-
[33]
AMP: Adversarial motion priors for styl- ized physics-based character control.ACM Transactions on Graphics (ToG), 40(4):1–20, 2021
Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, and Angjoo Kanazawa. AMP: Adversarial motion priors for styl- ized physics-based character control.ACM Transactions on Graphics (ToG), 40(4):1–20, 2021
2021
-
[34]
ASE: Large-scale reusable adversarial skill embeddings for physically simulated characters.ACM Transactions On Graphics (TOG), 41(4):1–17, 2022
Xue Bin Peng, Yunrong Guo, Lina Halper, Sergey Levine, and Sanja Fidler. ASE: Large-scale reusable adversarial skill embeddings for physically simulated characters.ACM Transactions On Graphics (TOG), 41(4):1–17, 2022. 2
2022
-
[35]
Javier Romero, Dimitrios Tzionas, and Michael J. Black. Embodied hands: Modeling and capturing hands and bod- ies together.ACM TOG, 36(6), 2017. 3
2017
-
[36]
GRAB: A dataset of whole-body human grasping of objects
Omid Taheri, Nima Ghorbani, Michael J Black, and Dim- itrios Tzionas. GRAB: A dataset of whole-body human grasping of objects. InECCV, 2020. 2
2020
-
[37]
MaskedMimic: Unified physics-based char- acter control through masked motion inpainting.ACM Trans- actions on Graphics (TOG), 43(6):1–21, 2024
Chen Tessler, Yunrong Guo, Ofir Nabati, Gal Chechik, and Xue Bin Peng. MaskedMimic: Unified physics-based char- acter control through masked motion inpainting.ACM Trans- actions on Graphics (TOG), 43(6):1–21, 2024. 2
2024
-
[38]
PDP: Physics-based character animation via dif- fusion policy
Takara Everest Truong, Michael Piseno, Zhaoming Xie, and Karen Liu. PDP: Physics-based character animation via dif- fusion policy. InSIGGRAPH Asia 2024 Conference Papers,
2024
-
[39]
Multi- Phys: Multi-person physics-aware 3D motion estimation
Nicolas Ugrinovic, Boxiao Pan, Georgios Pavlakos, De- spoina Paschalidou, Bokui Shen, Jordi Sanchez-Riera, Francesc Moreno-Noguer, and Leonidas Guibas. Multi- Phys: Multi-person physics-aware 3D motion estimation. In CVPR, 2024. 2
2024
-
[40]
Yinhuai Wang, Jing Lin, Ailing Zeng, Zhengyi Luo, Jian Zhang, and Lei Zhang. PhysHOI: Physics-based imita- tion of dynamic human-object interaction.arXiv preprint arXiv:2312.04393, 2023. 2, 3
arXiv 2023
-
[41]
SkillMimic: Learning basketball interaction skills from demonstrations
Yinhuai Wang, Qihan Zhao, Runyi Yu, Hok Wai Tsui, Ail- ing Zeng, Jing Lin, Zhengyi Luo, Jiwen Yu, Xiu Li, Qifeng Chen, Jian Zhang, Lei Zhang, and Ping Tan. SkillMimic: Learning basketball interaction skills from demonstrations. InCVPR, 2025. 2, 3
2025
-
[42]
Reconstructing in-the-wild open-vocabulary human-object interactions
Boran Wen, Dingbang Huang, Zichen Zhang, Jiahong Zhou, Jianbin Deng, Jingyu Gong, Yulong Chen, Lizhuang Ma, and Yong-Lu Li. Reconstructing in-the-wild open-vocabulary human-object interactions. InCVPR, 2025. 2
2025
-
[43]
Boran Wen, Ye Lu, Sirui Wang, Keyan Wan, Jiahong Zhou, Junxuan Liang, Xinpeng Liu, Bang Xiao, Ruiyang Liu, and Yong-Lu Li. Efficient and scalable monocular human- object interaction motion reconstruction.arXiv preprint arXiv:2512.00960, 2025. 2
arXiv 2025
-
[44]
UniPhys: Unified planner and controller with diffusion for flexible physics-based character control
Yan Wu, Korrawe Karunratanakul, Zhengyi Luo, and Siyu Tang. UniPhys: Unified planner and controller with diffusion for flexible physics-based character control. InICCV, 2025. 2
2025
-
[45]
CHORE: Contact, human and object reconstruction from a single RGB image
Xianghui Xie, Bharat Lal Bhatnagar, and Gerard Pons-Moll. CHORE: Contact, human and object reconstruction from a single RGB image. InECCV, 2022. 2
2022
-
[46]
Visibility aware human-object interaction tracking from sin- gle RGB camera
Xianghui Xie, Bharat Lal Bhatnagar, and Gerard Pons-Moll. Visibility aware human-object interaction tracking from sin- gle RGB camera. InCVPR, 2023. 1, 2, 3, 4, 5, 6, 7
2023
-
[47]
Template free reconstruction of human- object interaction with procedural interaction generation
Xianghui Xie, Bharat Lal Bhatnagar, Jan Eric Lenssen, and Gerard Pons-Moll. Template free reconstruction of human- object interaction with procedural interaction generation. In CVPR, 2024. 2
2024
-
[48]
In- terTrack: Tracking human object interaction without object templates
Xianghui Xie, Jan Eric Lenssen, and Gerard Pons-Moll. In- terTrack: Tracking human object interaction without object templates. In3DV, 2025. 1, 2
2025
-
[49]
Cari4d: Category agnostic 4d reconstruction of human- object interaction
Xianghui Xie, Bowen Wen, Yan Chang, Hesam Rabeti, Jiefeng Li, Ye Yuan, Gerard Pons-Moll, and Stan Birch- field. Cari4d: Category agnostic 4d reconstruction of human- object interaction. InConference on Computer Vision and Pattern Recognition (CVPR), 2026. 2
2026
-
[50]
PARC: Physics-based augmentation with reinforcement learning for character controllers
Michael Xu, Yi Shi, KangKang Yin, and Xue Bin Peng. PARC: Physics-based augmentation with reinforcement learning for character controllers. InProceedings of the Special Interest Group on Computer Graphics and Interac- tive Techniques Conference Conference Papers, pages 1–11,
-
[51]
InterMimic: Towards universal whole-body control for physics-based human-object interactions
Sirui Xu, Hung Yu Ling, Yu-Xiong Wang, and Liang-Yan Gui. InterMimic: Towards universal whole-body control for physics-based human-object interactions. InCVPR, 2025. 2, 3, 5, 6, 7, 8
2025
-
[52]
SIMBICON: Simple biped locomotion control.ACM Trans- actions on Graphics (TOG), 26(3):105–es, 2007
KangKang Yin, Kevin Loken, and Michiel Van de Panne. SIMBICON: Simple biped locomotion control.ACM Trans- actions on Graphics (TOG), 26(3):105–es, 2007. 2
2007
-
[53]
Learning physically simulated tennis skills from broadcast videos.ACM Trans
Ye Yuan, Viktor Makoviychuk, Y Guo, S Fidler, X Peng, and K Fatahalian. Learning physically simulated tennis skills from broadcast videos.ACM Trans. Graph, 42(4), 2023. 2
2023
-
[54]
Neural- Dome: A neural modeling pipeline on multi-view human- object interactions
Juze Zhang, Haimin Luo, Hongdi Yang, Xinru Xu, Qianyang Wu, Ye Shi, Jingyi Yu, Lan Xu, and Jingya Wang. Neural- Dome: A neural modeling pipeline on multi-view human- object interactions. InCVPR, 2023. 2
2023
-
[55]
Perceiving 3D human-object spatial arrangements from a single image in the wild
Jason Y Zhang, Sam Pepose, Hanbyul Joo, Deva Ramanan, Jitendra Malik, and Angjoo Kanazawa. Perceiving 3D human-object spatial arrangements from a single image in the wild. InECCV, 2020. 2
2020
-
[56]
I’M HOI: Inertia-aware monocular capture of 3D human-object interactions
Chengfeng Zhao, Juze Zhang, Jiashen Du, Ziwei Shan, Junye Wang, Jingyi Yu, Jingya Wang, and Lan Xu. I’M HOI: Inertia-aware monocular capture of 3D human-object interactions. InCVPR, 2024. 2 10
2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.