FCBV-Net: Category-Level Robotic Garment Smoothing via Feature-Conditioned Bimanual Value Prediction

Jing Qiu; Mohammed Daba

arxiv: 2508.05153 · v2 · submitted 2025-08-07 · 💻 cs.RO · cs.AI

Pith reviewed 2026-05-19 01:12 UTC · model grok-4.3

keywords: category-level generalization · bimanual manipulation · garment smoothing · point cloud features · value prediction · robotic policy learning · deformable objects · simulation evaluation

The pith

Conditioning bimanual action values on frozen 3D geometric features lets robots smooth unseen garments with far smaller performance loss than baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FCBV-Net to tackle category-level generalization in robotic bimanual garment smoothing. It keeps pre-trained dense geometric features from 3D point clouds frozen while training only the downstream value prediction components on those fixed features. This separation aims to avoid overfitting to specific garment instances and to better capture synergistic bimanual actions. In PyFlex simulations with the CLOTH3D dataset, the method shows an 11.5 percent efficiency drop on unseen garments versus 96.2 percent for a 2D baseline and reaches 89 percent final coverage compared with 83 percent for a correspondence-based 3D baseline using the same features but a fixed primitive. A reader would care because the results suggest a practical route to making manipulation policies work across varied clothing shapes without retraining the entire perception stack each time.

Core claim

FCBV-Net conditions bimanual action value prediction on pre-trained and frozen dense geometric features extracted from 3D point clouds. The geometric front end stays static across intra-category variations while trainable downstream modules learn task-specific bimanual policies. This decoupling produces superior category-level generalization in simulated garment smoothing, evidenced by substantially lower efficiency degradation and higher final coverage on garments never seen during training.

What carries the argument

The Feature-Conditioned Bimanual Value Network (FCBV-Net), which feeds fixed geometric features from 3D point clouds into a trainable value head to separate perception from action evaluation.
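
The separation described above can be sketched in plain Python without any ML library. Everything in this sketch is an illustrative assumption rather than the paper's implementation: `frozen_features` stands in for the pre-trained encoder, `ValueHead` is a toy linear scorer, and the pairing of two per-point feature vectors into one action is only one plausible way to represent a bimanual grasp.

```python
import itertools
import random

def frozen_features(point):
    """Stand-in for a pre-trained per-point encoder; it has no trainable state."""
    x, y, z = point
    return [x, y, z, x * y, z * z]

class ValueHead:
    """Hypothetical trainable linear scorer over a pair of per-point features."""
    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(2 * dim)]

    def value(self, feat_left, feat_right):
        pair = feat_left + feat_right  # concatenate left-hand and right-hand features
        return sum(w * f for w, f in zip(self.w, pair))

    def sgd_step(self, feat_left, feat_right, target, lr=0.01):
        """One gradient step on 0.5 * err**2; only the head's weights change."""
        pair = feat_left + feat_right
        err = self.value(feat_left, feat_right) - target
        self.w = [w - lr * err * f for w, f in zip(self.w, pair)]
        return err * err

def best_bimanual_action(points, head):
    """Score every ordered pair of grasp points and return the highest-valued pair."""
    feats = [frozen_features(p) for p in points]
    pairs = itertools.permutations(range(len(points)), 2)
    return max(pairs, key=lambda ij: head.value(feats[ij[0]], feats[ij[1]]))

# Toy point cloud and a made-up training target for one grasp pair.
cloud = [(0.0, 0.0, 0.1), (0.5, 0.2, 0.0), (0.9, 0.8, 0.3), (0.1, 0.7, 0.2)]
head = ValueHead(dim=5)
before = frozen_features(cloud[2])
loss_first = head.sgd_step(frozen_features(cloud[0]), frozen_features(cloud[2]), target=1.0)
for _ in range(200):
    loss_last = head.sgd_step(frozen_features(cloud[0]), frozen_features(cloud[2]), target=1.0)
after = frozen_features(cloud[2])
action = best_bimanual_action(cloud, head)
```

Training changes only `head.w`; the features computed for a given point are identical before and after, which is the property the frozen front end is meant to guarantee.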

If this is right

Bimanual policies can be learned more reliably once geometric structure is supplied by a static feature extractor.
Category-level performance improves when geometric understanding is not updated jointly with action values.
3D point-cloud conditioning outperforms the 2D image baseline on generalization (efficiency drop) and the fixed-primitive 3D baseline on final coverage.
Downstream task-specific training on frozen features suffices for high final coverage in simulation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same separation of frozen geometry from value learning could be tested on related tasks such as folding or unfolding.
Real-robot deployment would reveal whether simulation gains survive contact dynamics and sensor noise.
Replacing the frozen feature extractor with newer pre-trained models might further reduce the observed 11.5 percent drop.

Load-bearing premise

Pre-trained dense geometric features from 3D point clouds remain informative enough for bimanual value prediction even when garment shapes vary within a category and receive no further adaptation.

What would settle it

Running the same trained FCBV-Net on a new collection of garments whose shape variations exceed those in the CLOTH3D unseen set and finding that its efficiency drop matches or exceeds the 96.2 percent drop seen in the 2D baseline.

Figures

Figures reproduced from arXiv: 2508.05153 by Jing Qiu, Mohammed Daba.

**Figure 2.** Overview of the FCBV-Net Architecture. Dense geometr…

**Figure 3.** Progression of normalized garment coverage over num…

Original abstract

Category-level generalization for robotic garment manipulation, such as bimanual smoothing, remains a significant hurdle due to high dimensionality, complex dynamics, and intra-category variations. Current approaches often struggle, either overfitting with concurrently learned visual features for a specific instance or, despite category-level perceptual generalization, failing to predict the value of synergistic bimanual actions. We propose the Feature-Conditioned Bimanual Value Network (FCBV-Net), operating on 3D point clouds to specifically enhance category-level policy generalization for garment smoothing. FCBV-Net conditions bimanual action value prediction on pre-trained, frozen dense geometric features, ensuring robustness to intra-category garment variations. Trainable downstream components then learn a task-specific policy using these static features. In simulated PyFlex environments using the CLOTH3D dataset, FCBV-Net demonstrated superior category-level generalization. It exhibited only an 11.5% efficiency drop (Steps80) on unseen garments compared to 96.2% for a 2D image-based baseline, and achieved 89% final coverage, outperforming the 83% coverage of a 3D correspondence-based baseline that uses identical per-point geometric features but a fixed primitive. These results highlight that the decoupling of geometric understanding from bimanual action value learning enables better category-level generalization. Code, videos, and supplementary materials are available at the project website: https://dabaspark.github.io/fcbvnet/.
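
The abstract does not define Steps80 or the efficiency drop, so the arithmetic below is only one plausible reading. It assumes Steps80 is the number of actions needed to first reach 80% normalized coverage and that the drop is the relative increase in that count from seen to unseen garments. The function names and the coverage curves are hypothetical and are not taken from the paper.

```python
def steps_to_coverage(coverage_per_step, threshold=0.80):
    """Return the 1-based step at which coverage first reaches the threshold, or None."""
    for step, coverage in enumerate(coverage_per_step, start=1):
        if coverage >= threshold:
            return step
    return None

def efficiency_drop(steps_seen, steps_unseen):
    """Relative increase in steps from seen to unseen garments, as a percentage."""
    return 100.0 * (steps_unseen - steps_seen) / steps_seen

# Hypothetical coverage curves, not data from the paper.
seen_curve = [0.42, 0.61, 0.74, 0.83, 0.88]
unseen_curve = [0.40, 0.55, 0.68, 0.77, 0.82, 0.86]

steps_seen = steps_to_coverage(seen_curve)        # 4
steps_unseen = steps_to_coverage(unseen_curve)    # 5
drop = efficiency_drop(steps_seen, steps_unseen)  # 25.0
```

Under this reading, a drop near 100% would mean roughly twice as many actions are needed on unseen garments, while a drop near 10% would mean only a small slowdown.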

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

FCBV-Net shows that freezing pre-trained 3D geometric features and training only the bimanual value head cuts the efficiency drop on unseen CLOTH3D garments to 11.5% in simulation, a clear step over the 2D and fixed-primitive baselines.

FCBV-Net freezes pre-trained dense geometric features from 3D point clouds and trains only the downstream bimanual value prediction network for garment smoothing. The main result is that this produces much stronger category-level generalization in PyFlex simulation on held-out CLOTH3D garments: an 11.5% efficiency drop versus 96.2% for the 2D image baseline, plus 89% final coverage against 83% for the 3D correspondence baseline that uses the same per-point features but a fixed primitive. The code release helps with checking the implementation details.

The decoupling itself is a reasonable incremental idea over prior work that either learns visual features jointly or relies on instance-specific tuning. It keeps the geometry fixed so the value head can focus on learning synergistic bimanual actions without having to solve the full high-dimensional problem at once. The quantitative comparisons are the strongest part of the evidence.

The soft spots are straightforward. Everything stays in simulation with no real-robot transfer results, and the reported numbers lack error bars, statistical tests, or hyperparameter sensitivity checks, so the size of the gains is harder to judge. The stress-test concern about the frozen features is worth watching: if the pre-trained extractor was not exposed to fabric deformation modes, the observed improvements could trace more to feature choice than to the decoupling strategy. That said, the held-out garment split and the fact that features are not re-fit to the task give the evaluation some independent grounding.

This paper is for researchers working on robotic manipulation of deformable objects who want practical ways to handle category-level variation without full end-to-end learning from scratch. A reader looking for sim-based policy ideas with point-cloud inputs would find the architecture and the baseline comparisons useful.

The work is coherent enough on its own terms, with clear empirical grounding, to deserve a serious referee even though revisions for real-world validation and more analysis would be expected.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes FCBV-Net, a Feature-Conditioned Bimanual Value Network that operates on 3D point clouds for category-level robotic garment smoothing. It conditions bimanual action value prediction on pre-trained, frozen dense geometric features to promote robustness to intra-category variations, with only the downstream components trained for the task-specific policy. In PyFlex simulations on the CLOTH3D dataset, FCBV-Net reports an 11.5% efficiency drop on unseen garments (compared to 96.2% for a 2D image-based baseline) and 89% final coverage (outperforming 83% for a 3D correspondence-based baseline using identical features but a fixed primitive), attributing the gains to the decoupling of geometric understanding from bimanual action value learning.

Significance. If the central claim holds, the work provides a useful separation of concerns for deformable object manipulation, allowing pre-trained geometric features to handle category-level variations while learning focuses on synergistic bimanual value prediction. The public release of code, videos, and supplementary materials is a clear strength for reproducibility. The simulation results on held-out garments offer moderate evidence for improved generalization over the reported baselines, though the overall significance for robotic applications remains limited by the absence of real-world validation and detailed statistical analysis.

major comments (2)

Method section: The central claim that decoupling geometric understanding from bimanual action value learning enables better category-level generalization rests on the assumption that pre-trained frozen dense geometric features from 3D point clouds are sufficiently informative and robust across intra-category CLOTH3D variations without adaptation. No ablations are described that compare frozen vs. fine-tuned features or test alternative backbones, leaving open the possibility that performance (11.5% efficiency drop, 89% coverage) is an artifact of the specific feature extractor rather than the proposed conditioning architecture.
Experiments section: The quantitative gains over baselines are presented as evidence for superior generalization, but the results lack reported statistical significance, variance across runs, number of trials, or hyperparameter sensitivity analysis. This weakens the ability to interpret the 11.5% vs. 96.2% efficiency drop and 89% vs. 83% coverage as robust support for the decoupling hypothesis.

minor comments (2)

Abstract: The metric 'Steps80' is referenced without definition or pointer to its precise formulation; this should be clarified when first introduced in the main text.
The manuscript would benefit from an explicit statement of the pre-training dataset and backbone architecture used for the dense geometric features to allow readers to assess transfer assumptions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for their insightful comments, which have helped us identify areas to improve the clarity and rigor of our work. We address the major comments point by point below and indicate the revisions we plan to make.

Referee: Method section: The central claim that decoupling geometric understanding from bimanual action value learning enables better category-level generalization rests on the assumption that pre-trained frozen dense geometric features from 3D point clouds are sufficiently informative and robust across intra-category CLOTH3D variations without adaptation. No ablations are described that compare frozen vs. fine-tuned features or test alternative backbones, leaving open the possibility that performance (11.5% efficiency drop, 89% coverage) is an artifact of the specific feature extractor rather than the proposed conditioning architecture.

Authors: We agree that ablations on the feature extractor would provide additional support for our central claim. In the revised manuscript, we will add experiments comparing frozen versus fine-tuned features using the same backbone. Additionally, we will evaluate an alternative pre-trained feature backbone to show that the performance gains stem from the proposed conditioning and bimanual value prediction architecture rather than the specific choice of features.

revision: yes

Referee: Experiments section: The quantitative gains over baselines are presented as evidence for superior generalization, but the results lack reported statistical significance, variance across runs, number of trials, or hyperparameter sensitivity analysis. This weakens the ability to interpret the 11.5% vs. 96.2% efficiency drop and 89% vs. 83% coverage as robust support for the decoupling hypothesis.

Authors: We agree that including statistical details would strengthen the presentation of our results. In the revised manuscript, we will report the number of evaluation trials, include error bars or standard deviations from multiple runs, perform statistical significance tests on the key metrics, and provide a hyperparameter sensitivity analysis in the supplementary materials.

revision: yes
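
As a rough illustration of the kind of reporting promised here, the sketch below computes a mean, a sample standard deviation, and a percentile bootstrap confidence interval over per-trial results. The helper names, the number of resamples, and the coverage values are assumptions made for the example; none of them come from the paper or its planned revision.

```python
import random
import statistics

def mean_and_std(values):
    """Sample mean and sample standard deviation across trials."""
    return statistics.mean(values), statistics.stdev(values)

def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical per-trial final coverage values, not results from the paper.
coverage_trials = [0.91, 0.87, 0.90, 0.88, 0.86, 0.92, 0.89, 0.90]
mean, std = mean_and_std(coverage_trials)
ci_low, ci_high = bootstrap_ci(coverage_trials)
```

Reporting the interval alongside the point estimate would let a reader judge whether a gap such as 89% versus 83% coverage is larger than the run-to-run spread.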

Circularity Check

0 steps flagged

No significant circularity detected; empirical claims rest on held-out evaluation

The paper's central claim—that decoupling geometric feature extraction from bimanual value learning improves category-level generalization—is advanced via an empirical architecture (pre-trained frozen dense features from 3D point clouds feeding a trainable value head) and validated on held-out garments from the CLOTH3D dataset in PyFlex simulation. Comparisons to a 2D baseline and a 3D correspondence baseline using identical per-point features but a fixed primitive provide external grounding. No load-bearing equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text; the design choice is presented as a hypothesis tested against independent test splits rather than derived by construction from its own inputs.
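
The held-out evaluation this rationale relies on requires that no garment instance appears in both training and test data. A minimal sketch of such an instance-level split follows. The ID format, the 20% test fraction, and the `split_by_instance` helper are hypothetical and do not describe how the paper partitions CLOTH3D.

```python
import random

def split_by_instance(garment_ids, test_fraction=0.2, seed=0):
    """Split unique garment IDs so that no instance appears on both sides."""
    unique_ids = sorted(set(garment_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_test = max(1, round(test_fraction * len(unique_ids)))
    return set(unique_ids[n_test:]), set(unique_ids[:n_test])

# Hypothetical episode log: each entry names the garment mesh used in that episode.
episodes = [f"garment_{i % 10:03d}" for i in range(50)]
train_ids, test_ids = split_by_instance(episodes)
train_episodes = [g for g in episodes if g in train_ids]
test_episodes = [g for g in episodes if g in test_ids]
```

Splitting by garment ID rather than by episode is what keeps several episodes of the same mesh from leaking across the train and test boundary.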

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that frozen pre-trained geometric features are adequate for the smoothing task and that simulation results reflect meaningful generalization; no new physical entities are introduced.

free parameters (1)

Network hyperparameters and training schedule
Standard deep RL training choices that are fitted during optimization but not enumerated in the abstract.

axioms (1)

domain assumption: Pre-trained dense geometric features capture all necessary intra-category variations for bimanual smoothing
Invoked when the method freezes these features and trains only downstream components.

pith-pipeline@v0.9.0 · 5794 in / 1357 out tokens · 42267 ms · 2026-05-19T01:12:13.308879+00:00 · methodology
