arxiv: 1907.10844 · v1 · pith:KXVQNR4Enew · submitted 2019-07-25 · 💻 cs.CV

PU-GAN: a Point Cloud Upsampling Adversarial Network

Pith reviewed 2026-05-24 16:29 UTC · model grok-4.3

classification 💻 cs.CV
keywords point cloud upsampling · generative adversarial network · 3D surface reconstruction · self-attention · point distribution uniformity · range scan processing

The pith

PU-GAN uses a generative adversarial network to learn point distributions that upsample sparse scans into uniform sets close to object surfaces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces PU-GAN to address sparse, noisy, and non-uniform point clouds from range scans by training a GAN that samples varied distributions from latent space. The generator incorporates an up-down-up expansion unit for feature upsampling with error feedback plus a self-attention unit, while the loss combines adversarial, uniformity, and reconstruction terms. Evaluations claim the resulting points achieve better distribution uniformity, surface proximity, and downstream 3D reconstruction quality than prior methods. A reader would care because improved upsampling directly raises the fidelity of models built from real sensor data without requiring denser acquisition hardware.

Core claim

PU-GAN formulates point cloud upsampling inside a GAN framework: the generator learns a rich variety of point distributions from the latent space and applies them to patches on object surfaces. Three components realize this. An up-down-up expansion unit upsamples point features with error feedback and self-correction. A self-attention unit enhances feature integration. A compound loss with adversarial, uniform, and reconstruction terms encourages the discriminator to learn more latent patterns and improves the uniformity of the output points. The paper reports that the resulting point sets surpass state-of-the-art methods in distribution uniformity, proximity-to-surface, and 3D reconstruction quality.
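The up-down-up idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `up_features` and `down_features` are hypothetical fixed stand-ins for the learned operators, and the sign convention of the residual is chosen for the sketch rather than taken from the paper.

```python
def up_features(features, ratio):
    """Hypothetical stand-in for a learned up-feature operator.

    Each feature vector is copied `ratio` times; copy k is scaled by
    (1 + 0.1 * k) so that the copies differ from one another.
    """
    return [[(1.0 + 0.1 * k) * v for v in f] for f in features for k in range(ratio)]


def down_features(features, ratio):
    """Hypothetical stand-in for a learned down-feature operator:
    average each consecutive group of `ratio` feature vectors."""
    return [
        [sum(channel) / ratio for channel in zip(*features[i:i + ratio])]
        for i in range(0, len(features), ratio)
    ]


def round_trip_error(features, expanded, ratio):
    """Largest absolute gap between the input and the down-sampled expansion."""
    back = down_features(expanded, ratio)
    return max(abs(a - b) for f, g in zip(features, back) for a, b in zip(f, g))


def up_down_up(features, ratio):
    """Expand N feature vectors to ratio * N with one round of error feedback."""
    first_up = up_features(features, ratio)   # up
    back = down_features(first_up, ratio)     # down
    # Error feedback: what the up/down round trip failed to preserve.
    residual = [[a - b for a, b in zip(f, g)] for f, g in zip(features, back)]
    correction = up_features(residual, ratio)  # up again, on the residual
    # Self-correction: add the upsampled residual to the first expansion.
    return [[u + c for u, c in zip(fu, fc)] for fu, fc in zip(first_up, correction)]


features = [[1.0, 2.0], [3.0, -1.0]]
expanded = up_down_up(features, ratio=4)
print(len(expanded), "feature vectors from", len(features))
```

With these toy operators the corrected expansion maps back to the input more faithfully than the first expansion alone, which is the behaviour the error-feedback step is meant to encourage.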

What carries the argument

The up-down-up expansion unit that upsamples point features with error feedback and self-correction, augmented by a self-attention unit and driven by a compound adversarial-uniform-reconstruction loss.
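As a rough illustration of what a self-attention unit over per-point features does, the sketch below uses generic dot-product attention with a residual connection. The projection matrices `w_q`, `w_k`, `w_v` and the scale `gamma` are hypothetical parameters; the paper's exact layer layout is not described in this summary, so this should not be read as its implementation.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, w_q, w_k, w_v, gamma=1.0):
    """Mix every point's feature with all others, weighted by similarity.

    features: N x C list of rows; w_q, w_k: C x D; w_v: C x C.
    Returns features + gamma * attention(features), an N x C list of rows.
    """
    q = matmul(features, w_q)
    k = matmul(features, w_k)
    v = matmul(features, w_v)
    k_transposed = [list(col) for col in zip(*k)]
    weights = [softmax(row) for row in matmul(q, k_transposed)]  # N x N, rows sum to 1
    attended = matmul(weights, v)
    return [[f + gamma * a for f, a in zip(fr, ar)] for fr, ar in zip(features, attended)]


identity = [[1.0, 0.0], [0.0, 1.0]]
points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(self_attention(points, identity, identity, identity, gamma=0.5))
```

Setting `gamma` to zero reduces the unit to the identity, which is a common way to let a network start from unmodified features and learn how much attention to blend in.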

If this is right

  • Upsampled point sets exhibit higher distribution uniformity across surface patches.
  • Generated points lie closer to the underlying object surface than outputs from earlier networks.
  • Downstream 3D reconstructions from the upsampled clouds achieve measurably higher quality.
  • The method outperforms existing upsampling approaches on both visual inspection and numerical benchmarks.
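To make the first two claims concrete, here is a minimal sketch of how uniformity and surface proximity could be measured. Both measures are simplified stand-ins chosen for illustration, not the metrics used in the paper: uniformity is scored by the spread of nearest-neighbour distances, and proximity is computed against a unit sphere because its distance function is known in closed form.

```python
import math


def nearest_neighbor_distances(points):
    """Distance from every point to its closest other point (brute force)."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def uniformity_score(points):
    """Coefficient of variation of nearest-neighbour distances.

    0 means perfectly even spacing; larger values mean clustering or gaps.
    """
    dists = nearest_neighbor_distances(points)
    mean = sum(dists) / len(dists)
    variance = sum((d - mean) ** 2 for d in dists) / len(dists)
    return math.sqrt(variance) / mean


def mean_distance_to_unit_sphere(points):
    """Average absolute distance of each point from the unit sphere surface."""
    return sum(abs(math.hypot(*p) - 1.0) for p in points) / len(points)


grid = [(float(i), float(j), 0.0) for i in range(4) for j in range(4)]
clustered = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
print(uniformity_score(grid), uniformity_score(clustered))
```

A regular grid scores zero under this measure, while a set with one tight cluster and large gaps scores well above zero.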

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same generator architecture might be retrained on different sensor noise profiles to handle varying acquisition conditions.
  • Patch-wise application could be replaced by a global network that processes entire scenes at once.
  • The learned latent distributions may transfer to related tasks such as point cloud denoising or completion.

Load-bearing premise

The GAN can learn a rich variety of point distributions from the latent space that generalize to upsample points over patches on object surfaces when the up-down-up expansion and self-attention units are used.
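Working "over patches" presupposes some way of cutting a point cloud into local neighbourhoods. The sketch below shows one plausible scheme, assumed here for illustration only: seeds are chosen by farthest point sampling, and each patch is the set of Euclidean nearest neighbours of a seed. The paper may define patches differently, for example along the surface rather than in straight-line distance.

```python
import math


def farthest_point_sampling(points, num_seeds):
    """Pick seed indices that are spread out: start at index 0, then
    repeatedly take the point farthest from all seeds chosen so far."""
    seeds = [0]
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(seeds) < num_seeds:
        idx = max(range(len(points)), key=min_dist.__getitem__)
        seeds.append(idx)
        min_dist = [min(d, math.dist(p, points[idx])) for d, p in zip(min_dist, points)]
    return seeds


def extract_patches(points, num_seeds, patch_size):
    """Return one list of point indices per seed: the seed's nearest neighbours."""
    patches = []
    for seed in farthest_point_sampling(points, num_seeds):
        order = sorted(range(len(points)), key=lambda i: math.dist(points[i], points[seed]))
        patches.append(order[:patch_size])
    return patches


line = [(float(i), 0.0, 0.0) for i in range(10)]
print(extract_patches(line, num_seeds=2, patch_size=3))
```

Each patch would then be upsampled independently and the results merged, which is why the generalization of the learned distributions across patches matters for the premise above.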

What would settle it

Quantitative comparison on a held-out set of scanned objects showing that PU-GAN outputs fail to exceed prior methods on uniformity or proximity-to-surface metrics.

Figures

Figures reproduced from arXiv: 1907.10844 by Chi-Wing Fu, Daniel Cohen-Or, Pheng-Ann Heng, Ruihui Li, Xianzhi Li.

Figure 1: Upsampling (b) a point set cropped from (a) the real …
Figure 2: Overview of PU-GAN’s generator and discriminator architecture. Note that …
Figure 4: Illustration of the self-attention unit.
Figure 5: (a) Seed points (black dots) and patches (blue disks) on a …
Figure 6: Example point sets with same number of points (625) but …
Figure 7: Comparing point set upsampling (x4) and surface reconstruction results produced with different methods (c-f) from inputs (a).
Figure 8: Using PU-GAN to upsample real-scanned point cloud data acquired by LiDAR.
Figure 9: Visual comparison of (c) PU-GAN against (b) the base …
Figure 10: Upsampling results by applying PU-GAN to inputs …
Original abstract

Point clouds acquired from range scans are often sparse, noisy, and non-uniform. This paper presents a new point cloud upsampling network called PU-GAN, which is formulated based on a generative adversarial network (GAN), to learn a rich variety of point distributions from the latent space and upsample points over patches on object surfaces. To realize a working GAN network, we construct an up-down-up expansion unit in the generator for upsampling point features with error feedback and self-correction, and formulate a self-attention unit to enhance the feature integration. Further, we design a compound loss with adversarial, uniform and reconstruction terms, to encourage the discriminator to learn more latent patterns and enhance the output point distribution uniformity. Qualitative and quantitative evaluations demonstrate the quality of our results over the state-of-the-arts in terms of distribution uniformity, proximity-to-surface, and 3D reconstruction quality.
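The compound loss named in the abstract can be pictured as a weighted sum of three terms. The sketch below is an assumption-laden illustration, not the paper's formulation: the adversarial term uses a least-squares form, the reconstruction term uses Chamfer distance as a simple stand-in, the uniform term is passed in as a precomputed number, and the weights `w_gan`, `w_rec`, `w_uni` are placeholder values.

```python
import math


def chamfer_distance(a, b):
    """Symmetric mean squared nearest-neighbour distance between two point sets."""
    a_to_b = sum(min(math.dist(p, q) ** 2 for q in b) for p in a) / len(a)
    b_to_a = sum(min(math.dist(q, p) ** 2 for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


def least_squares_generator_term(fake_scores):
    """Adversarial term for the generator: push the discriminator's scores
    on generated patches toward 1 (assumed least-squares form)."""
    return 0.5 * sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)


def compound_generator_loss(fake_scores, generated, target, uniform_term,
                            w_gan=1.0, w_rec=1.0, w_uni=1.0):
    """Weighted sum of adversarial, reconstruction, and uniform terms.
    The default weights are placeholders, not values from the paper."""
    return (w_gan * least_squares_generator_term(fake_scores)
            + w_rec * chamfer_distance(generated, target)
            + w_uni * uniform_term)


target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
generated = [(0.0, 0.1, 0.0), (1.0, -0.1, 0.0)]
print(compound_generator_loss([0.8, 0.9], generated, target, uniform_term=0.05))
```

Keeping the three terms separate makes it easy to ablate one of them, which is the kind of comparison needed to tell how much each term contributes to uniformity.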

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes PU-GAN, a GAN for point cloud upsampling that learns rich point distributions from latent space to upsample patches on object surfaces. It introduces an up-down-up expansion unit in the generator for feature upsampling with error feedback and self-correction, a self-attention unit for feature integration, and a compound loss (adversarial + uniform + reconstruction) to improve output uniformity. The central claim is that qualitative and quantitative evaluations show superior performance over state-of-the-art methods in distribution uniformity, proximity-to-surface, and 3D reconstruction quality.

Significance. If the generalization claims hold under rigorous testing, the work would provide a practical GAN-based approach to handling sparse and non-uniform point clouds from range scans, with the self-attention and expansion units offering concrete architectural contributions to feature handling in unstructured data. The empirical focus on uniformity and reconstruction metrics aligns with needs in 3D vision applications, though the absence of detailed experimental protocols in the provided material limits assessment of broader impact.

major comments (1)
  1. [Abstract] The claim that the compound loss enables the discriminator to learn rich latent patterns that generalize via the up-down-up expansion and self-attention units is load-bearing for the reported gains, yet no stability analysis, diversity metrics, or mitigation for mode collapse (e.g., gradient penalties or spectral normalization) is referenced, leaving the 'rich variety' assertion dependent on an unverified training dynamic.
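For readers unfamiliar with the mitigation the referee mentions, spectral normalization rescales a layer's weight matrix so that its largest singular value is 1, which bounds how sharply the discriminator can react to small input changes. The sketch below estimates that singular value by power iteration. It illustrates the general technique only; nothing in this summary indicates that PU-GAN uses it.

```python
import math


def largest_singular_value(weight, iterations=50):
    """Estimate the largest singular value of a matrix (list of rows)
    by power iteration on W^T W.

    Assumes a non-zero matrix and a start vector that is not orthogonal
    to the leading singular direction.
    """
    rows, cols = len(weight), len(weight[0])
    v = [1.0] * cols
    for _ in range(iterations):
        u = [sum(w * x for w, x in zip(row, v)) for row in weight]               # W v
        v = [sum(weight[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # W^T u
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = [sum(w * x for w, x in zip(row, v)) for row in weight]
    return math.sqrt(sum(x * x for x in u))


def spectrally_normalize(weight):
    """Divide every entry by the estimated largest singular value."""
    sigma = largest_singular_value(weight)
    return [[w / sigma for w in row] for row in weight]


layer = [[3.0, 0.0], [0.0, 1.0]]
print(largest_singular_value(layer))
print(spectrally_normalize(layer))
```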

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comment regarding the GAN training dynamics and the claims in the abstract. We address this point below.

Point-by-point responses
  1. Referee: [Abstract] The claim that the compound loss enables the discriminator to learn rich latent patterns that generalize via the up-down-up expansion and self-attention units is load-bearing for the reported gains, yet no stability analysis, diversity metrics, or mitigation for mode collapse (e.g., gradient penalties or spectral normalization) is referenced, leaving the 'rich variety' assertion dependent on an unverified training dynamic.

    Authors: We acknowledge that the original manuscript does not reference explicit stability analysis, diversity metrics, or standard mode-collapse mitigations such as gradient penalties or spectral normalization. The compound loss was designed to promote diverse latent patterns through the combination of adversarial, uniform, and reconstruction terms, and the reported quantitative gains in uniformity and reconstruction quality provide indirect empirical support that training did not suffer from collapse. However, to strengthen the manuscript, we will revise the abstract to moderate the phrasing around 'rich variety' and add a short discussion subsection on training dynamics (including loss curves and qualitative checks for sample diversity) in the experiments section. This revision will be made without new experiments.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical design with external validation

Full rationale

The paper proposes a GAN architecture (generator with up-down-up expansion and self-attention units, plus compound adversarial-uniform-reconstruction loss) for point cloud upsampling and supports its claims solely through qualitative/quantitative comparisons to external SOTA methods on distribution uniformity, surface proximity, and reconstruction metrics. No derivation chain, equation, or claim reduces by construction to a fitted parameter renamed as prediction, a self-citation load-bearing premise, or any of the enumerated circular patterns; the method is self-contained against independent benchmark evaluations.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard GAN assumptions about learning distributions from latent space and the effectiveness of the custom units; no explicit free parameters or invented entities are detailed in the abstract.

free parameters (1)
  • GAN training hyperparameters
    Typical neural network training involves multiple tunable parameters such as learning rates and loss weights, though none are specified in the abstract.
axioms (1)
  • Domain assumption: GANs can learn rich point distributions from latent space for surface patches
    The generator formulation directly relies on this generative modeling premise.

pith-pipeline@v0.9.0 · 5689 in / 1128 out tokens · 25981 ms · 2026-05-24T16:29:14.772512+00:00 · methodology

