A review on deep learning techniques for 3D sensed data classification
Pith reviewed 2026-05-25 00:42 UTC · model grok-4.3
The pith
Deep learning methods for 3D sensed data classification fall into four main architecture categories.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that the state-of-the-art deep learning architectures for unstructured Euclidean 3D data can be grouped into RGB-D based methods, multi-view methods, volumetric methods, and fully end-to-end architecture designs, each supported by specific datasets, and that mapping these categories clarifies the path toward more capable classification systems.
What carries the argument
The four-category taxonomy of RGB-D, multi-view, volumetric, and end-to-end architecture designs for processing 3D sensed data.
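To make one branch of the taxonomy concrete, the following is a minimal sketch of the preprocessing step that volumetric methods generally rely on: mapping an unordered point cloud onto a regular occupancy grid. It is an illustration only and is not taken from the reviewed paper. The names `voxelize`, `occupied_cells`, and `grid_size` are hypothetical, and the binary occupancy encoding is an assumption chosen for brevity.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Grid = List[List[List[int]]]


def voxelize(points: List[Point], grid_size: int = 4) -> Grid:
    """Map an unordered point cloud onto a binary occupancy grid (grid_size^3 cells)."""
    if not points:
        raise ValueError("point cloud is empty")
    # Axis-aligned bounding box of the cloud.
    mins = [min(p[axis] for p in points) for axis in range(3)]
    maxs = [max(p[axis] for p in points) for axis in range(3)]
    # One cubic cell size for all axes so the shape is not distorted;
    # fall back to 1.0 when every point coincides.
    extent = max(maxs[axis] - mins[axis] for axis in range(3)) or 1.0
    grid = [[[0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)]
    for p in points:
        idx = []
        for axis in range(3):
            cell = int((p[axis] - mins[axis]) / extent * grid_size)
            idx.append(min(cell, grid_size - 1))  # clamp points on the upper boundary
        grid[idx[0]][idx[1]][idx[2]] = 1
    return grid


def occupied_cells(grid: Grid) -> int:
    """Count the cells that contain at least one point."""
    return sum(value for plane in grid for row in plane for value in row)


# A toy cloud: two of the four points fall into the same cell.
cloud = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.1, 0.1, 0.1), (1.0, 0.0, 0.5)]
grid = voxelize(cloud, grid_size=4)

# The grid does not depend on the order in which points are listed.
shuffled_grid = voxelize(list(reversed(cloud)), grid_size=4)
```

The sketch shows the trade-off that motivates the other categories: the grid gives a fixed-size, order-independent input suitable for 3D convolutions, but nearby points collapse into a single cell, so fine detail is lost unless `grid_size` (and with it memory use) grows.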
If this is right
- Indoor robotics navigation systems can adopt more reliable 3D classification once the reviewed methods mature.
- National-scale remote sensing applications gain automated understanding of sensed data.
- Researchers gain documented datasets for benchmarking new classification models.
- Future work concentrates on the research areas the discussion identifies as most valuable.
Where Pith is reading between the lines
- The taxonomy supplies a baseline that later reviews can use to measure how the field has progressed.
- Hybrid methods that combine elements from more than one category may emerge as a natural next step.
- Periodic updates to the dataset list would keep the overview useful as new collections appear.
Load-bearing premise
That the four categories and the listed datasets together give a representative picture of the field without major omissions at the time of writing.
What would settle it
A widely used deep learning method for 3D data classification that cannot be placed in any of the four architecture categories.
Original abstract
Over the past decade deep learning has driven progress in 2D image understanding. Despite these advancements, techniques for automatic 3D sensed data understanding, such as point clouds, is comparatively immature. However, with a range of important applications from indoor robotics navigation to national scale remote sensing there is a high demand for algorithms that can learn to automatically understand and classify 3D sensed data. In this paper we review the current state-of-the-art deep learning architectures for processing unstructured Euclidean data. We begin by addressing the background concepts and traditional methodologies. We review the current main approaches including; RGB-D, multi-view, volumetric and fully end-to-end architecture designs. Datasets for each category are documented and explained. Finally, we give a detailed discussion about the future of deep learning for 3D sensed data, using literature to justify the areas where future research would be most valuable.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper is a survey reviewing deep learning techniques for 3D sensed data classification. It covers background concepts and traditional methodologies, then examines four main approach categories (RGB-D, multi-view, volumetric, and fully end-to-end architectures), documents associated datasets, and concludes with a literature-based discussion of future research directions in the field.
Significance. If the review accurately and representatively synthesizes the literature, it would offer a useful consolidation of the state of 3D deep learning as of 2019, particularly for applications in robotics and remote sensing where 3D data processing lags behind 2D. The explicit documentation of datasets and forward-looking discussion add practical value for researchers entering the area.
minor comments (3)
- [Abstract] Grammatical error in 'such as point clouds, is comparatively immature' (subject-verb agreement); should be 'are comparatively immature'.
- [Abstract] Unnecessary semicolon in 'including; RGB-D'; rephrase to 'including RGB-D, multi-view, volumetric and fully end-to-end architecture designs' for clarity.
- [Abstract] The manuscript should ensure consistent terminology between the title ('3D sensed data classification') and abstract ('processing unstructured Euclidean data') to avoid reader confusion.
Simulated Author's Rebuttal
We thank the referee for their supportive summary, recognition of the paper's potential value for researchers in robotics and remote sensing, and recommendation of minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity: pure literature review with no derivations or predictions
This is a survey paper that summarizes background concepts, existing architectures (RGB-D, multi-view, volumetric, end-to-end), datasets, and future directions drawn from external literature. No original equations, fitted parameters, predictions, or derivation chains are present. All claims are descriptive citations of prior work; the representativeness of coverage is an external validity issue, not a circularity issue. No steps reduce to self-definition, fitted inputs, or self-citation chains.