Recognition: unknown
Feature importance for machine learning redshifts applied to SDSS galaxies
read the original abstract
We present an analysis of importance feature selection applied to photometric redshift estimation using the machine learning architecture Decision Trees with the ensemble learning routine Adaboost (hereafter RDF). We select a list of 85 easily measured (or derived) photometric quantities (or `features') and spectroscopic redshifts for almost two million galaxies from the Sloan Digital Sky Survey Data Release 10. After identifying which features have the most predictive power, we use standard artificial Neural Networks (aNN) to show that the addition of these features, in combination with the standard magnitudes and colours, improves the machine learning redshift estimate by 18% and decreases the catastrophic outlier rate by 32%. We further compare the redshift estimate using RDF with those from two different aNNs, and with photometric redshifts available from the SDSS. We find that the RDF requires orders of magnitude less computation time than the aNNs to obtain a machine learning redshift while reducing both the catastrophic outlier rate by up to 43%, and the redshift error by up to 25%. When compared to the SDSS photometric redshifts, the RDF machine learning redshifts both decreases the standard deviation of residuals scaled by 1/(1+z) by 36% from 0.066 to 0.041, and decreases the fraction of catastrophic outliers by 57% from 2.32% to 0.99%.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
Machine Learning Techniques for Astrophysics and Cosmology: Photometric Redshifts
AI techniques for photometric redshift estimation have converged and are now limited by the size, systematics, and selection effects in spectroscopic training samples rather than by methodology.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.