We propose a new sequential classification model for astronomical objects based on a recurrent convolutional neural network (RCNN) which uses sequences of images as inputs. This approach avoids the computation of light curves or difference images. This is the first time that sequences of images are used directly for the classification of variable objects in astronomy. The second contribution of this work is the image simulation process. We generate synthetic image sequences which take into account the instrumental and observing conditions, yielding a realistic, unevenly sampled set of movies with variable noise for each astronomical object. The simulated data set is used to train our RCNN classifier. This approach allows us to generate data sets to train and test our RCNN model for different astronomical surveys and telescopes. Moreover, using a simulated data set is faster and more adaptable to different surveys and classification tasks. We aim to build a simulated data set whose distribution is close enough to that of the real data set that fine tuning can match the two distributions. To test the RCNN classifier trained with the synthetic data set, we used real-world data from the High cadence Transient Survey (HiTS), obtaining an average recall of 85%, improved to 94% after performing fine tuning with 10 real samples per class. We compare the results of our RCNN model with those of a light curve random forest classifier. On the HiTS data set, the proposed RCNN with fine tuning performs similarly to the light curve random forest classifier trained on an augmented training set with 10 real samples per class. The RCNN approach presents several advantages in an alert stream classification scenario, such as a reduction of the data pre-processing, faster online evaluation, and easier performance improvement using a few real data samples. The results obtained encourage us to use the proposed method for astronomical alert broker systems that will process alert streams generated by new telescopes such as the Large Synoptic Survey Telescope.
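The abstract above reports an "average recall" without defining it. A minimal sketch of one common reading, the unweighted mean of per-class recall (macro-averaged recall), is given here; the function name, class labels, and data are hypothetical and are not taken from the paper or from HiTS.

```python
from collections import defaultdict


def average_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (macro-averaged recall)."""
    hits = defaultdict(int)    # correctly classified objects per true class
    totals = defaultdict(int)  # number of objects per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    # Every class counts equally, regardless of how many objects it holds.
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical labels for illustration only (not HiTS data).
y_true = ["SN", "SN", "RRL", "RRL", "RRL", "RRL"]
y_pred = ["SN", "RRL", "RRL", "RRL", "RRL", "SN"]
score = average_recall(y_true, y_pred)  # (1/2 + 3/4) / 2
```

Under this assumption, rare classes weigh as much as common ones, which matters when classes of variable objects are strongly imbalanced.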
Ralph, Nicholas O.; Norris, Ray P.; Fang, Gu; Park, Laurence A. F.; Galvin, Timothy J.; Alger, Matthew J.; Andernach, Heinz; Lintott, Chris; Rudnick, Lawrence; Shabala, Stanislav; Wong, O. Ivy
doi: 10.1088/1538-3873/ab213d; pmid: N/A
This paper demonstrates a novel and efficient unsupervised clustering method with the combination of a self-organizing map (SOM) and a convolutional autoencoder. The rapidly increasing volume of radio-astronomical data has increased demand for machine-learning methods as solutions to classification and outlier detection. Major astronomical discoveries are unplanned and found in the unexpected, making unsupervised machine learning highly desirable by operating without assumptions and labeled training data. Our approach shows SOM training time is drastically reduced and high-level features can be clustered by training on auto-encoded feature vectors instead of raw images. Our results demonstrate this method is capable of accurately separating outliers on a SOM with neighborhood similarity and K-means clustering of radio-astronomical features. We present this method as a powerful new approach to data exploration by providing a detailed understanding of the morphology and relationships of Radio Galaxy Zoo (RGZ) data set image features which can be applied to new radio survey data.
Norris, Ray P.; Salvato, M.; Longo, G.; Brescia, M.; Budavari, T.; Carliles, S.; Cavuoti, S.; Farrah, D.; Geach, J.; Luken, K.; Musaeva, A.; Polsterer, K.; Riccio, G.; Seymour, N.; Smolčić, V.; Vaccari, M.; Zinn, P.
doi: 10.1088/1538-3873/ab0f7b; pmid: N/A
Future radio surveys will generate catalogs of tens of millions of radio sources, for which redshift estimates will be essential to achieve many of the science goals. However, spectroscopic data will be available for only a small fraction of these sources, and in most cases even the optical and infrared photometry will be of limited quality. Furthermore, radio sources tend to be at higher redshift than most optical sources (most radio surveys have a median redshift greater than 1), and so a significant fraction of radio source hosts differ from those for which most photometric redshift templates are designed. We therefore need to develop new techniques for estimating the redshifts of radio sources. As a starting point in this process, we evaluate a number of machine-learning techniques for estimating redshift, together with a conventional template-fitting technique. We pay special attention to how the performance is affected by the incompleteness of the training sample and by sparseness of the parameter space or by limited availability of ancillary multiwavelength data. As expected, we find that the quality of the photometric redshifts degrades as the quality of the photometry decreases, but that even with the limited quality of photometry available for all-sky surveys, useful redshift information is available for the majority of sources, particularly at low redshift. We find that a template-fitting technique performs best in the presence of high-quality and almost complete multi-band photometry, especially if radio sources that are also X-ray emitting are treated separately, using specific templates and priors. When we reduced the quality of photometry to match that available for the EMU all-sky radio survey, the quality of the template-fitting degraded and became comparable to some of the machine-learning methods. Machine learning techniques currently perform better at low redshift than at high redshift, because of incompleteness of the currently available training data at high redshifts.
Vilalta, Ricardo; Gupta, Kinjal Dhar; Boumber, Dainis; Meskhi, Mikhail M.
doi: 10.1088/1538-3873/aaf1fc; pmid: N/A
The ability to build a model on a source task and subsequently adapt this model to a new target task is a pervasive need in many astronomical applications. The problem is generally known in the machine learning field as transfer learning, where domain adaptation is a popular scenario. An example is to build a predictive model on spectroscopic data to identify Type Ia supernovae (SNe Ia), while subsequently trying to adapt such a model to photometric data. In this paper we propose a new general approach to domain adaptation which does not rely on the proximity of source and target distributions. Instead we simply assume a strong similarity in model complexity across domains, and use active learning to mitigate the dependence on source examples. Our work leads to a new formulation for the likelihood as a function of empirical error using a theoretical learning bound; the result is a novel mapping from generalization error to a likelihood estimation. Results using two real astronomical problems, SN Ia classification and identification of Mars landforms, show two main advantages of our approach: increased performance accuracy and substantial savings in computational cost.
Pérez-Carrasco, M.; Cabrera-Vives, G.; Martinez-Marin, M.; Cerulo, P.; Demarco, R.; Protopapas, P.; Godoy, J.; Huertas-Company, M.
doi: 10.1088/1538-3873/aaeeb4; pmid: N/A
We present visual-like morphologies over 16 photometric bands, from ultraviolet to near-infrared, for 8412 galaxies in the Cluster Lensing And Supernova survey with Hubble (CLASH) obtained using a convolutional neural network (ConvNet) model. Our model follows the Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS) main morphological classification scheme, obtaining the probability for each galaxy at each CLASH band of being spheroid, disk, irregular, point source, or unclassifiable. Our catalog contains morphologies for each galaxy with Hmag < 24.5 in every filter where the galaxy is observed. We trained an initial ConvNet model using approximately 7500 expert eyeball labels from CANDELS. We created eyeball labels for 100 randomly selected galaxies in each of the 16 CLASH filters (1600 galaxy images in total), where each image was classified by at least five of us. We use these labels to fine-tune the network to accurately predict labels for the CLASH data and to evaluate the performance of our model. We achieve a root-mean-square error of 0.0991 on the test set. We show that our proposed fine-tuning technique reduces the number of labeled images needed for training, as compared to directly training over the CLASH data, and achieves a better performance. This approach is very useful for minimizing eyeball labeling efforts when classifying unlabeled data from new surveys, and will become particularly useful for massive data sets from near-future surveys such as EUCLID or the LSST. Our catalog consists of predicted probabilities for each galaxy by morphology in their different bands and is made publicly available at http://www.inf.udec.cl/~guille/data/Deep-CLASH.csv.
Chattopadhyay, Tanuka; Fraix-Burnet, Didier; Mondal, Saptarshi
doi: 10.1088/1538-3873/aaf7c6; pmid: N/A
Subjective classification of galaxies can mislead us in the quest for the formation and evolution of galaxies since this is necessarily limited to a few features. The human mind is unable to comprehend the complex correlations in a manifold parameter space, and multivariate analyses are the best tools for understanding the differences among various kinds of objects. In this series of papers, an objective classification of 362,923 galaxies from the Value Added Galaxy Catalog is carried out with the help of two methods of multivariate analysis. First, Independent Component Analysis is used to determine a set of derived independent components that are linear combinations of 47 observed features (namely, ionized lines, Lick indices, photometric and morphological properties, star formation rates, etc.) of the galaxies. Subsequently, a K-means cluster analysis is applied to the nine independent components to obtain 10 distinct, homogeneous groups. In this first paper, we describe the methods and the main results. It appears that the nine Independent Components represent a complete physical description of galaxies (velocity dispersion, ionization, metallicity, surface brightness, and structure). We find that our 10 groups can be essentially placed into traditional and empirical classes (from color–magnitude and emission-line diagnostic diagrams, early versus late types) despite the classical corresponding features (color, line ratios, and morphology) not being significantly correlated with the nine Independent Components. A more detailed physical interpretation of the groups will be performed in subsequent papers.
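The clustering stage described above can be illustrated with a short sketch of K-means (Lloyd's algorithm). It assumes the Independent Component Analysis step has already been done and takes hypothetical 2D component vectors as input, whereas the paper uses nine components and 10 groups. The farthest-point initialization is a simplification chosen here to keep the example deterministic; it is not the authors' stated procedure.

```python
import math


def kmeans(points, k, n_iter=100):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    return labels, centers


# Hypothetical independent-component vectors (2D here for readability).
components = [
    (0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
    (5.0, 5.0), (5.2, 5.0), (5.0, 5.2),
]
labels, centers = kmeans(components, k=2)
```

With real data, the number of groups and the initialization both affect the result, so repeated runs and a cluster-validity criterion would normally be needed.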
doi: 10.1088/1538-3873/aaeeec; pmid: N/A
We develop a novel method based on machine-learning principles to achieve optimal initiation of CPU-intensive computations for forward asteroseismic modeling in a multi-dimensional parameter space. A deep neural network is trained on a precomputed asteroseismology grid containing about 62 million coherent oscillation-mode frequencies derived from stellar evolution models. These models are representative of the core-hydrogen-burning stage of intermediate-mass and high-mass stars. The evolution models constitute a 6D parameter space and their predicted low-degree pressure- and gravity-mode oscillations are scanned using a genetic algorithm. A software pipeline is created to find the best-fitting stellar parameters for a given set of observed oscillation frequencies. The proposed method finds the optimal regions in the 6D parameter space in less than a minute, hence providing the optimal starting point for further and more detailed forward asteroseismic modeling in a high-dimensional context. We test and apply the method to seven pulsating stars that were previously modeled asteroseismically by classical grid-based forward modeling based on a χ² statistic, and obtain good agreement with past results. Our deep-learning methodology opens up the application of asteroseismic modeling in +6D parameter space for thousands of stars pulsating in coherent modes with long lifetimes observed by the Kepler space telescope and to be discovered with the TESS and PLATO space missions, while applications so far have been done star-by-star for only a handful of cases. Our method is open source and can be freely used by anyone.
Luken, Kieran J.; Norris, Ray P.; Park, Laurence A. F.
doi: 10.1088/1538-3873/aaea17; pmid: N/A
In the near future, all-sky radio surveys are set to produce catalogues of tens of millions of sources with limited multiwavelength photometry. Spectroscopic redshifts will only be possible for a small fraction of these new-found sources. In this paper, we provide the first in-depth investigation into the use of k-nearest-neighbor (kNN) regression for the estimation of redshift of these sources. We use Australia Telescope Large Area Survey (ATLAS) radio data, combined with Spitzer Wide-Area Infrared Extragalactic Survey infrared, Dark Energy Survey optical, and Australian Dark Energy Survey spectroscopic survey data. We then reduce the depth of photometry to match what is expected from the upcoming Evolutionary Map of the Universe survey, testing against both data sets. To examine the generalization of our methods, we test one of the subfields of ATLAS against the other. We achieve an outlier rate of ∼10% across all tests, showing that the kNN regression algorithm is an acceptable method of estimating redshift, and would perform better given a sample training set with uniform redshift coverage.
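As a rough illustration of the technique named above, the sketch below implements plain kNN regression and an outlier-rate metric. The feature vectors and redshifts are hypothetical, the value of k is arbitrary, and the outlier criterion |z_phot - z_spec| / (1 + z_spec) > 0.15 is a common convention assumed here, since the abstract does not state the definition used.

```python
import math


def knn_predict(train_x, train_z, query, k=3):
    """Estimate redshift as the mean redshift of the k nearest training sources."""
    # Euclidean distance in feature space (e.g. magnitudes or colors).
    neighbors = sorted(
        (math.dist(x, query), z) for x, z in zip(train_x, train_z)
    )[:k]
    return sum(z for _, z in neighbors) / len(neighbors)


def outlier_rate(z_spec, z_phot, threshold=0.15):
    """Fraction of sources with |z_phot - z_spec| / (1 + z_spec) > threshold."""
    flags = [
        abs(zp - zs) / (1.0 + zs) > threshold for zs, zp in zip(z_spec, z_phot)
    ]
    return sum(flags) / len(flags)


# Hypothetical photometric features and spectroscopic redshifts (not ATLAS data).
train_x = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
train_z = [0.2, 0.3, 0.25, 1.0, 1.2, 1.1]

z_low = knn_predict(train_x, train_z, (0.05, 0.05))   # neighbors: first three sources
z_high = knn_predict(train_x, train_z, (1.0, 1.05))   # neighbors: last three sources
rate = outlier_rate([0.25, 1.1, 0.5, 2.0], [0.25, 1.1, 0.9, 2.1])
```

Because the estimate is an average over neighbors in the training set, regions of feature space with few training sources (typically high redshift) are poorly constrained, which is consistent with the abstract's remark about uniform redshift coverage.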
doi: 10.1088/1538-3873/aae7fc; pmid: N/A
The data torrent unleashed by current and upcoming astronomical surveys demands scalable analysis methods. Many machine learning approaches scale well, but separating the instrument measurement from the physical effects of interest, dealing with variable errors, and deriving parameter uncertainties is often an afterthought. Classic forward-folding analyses with Markov chain Monte Carlo or nested sampling enable parameter estimation and model comparison, even for complex and slow-to-evaluate physical models. However, these approaches require independent runs for each data set, implying an unfeasible number of model evaluations in the Big Data regime. Here I present a new algorithm, collaborative nested sampling, for deriving parameter probability distributions for each observation. Importantly, the number of physical model evaluations scales sub-linearly with the number of data sets, and no assumptions about homogeneous errors, Gaussianity, the form of the model, or heterogeneity/completeness of the observations need to be made. Collaborative nested sampling has immediate applications in speeding up analyses of large surveys, integral-field-unit observations, and Monte Carlo simulations.