Representation Learning on a Multi-Wavelength Quasar Dataset

Authors

Rodrigue, Kurtis

Issue Date

2023

Type

Thesis

Keywords

Chandra, Quasar, Representation Learning, SDSS

Abstract

Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP) are among the most common representation learning techniques used across scientific fields. Representation learning algorithms reduce the dimensionality of large datasets, producing lower-dimensional representations that help scientists understand the data. They are well suited to astronomy because they can reveal hidden patterns and characteristics of a given dataset. Within astronomy, the field of cosmology focuses on the most distant observable objects to understand the process of galactic evolution. Quasars are a class of objects explained by supermassive black holes (hundreds of millions of solar masses) that accrete matter at the centers of distant galaxies, causing them to emit large amounts of radiation across the electromagnetic spectrum. Observations have been made at various wavelengths of light to gain a better understanding of the underlying pattern. For this thesis, we use data from the Sloan Digital Sky Survey (SDSS) at visible wavelengths, supplemented with X-ray observations from the Chandra X-ray Observatory (CXO). Representation learning shows that the Eddington ratio is the most defining feature, followed by an anti-correlation between iron and oxygen that possibly denotes orientation. Subsets of the dataset are also examined for additional information about the encoding process and to highlight characteristics of the overall dataset. Understanding the population of quasars as a whole helps astrophysicists understand the evolution of galaxies.
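
As an illustration of the dimensionality reduction described in the abstract, the following is a minimal sketch of PCA computed through a singular value decomposition. It is not the thesis's pipeline: the data are synthetic, and the function name `pca`, the six made-up features, and the single hidden driving variable are all assumptions chosen only to show how a dominant component can emerge from correlated measurements.

```python
import numpy as np


def pca(X, n_components):
    """Project the rows of X onto its top principal components using SVD."""
    # PCA is defined on mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance carried by each axis, as a fraction of the total variance.
    explained_variance = S**2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    return X_centered @ components.T, components, ratio[:n_components]


# Synthetic stand-in data (hypothetical, NOT the SDSS/Chandra sample):
# 500 objects with 6 features, all driven by one hidden variable plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
loadings = np.array([[1.0, -0.8, 0.6, 0.9, -0.5, 0.7]])
X = latent @ loadings + 0.1 * rng.normal(size=(500, 6))

embedding, components, ratio = pca(X, n_components=2)
print("embedding shape:", embedding.shape)
print("explained variance ratio:", ratio)
print("loadings of first component:", components[0])
```

In this toy setup the first component absorbs most of the variance because every feature depends on the same hidden variable, and features with opposite-sign loadings appear anti-correlated along that axis. UMAP, the nonlinear technique named in the abstract, would require a separate library and is not shown here.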
