TY - JOUR
T1 - Clean kinematic samples in dwarf spheroidals
T2 - An algorithm for evaluating membership and estimating distribution parameters when contamination is present
AU - Walker, Matthew G.
AU - Mateo, Mario
AU - Olszewski, Edward W.
AU - Sen, Bodhisattva
AU - Woodroofe, Michael
PY - 2009
Y1 - 2009
N2 - We develop an algorithm for estimating parameters of a distribution sampled with contamination. We employ a statistical technique known as "expectation maximization" (EM). Given models for both member and contaminant populations, the EM algorithm iteratively evaluates the membership probability of each discrete data point, then uses those probabilities to update parameter estimates for member and contaminant distributions. The EM approach has wide applicability to the analysis of astronomical data. Here we tailor an EM algorithm to operate on spectroscopic samples obtained with the Michigan-MIKE Fiber System (MMFS) as part of our Magellan survey of stellar radial velocities in nearby dwarf spheroidal (dSph) galaxies. These samples, to be presented in a companion paper, contain discrete measurements of line-of-sight velocity, projected position, and pseudo-equivalent width of the Mg-triplet feature, for 1000-2500 stars per dSph, including some fraction of contamination by foreground Milky Way stars. The EM algorithm uses all of the available data to quantify dSph and contaminant distributions. For distributions (e.g., velocity and Mg-index of dSph stars) assumed to be Gaussian, the EM algorithm returns maximum-likelihood estimates of the mean and variance, as well as the probability that each star is a dSph member. These probabilities can serve as weights in subsequent analyses. Applied to our MMFS data, the EM algorithm identifies more than 5000 stars as probable dSph members. We test the performance of the EM algorithm on simulated data sets that represent a range of sample size, level of contamination, and amount of overlap between dSph and contaminant velocity distributions. 
The simulations establish that for samples ranging from large (N ~ 3000, characteristic of the MMFS samples) to small (N ~ 30, resembling new samples for extremely faint dSphs), the EM algorithm distinguishes members from contaminants and returns accurate parameter estimates much more reliably than conventional methods of contaminant removal (e.g., sigma clipping).
AB - We develop an algorithm for estimating parameters of a distribution sampled with contamination. We employ a statistical technique known as "expectation maximization" (EM). Given models for both member and contaminant populations, the EM algorithm iteratively evaluates the membership probability of each discrete data point, then uses those probabilities to update parameter estimates for member and contaminant distributions. The EM approach has wide applicability to the analysis of astronomical data. Here we tailor an EM algorithm to operate on spectroscopic samples obtained with the Michigan-MIKE Fiber System (MMFS) as part of our Magellan survey of stellar radial velocities in nearby dwarf spheroidal (dSph) galaxies. These samples, to be presented in a companion paper, contain discrete measurements of line-of-sight velocity, projected position, and pseudo-equivalent width of the Mg-triplet feature, for 1000-2500 stars per dSph, including some fraction of contamination by foreground Milky Way stars. The EM algorithm uses all of the available data to quantify dSph and contaminant distributions. For distributions (e.g., velocity and Mg-index of dSph stars) assumed to be Gaussian, the EM algorithm returns maximum-likelihood estimates of the mean and variance, as well as the probability that each star is a dSph member. These probabilities can serve as weights in subsequent analyses. Applied to our MMFS data, the EM algorithm identifies more than 5000 stars as probable dSph members. We test the performance of the EM algorithm on simulated data sets that represent a range of sample size, level of contamination, and amount of overlap between dSph and contaminant velocity distributions. 
The simulations establish that for samples ranging from large (N ~ 3000, characteristic of the MMFS samples) to small (N ~ 30, resembling new samples for extremely faint dSphs), the EM algorithm distinguishes members from contaminants and returns accurate parameter estimates much more reliably than conventional methods of contaminant removal (e.g., sigma clipping).
KW - Galaxies: Dwarf
KW - Galaxies: Individual (Carina, Fornax, Sculptor, Sextans)
KW - Galaxies: Kinematics and dynamics
KW - Local Group
KW - Techniques: Radial velocities
UR - http://www.scopus.com/inward/record.url?scp=64849117009&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=64849117009&partnerID=8YFLogxK
U2 - 10.1088/0004-6256/137/2/3109
DO - 10.1088/0004-6256/137/2/3109
M3 - Article
AN - SCOPUS:64849117009
SN - 0004-6256
VL - 137
SP - 3109
EP - 3138
JO - Astronomical Journal
JF - Astronomical Journal
IS - 2
ER -