Abstract
While Distance Weighted Discrimination (DWD) is an appealing approach to classification in high dimensions, it was designed for balanced datasets. In the case of unequal costs, biased sampling, or unbalanced data, there are major improvements available, using appropriately weighted versions of DWD (wDWD). A major contribution of this paper is the development of optimal weighting schemes for various nonstandard classification problems. In addition, we discuss several alternative criteria, propose an adaptive weighting scheme (awDWD), and demonstrate its advantages over nonadaptive weighting schemes in some situations. The second major contribution is a theoretical study of weighted DWD. Both high-dimensional low sample-size asymptotics and Fisher consistency of DWD are studied. The performance of weighted DWD is evaluated using simulated examples and two real data examples. The theoretical results are also confirmed by simulations.
Original language | English (US) |
---|---|
Pages (from-to) | 401-414 |
Number of pages | 14 |
Journal | Journal of the American Statistical Association |
Volume | 105 |
Issue number | 489 |
State | Published - 2010 |
Externally published | Yes |
Keywords
- Fisher consistency
- High dimensional
- Linear discrimination
- Low sample-size data
- Nonstandard asymptotics
- Unbalanced data
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty