Finding well-clusterable subspaces for high dimensional data: A numerical one-dimension approach

Chuanren Liu, Tianming Hu, Yong Ge, Hui Xiong

Research output: Contribution to journal › Conference article › peer-review

Abstract

High dimensionality poses two challenges for clustering algorithms: features may be noisy and data may be sparse. To address these challenges, subspace clustering seeks to project the data onto simple yet informative subspaces. The projection process should be fast and the projected subspaces should be well-clusterable. In this paper, we describe a numerical one-dimensional subspace approach for high dimensional data. First, we show that the numerical one-dimensional subspaces can be constructed efficiently by controlling the correlation structure. Next, we propose two strategies to aggregate the representatives from each numerical one-dimensional subspace into the final projected space, where the clustering problem becomes tractable. Finally, the experiments on real-world document data sets demonstrate that, compared to competing methods, our approach can find more clusterable subspaces which align better with the true class labels.

Original language: English (US)
Pages (from-to): 311-323
Number of pages: 13
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8444 LNAI
Issue number: PART 2
State: Published - 2014
Externally published: Yes
Event: 18th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2014 - Tainan, Taiwan, Province of China
Duration: May 13 2014 - May 16 2014

Keywords

  • clusterable subspace
  • numerical one-dimension
  • subspace learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
