Minimum description length principle: Generators are preferable to closed patterns

Jinyan Li, Haiquan Li, Limsoon Wong, Jian Pei, Guozliu Dong

Research output: Chapter in Book/Report/Conference proceedingConference contribution

57 Scopus citations

Abstract

The generators and the unique closed pattern of an equivalence class of itemsets share a common set of transactions. The generators are the minimal ones among the equivalent itemsets, while the closed pattern is the maximum one. As a generator is usually smaller than the closed pattern in cardinality, by the Minimum Description Length Principle, the generator is preferable to the closed pattern in inductive inference and classification. To efficiently discover frequent generators from a large dataset, we develop a depth-first algorithm called Gr-growth. The idea is novel in contrast to traditional breadth-first bottom-up generator-mining algorithms. Our extensive performance study shows that Gr-growth is significantly faster (an order or even two orders of magnitudes when the support thresholds are low) than the existing generator mining algorithms. It can be also faster than the state-of-the-art frequent closed itemset mining algorithms such as FPclose and CLOSET+.

Original languageEnglish (US)
Title of host publicationProceedings of the 21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06
Pages409-414
Number of pages6
StatePublished - 2006
Externally publishedYes
Event21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06 - Boston, MA, United States
Duration: Jul 16 2006Jul 20 2006

Publication series

NameProceedings of the National Conference on Artificial Intelligence
Volume1

Other

Other21st National Conference on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conference, AAAI-06/IAAI-06
Country/TerritoryUnited States
CityBoston, MA
Period7/16/067/20/06

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Minimum description length principle: Generators are preferable to closed patterns'. Together they form a unique fingerprint.

Cite this