An algorithm for segmenting categorical time series into meaningful episodes

Paul Cohen, Niall Adams

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    32 Scopus citations

    Abstract

    This paper describes an unsupervised algorithm for segmenting categorical time series. The algorithm first collects statistics about the frequency and boundary entropy of ngrams, then passes a window over the series and has two “expert methods” decide where in the window boundaries should be drawn. The algorithm segments text into words successfully in three languages. We claim that the algorithm finds meaningful episodes in categorical time series, because it exploits two statistical characteristics of meaningful episodes.
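The pipeline summarized in the abstract (collect ngram frequency and boundary-entropy statistics, slide a window over the series, let two experts vote on a split point) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the `window` and `threshold` parameters, the use of raw (unnormalized) counts, and the rule that a boundary is drawn at a local peak in the vote count are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter, defaultdict
from math import log2


def ngram_stats(series, max_len):
    """Count every ngram up to max_len and the symbols that follow each one."""
    freq = Counter()
    followers = defaultdict(Counter)
    for i in range(len(series)):
        for n in range(1, max_len + 1):
            if i + n > len(series):
                break
            gram = series[i:i + n]
            freq[gram] += 1
            if i + n < len(series):
                followers[gram][series[i + n]] += 1
    return freq, followers


def boundary_entropy(gram, followers):
    """Entropy (in bits) of the distribution of symbols that follow gram."""
    counts = followers.get(gram)
    if not counts:
        return 0.0
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def segment(series, window=3, threshold=2):
    """Split series into episodes using two voting experts (illustrative only)."""
    freq, followers = ngram_stats(series, window)
    # votes[k] counts votes for a boundary between series[k-1] and series[k].
    votes = [0] * (len(series) + 1)
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        cuts = range(1, window)  # candidate split points inside the window
        # Frequency expert (assumed form): favor the split whose two parts
        # are the most frequent ngrams, using raw counts.
        by_freq = max(cuts, key=lambda k: freq[chunk[:k]] + freq[chunk[k:]])
        # Entropy expert (assumed form): favor the split after the prefix
        # whose next symbol is hardest to predict.
        by_entropy = max(cuts, key=lambda k: boundary_entropy(chunk[:k], followers))
        votes[start + by_freq] += 1
        votes[start + by_entropy] += 1

    # Assumed decision rule: cut where votes reach the threshold at a local peak.
    episodes, last = [], 0
    for k in range(1, len(series)):
        is_peak = votes[k] > votes[k - 1] and votes[k] >= votes[k + 1]
        if votes[k] >= threshold and is_peak:
            episodes.append(series[last:k])
            last = k
    episodes.append(series[last:])
    return episodes, votes
```

Calling `segment(text)` on a string with spaces and punctuation removed returns the proposed episodes together with the per-position vote counts, which is convenient for inspecting why a boundary was or was not drawn. Because this sketch compares raw counts across ngrams of different lengths, results for larger `window` values should be treated with caution.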

    Original language: English (US)
    Title of host publication: Advances in Intelligent Data Analysis - 4th International Conference, IDA 2001, Proceedings
    Editors: Frank Hoffmann, Gabriela Guimaraes, David J. Hand, Niall Adams, Douglas Fisher
    Publisher: Springer-Verlag
    Pages: 198-207
    Number of pages: 10
    ISBN (Print): 3540425810, 9783540425816
    State: Published - 2001
    Event: 4th International Conference on Intelligent Data Analysis, IDA 2001 - Cascais, Portugal
    Duration: Sep 13 2001 → Sep 15 2001

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 2189
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 4th International Conference on Intelligent Data Analysis, IDA 2001
    Country/Territory: Portugal
    City: Cascais
    Period: 9/13/01 → 9/15/01

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
