Aligning alignments

John D. Kececioglu, Weiqing Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

20 Scopus citations

Abstract

While the area of sequence comparison has a rich collection of results on the alignment of two sequences, and even the alignment of multiple sequences, little is known about the alignment of two alignments. The problem becomes interesting when the alignment objective function counts gaps, as is common when aligning biological sequences, and has the form of the sum-of-pairs objective. We begin a thorough investigation of aligning two alignments under the sum-of-pairs objective with general linear gap costs when either of the two alignments is given in the form of a sequence (a degenerate alignment containing a single sequence), a multiple alignment (containing two or more sequences), or a profile (a representation of a multiple alignment often used in computational biology). This leads to five problem variations, some of which arise in widely used heuristics for multiple sequence alignment, and in assessing the relatedness of a sequence to a sequence family. For variations in which exact gap counts are computationally difficult to determine, we offer a framework in terms of optimistic and pessimistic gap counts. For optimistic and pessimistic gap counts we give efficient algorithms for the sequence vs. alignment, sequence vs. profile, alignment vs. alignment, and profile vs. profile variations, all of which run in essentially O(mn) time for two input alignments of lengths m and n. For exact gap counts, we give the first provably efficient algorithm for the sequence vs. alignment variation, which runs in essentially O(mn log n) time using the candidate-list technique developed for convex gap costs, and we conjecture that the alignment vs. alignment variation is NP-complete.
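To make the gap-counting issue concrete, the following is a minimal Python sketch of a sum-of-pairs cost with linear (affine) gap costs, evaluated on an already-built multiple alignment. It is an illustration only, not the authors' algorithm: the function names, the unit mismatch cost, and the `gap_open` and `gap_extend` values are all assumptions chosen for the example. The point it shows is that a gap is counted in each induced pairwise alignment after columns in which both rows hold a gap character are dropped, so the number of gaps a row contributes depends on which other row it is paired with.

```python
from itertools import combinations

GAP = "-"  # gap character (an assumption of this sketch)


def induced_pair_cost(a, b, gap_open=2, gap_extend=1, mismatch=1):
    """Cost of the pairwise alignment induced by two rows of a multiple alignment.

    Columns where both rows hold a gap are skipped, so gap runs on either
    side of such a column merge into a single gap.
    All cost parameters are hypothetical illustration values.
    """
    cost = 0
    state = "M"  # "M": last kept column had two letters; "A"/"B": gap in row a/b
    for x, y in zip(a, b):
        if x == GAP and y == GAP:
            continue  # column vanishes in the induced pairwise alignment
        if x == GAP or y == GAP:
            new_state = "A" if x == GAP else "B"
            if new_state != state:
                cost += gap_open  # a new gap starts here
            cost += gap_extend  # per-character gap cost
            state = new_state
        else:
            cost += mismatch if x != y else 0
            state = "M"
    return cost


def sum_of_pairs_cost(rows, **costs):
    """Sum the induced pairwise costs over all pairs of rows."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    return sum(induced_pair_cost(a, b, **costs) for a, b in combinations(rows, 2))


rows = ["AC-GT", "A---T", "ACTGT"]
print(sum_of_pairs_cost(rows))
```

In this example the second row has a single run of three gap characters, yet against the first row it contributes a gap of length two (the middle column is dropped), while against the third row it contributes a gap of length three. This dependence on the partner row is what makes exact gap counts awkward to track column by column when aligning two alignments.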

Original language: English (US)
Title of host publication: Combinatorial Pattern Matching - 9th Annual Symposium, CPM 1998, Proceedings
Pages: 189-208
Number of pages: 20
State: Published - 1998
Externally published: Yes
Event: 9th Annual Symposium on Combinatorial Pattern Matching, CPM 1998 - Piscataway, NJ, United States
Duration: Jul 20 1998 → Jul 22 1998

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1448 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th Annual Symposium on Combinatorial Pattern Matching, CPM 1998
Country/Territory: United States
City: Piscataway, NJ
Period: 7/20/98 → 7/22/98

Keywords

  • Affine gap costs
  • Profiles
  • Quasi-natural gap costs
  • Sequence comparison
  • Sum-of-pairs alignment

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
