TY - JOUR
T1 - Adaptive Local Realignment of Protein Sequences
AU - DeBlasio, Dan
AU - Kececioglu, John
N1 - Funding Information:
Research of J.K. and D.D. at the University of Arizona was supported by NSF Grant IIS-1217886. D.D. was also partially supported at Carnegie Mellon University by the Lane Fellows Program from the Computational Biology Department. This article is an extended version of the conference publication, DeBlasio and Kececioglu (2017a). Portions of this material are also presented in Chapter 8 of DeBlasio and Kececioglu (2017c).
Publisher Copyright:
© Copyright 2018, Mary Ann Liebert, Inc.
PY - 2018/7
Y1 - 2018/7
AB - While mutation rates can vary markedly over the residues of a protein, multiple sequence alignment tools typically use the same values for their scoring-function parameters across a protein's entire length. We present a new approach, called adaptive local realignment, that, in contrast, automatically adapts to the diversity of mutation rates along protein sequences. This builds upon a recent technique known as parameter advising, which finds global parameter settings for an aligner, to now adaptively find local settings. In essence, our approach identifies local regions with low estimated accuracy, constructs a set of candidate realignments using a carefully chosen collection of parameter settings, and replaces the region if a realignment has higher estimated accuracy. This new method of local parameter advising, when combined with prior methods for global advising, boosts alignment accuracy by as much as 26% over the best default setting on hard-to-align protein benchmarks, and by 6.4% over global advising alone. Adaptive local realignment has been implemented within the Opal aligner using the Facet accuracy estimator.
KW - alignment accuracy
KW - iterative refinement
KW - local mutation rates
KW - multiple sequence alignment
KW - parameter advising
UR - http://www.scopus.com/inward/record.url?scp=85050266634&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050266634&partnerID=8YFLogxK
U2 - 10.1089/cmb.2018.0045
DO - 10.1089/cmb.2018.0045
M3 - Article
C2 - 29889553
AN - SCOPUS:85050266634
SN - 1066-5277
VL - 25
SP - 780
EP - 793
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 7
ER -