Assessing the performance of KS plots for detecting ancient whole genome duplications

George P. Tiley, Michael S. Barker, J. Gordon Burleigh

Research output: Contribution to journal › Article › peer-review

53 Scopus citations

Abstract

Genomic data have provided evidence of previously unknown ancient whole genome duplications (WGDs) and highlighted the role of WGDs in the evolution of many eukaryotic lineages. Ancient WGDs often are detected by examining distributions of synonymous substitutions per site (Ks) within a genome, or “Ks plots.” For example, WGDs can be detected from Ks plots by using univariate mixture models to identify peaks in Ks distributions. We performed gene family simulation experiments to evaluate the effects of different Ks estimation methods and mixture models on our ability to detect ancient WGDs from Ks plots. The simulation experiments, which accounted for variation in substitution rates and gene duplication and loss rates across gene families, tested the effects of WGD age and gene retention rates following WGD on inferring WGDs from Ks plots. Our simulations reveal limitations of Ks plot analyses. Strict interpretations of mixture model analyses often overestimate the number of WGD events, and Ks plot analyses typically fail to detect WGDs when ≤10% of the duplicated genes are retained following the WGD. However, WGDs can accurately be characterized over an intermediate range of Ks. The simulation results are supported by empirical analyses of transcriptomic data, which also suggest that biases in gene retention likely affect our ability to detect ancient WGDs. Although our results indicate mixture model results should be interpreted with great caution, using node-averaged Ks estimates and applying more appropriate mixture models can improve the accuracy of detecting WGDs.
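
As a rough illustration of what "using univariate mixture models to identify peaks in Ks distributions" can look like in practice, the sketch below fits Gaussian mixtures to log-transformed Ks values by expectation-maximization and picks the number of components by BIC. It is not the authors' pipeline: the Gaussian-on-log-Ks model, the BIC criterion, the helper names (`fit_gmm_1d`, `bic`, `select_components`), and the simulated lognormal Ks values are all assumptions made for this example.

```python
import math
import random


def fit_gmm_1d(x, k, n_iter=200, min_sd=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, sds, loglik). The log-likelihood is the one
    evaluated in the final E-step.
    """
    n = len(x)
    xs = sorted(x)
    mean_all = sum(x) / n
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in x) / n)
    # Deterministic start: component means at evenly spaced quantiles.
    means = [xs[int((i + 0.5) * n / k)] for i in range(k)]
    sds = [max(sd_all, min_sd)] * k
    weights = [1.0 / k] * k
    loglik = -math.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        loglik = 0.0
        for v in x:
            dens = [
                w * math.exp(-0.5 * ((v - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                for w, m, s in zip(weights, means, sds)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # floor avoids collapsing components
    return weights, means, sds, loglik


def bic(loglik, k, n):
    """Bayesian information criterion; lower is better."""
    n_params = 3 * k - 1  # k means, k sds, k - 1 free mixing weights
    return n_params * math.log(n) - 2.0 * loglik


def select_components(x, max_k=4):
    """Return (best_k, {k: BIC}) for mixtures with 1..max_k components."""
    scores = {}
    for k in range(1, max_k + 1):
        *_, loglik = fit_gmm_1d(x, k)
        scores[k] = bic(loglik, k, len(x))
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Hypothetical data: two lognormal "peaks" standing in for paralog Ks values.
rng = random.Random(0)
ks = [rng.lognormvariate(-2.0, 0.3) for _ in range(300)]
ks += [rng.lognormvariate(0.5, 0.3) for _ in range(300)]
log_ks = [math.log(v) for v in ks]

best_k, scores = select_components(log_ks)
weights, means, sds, _ = fit_gmm_1d(log_ks, 2)
print("BIC by number of components:", scores)
print("Selected number of components:", best_k)
```

Consistent with the caution in the abstract, the component count chosen by a criterion such as BIC should not be read directly as a count of WGD events, since a strict reading of mixture model output can overestimate them.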

Original language: English (US)
Pages (from-to): 2882-2898
Number of pages: 17
Journal: Genome Biology and Evolution
Volume: 10
Issue number: 11
State: Published - 2018

Keywords

  • Gene age distributions
  • Gene family simulation
  • Mixture models
  • Paleopolyploidy
  • Synonymous substitution rate

ASJC Scopus subject areas

  • General Medicine
