Differential Retention of Pfam Domains Contributes to Long-term Evolutionary Trends

Jennifer E. James, Paul G. Nelson, Joanna Masel

Research output: Contribution to journal › Article › peer-review


Protein domains that emerged more recently in evolution have a higher structural disorder and greater clustering of hydrophobic residues along the primary sequence. It is hard to explain how selection acting via descent with modification could act so slowly as not to saturate over the extraordinarily long timescales over which these trends persist. Here, we hypothesize that the trends were created by a higher level of selection that differentially affects the retention probabilities of protein domains with different properties. This hypothesis predicts that loss rates should depend on disorder and clustering trait values. To test this, we inferred loss rates via maximum likelihood for animal Pfam domains, after first performing a set of stringent quality control methods to reduce annotation errors. Intermediate trait values, matching those of ancient domains, are associated with the lowest loss rates, making our results difficult to explain with reference to previously described homology detection biases. Simulations confirm that effect sizes are of the right magnitude to produce the observed long-term trends. Our results support the hypothesis that differential domain loss slowly weeds out those protein domains that have nonoptimal levels of disorder and clustering. The same preferences also shape the differential diversification of Pfam domains, thereby further impacting proteome composition.
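As an illustration of the reasoning above, the following is a minimal toy sketch of how trait-dependent loss can shift the trait distribution of surviving domains with age. It is not the authors' maximum-likelihood method or their simulation. The quadratic loss-rate form, the parameter values (`optimum`, `base_rate`, `penalty`), and the birth trait distribution are all hypothetical assumptions chosen only for illustration.

```python
import math

def survival_probability(trait, age, optimum=0.5, base_rate=0.01, penalty=0.2):
    """Probability that a domain with this trait value is still retained at `age`.

    Assumption (not from the paper): the loss rate is lowest at an
    intermediate `optimum` and grows quadratically with distance from it.
    """
    loss_rate = base_rate + penalty * (trait - optimum) ** 2
    return math.exp(-loss_rate * age)

def mean_trait_of_survivors(birth_traits, age, **kwargs):
    """Expected mean trait among domains of a given age that have not been lost."""
    weights = [survival_probability(t, age, **kwargs) for t in birth_traits]
    total = sum(weights)
    return sum(w * t for w, t in zip(weights, birth_traits)) / total

# Hypothetical trait values at birth (e.g. a disorder score), skewed above
# the assumed optimum of 0.5: thirteen values from 0.30 to 0.90.
birth_traits = [0.3 + 0.05 * i for i in range(13)]

# Older cohorts have been filtered for longer, so their surviving members
# sit closer to the assumed optimum than younger cohorts do.
for age in (0, 10, 50, 200):
    print(age, round(mean_trait_of_survivors(birth_traits, age), 3))
```

In this sketch no lineage changes its trait value. The age trend arises only because domains far from the optimum are lost sooner, which is the kind of effect the differential-retention hypothesis invokes.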

Original language: English (US)
Article number: msad073
Journal: Molecular Biology and Evolution
Issue number: 4
State: Published - Apr 1 2023


Keywords

  • Cope's rule
  • clade selection
  • gene families
  • intrinsic structural disorder
  • phylostratigraphy
  • protein evolution
  • protein folding

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics

