Abstract
Plant genomes demonstrate significant presence/absence variation (PAV) within a species; however, the factors that lead to this variation have not been studied systematically in Brassica across diploids and polyploids. Here, we developed pangenomes of polyploid Brassica napus and its two diploid progenitors, B. rapa and B. oleracea, to infer how PAV may differ between diploids and polyploids. Modelling of gene loss suggests that loss propensity is primarily associated with transposable elements in the diploids, whereas in B. napus it is associated with homoeologous recombination. We use these results to gain insights into the different causes of gene loss, both in diploids and following polyploidization, and pave the way for the application of machine learning methods to understanding the underlying biological and physical causes of gene presence/absence.
Original language | English (US) |
---|---|
Pages (from-to) | 2488-2500 |
Number of pages | 13 |
Journal | Plant Biotechnology Journal |
Volume | 19 |
Issue number | 12 |
State | Published - Dec 2021 |
Keywords
- Brassica
- XGBoost
- gene loss propensity
- machine learning
- pangenome
- transposable elements
ASJC Scopus subject areas
- Biotechnology
- Agronomy and Crop Science
- Plant Science