TY - JOUR
T1 - Rapid genotype imputation from sequence with reference panels
AU - Davies, Robert W.
AU - Kucka, Marek
AU - Su, Dingwen
AU - Shi, Sinan
AU - Flanagan, Maeve
AU - Cunniff, Christopher M.
AU - Chan, Yingguang Frank
AU - Myers, Simon
N1 - Funding Information:
We thank C. Lanz, R. Schwab, O. Weichenrieder and I. Bezrukov at the MPI Developmental Biology for assistance with high-throughput sequencing and associated data processing and A. Noll and the MPI Tübingen IT team for computational support. We used high-coverage resequencing of 1000 Genomes Project data performed by the NYGC. These data were generated at the NYGC with funds provided by National Human Genome Research Institute grant no. 3UM1HG008901-03S1. The research was supported by the Wellcome Trust Core Award Grant no. 203141/Z/16/Z with additional support from the National Institute for Health Research (NIHR) Oxford Biomedical Research Centre and by Wellcome Trust grant nos. 200186/Z/15/Z and 212284/Z/18/Z (to S.M.). The views expressed are those of the author(s) and not necessarily those of the NHS, NIHR or the Department of Health. We acknowledge the contribution and support from affected persons and their families who contributed to the Bloom Syndrome Repository. We thank the New York Community Trust and Weill Cornell Medicine’s Clinical and Translational Science Center for providing funding. M.K., D.S. and Y.F.C. are supported by the Max Planck Society and a European Research Council Starting Grant (no. 639096 HybridMiX).
Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer Nature America, Inc.
PY - 2021/7
Y1 - 2021/7
AB - Inexpensive genotyping methods are essential to modern genomics. Here we present QUILT, which performs diploid genotype imputation using low-coverage whole-genome sequence data. QUILT employs Gibbs sampling to partition reads into maternal and paternal sets, facilitating rapid haploid imputation using large reference panels. We show this partitioning to be accurate over many megabases, enabling highly accurate imputation close to theoretical limits and outperforming existing methods. Moreover, QUILT can impute accurately using diverse technologies, including long reads from Oxford Nanopore Technologies, and a new form of low-cost barcoded Illumina sequencing called haplotagging, with the latter showing improved accuracy at low coverages. Relative to DNA genotyping microarrays, QUILT offers improved accuracy at reduced cost, particularly for diverse populations that are traditionally underserved in modern genomic analyses, with accuracy nearly doubling at rare SNPs. Finally, QUILT can accurately impute (four-digit) human leukocyte antigen types, the first such method from low-coverage sequence data.
UR - http://www.scopus.com/inward/record.url?scp=85107269290&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107269290&partnerID=8YFLogxK
U2 - 10.1038/s41588-021-00877-0
DO - 10.1038/s41588-021-00877-0
M3 - Article
C2 - 34083788
AN - SCOPUS:85107269290
SN - 1061-4036
VL - 53
SP - 1104
EP - 1111
JO - Nature Genetics
JF - Nature Genetics
IS - 7
ER -