TY - JOUR
T1 - Comparison of HLA allelic imputation programs
AU - Karnes, Jason H.
AU - Shaffer, Christian M.
AU - Bastarache, Lisa
AU - Gaudieri, Silvana
AU - Glazer, Andrew M.
AU - Steiner, Heidi E.
AU - Mosley, Jonathan D.
AU - Mallal, Simon
AU - Denny, Joshua C.
AU - Phillips, Elizabeth J.
AU - Roden, Dan M.
N1 - Funding Information:
JHK has been supported by the VUMC Clinical Pharmacology Training grant (T32 GM07569), the American Heart Association (16SDG29090005 and 15POST22660017), and an ACCP Research Institute Futures Grants Award from the American College of Clinical Pharmacy. SR is supported by 5U01GM092691-04 and 1R01AR062886-01. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Funding Information:
The dataset used in the analyses described was obtained from Vanderbilt University Medical Center's BioVU, which is supported by institutional funding and by the Vanderbilt CTSA grant UL1TR000445 from NCATS/NIH. HumanExome BeadChip genotyping was supported institutionally. Genome-wide genotyping was funded by NIH grants RC2GM092618 from NIGMS/OD, U01HG004603 from NHGRI/NIGMS, and U19HL065962 from NHGRI/NIGMS.
Publisher Copyright:
© 2017 Karnes et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2017/2
Y1 - 2017/2
N2 - Imputation of human leukocyte antigen (HLA) alleles from SNP-level data is attractive due to the importance of HLA alleles in human disease, the widespread availability of genome-wide association study (GWAS) data, and the expertise required for HLA sequencing. However, comprehensive evaluations of HLA imputation programs are limited. We compared HLA imputation results of HIBAG, SNP2HLA, and HLA*IMP:02 to sequenced HLA alleles in 3,265 samples from BioVU, a de-identified electronic health record database coupled to a DNA biorepository. We performed four-digit HLA sequencing for HLA-A, -B, -C, -DRB1, -DPB1, and -DQB1 using long-read 454 FLX sequencing. All samples were genotyped using both the Illumina Human Exome BeadChip platform and a GWAS platform. Call rates and concordance rates were compared by platform, allele frequency, and race/ethnicity. Overall concordance rates were similar between programs in European Americans (EAs) (0.975 [SNP2HLA]; 0.939 [HLA*IMP:02]; 0.976 [HIBAG]). SNP2HLA provided a significant advantage in terms of call rate and the number of alleles imputed. Concordance rates were lower overall for African Americans (AAs). These observations were consistent when accuracy was compared across HLA loci. All imputation programs performed similarly for low-frequency HLA alleles. Higher concordance rates were observed when HLA alleles were imputed from GWAS platforms versus the Human Exome BeadChip, suggesting that high genomic coverage is preferred as input for HLA allelic imputation. These findings provide guidance on the best use of HLA imputation methods and elucidate their limitations.
AB - Imputation of human leukocyte antigen (HLA) alleles from SNP-level data is attractive due to the importance of HLA alleles in human disease, the widespread availability of genome-wide association study (GWAS) data, and the expertise required for HLA sequencing. However, comprehensive evaluations of HLA imputation programs are limited. We compared HLA imputation results of HIBAG, SNP2HLA, and HLA*IMP:02 to sequenced HLA alleles in 3,265 samples from BioVU, a de-identified electronic health record database coupled to a DNA biorepository. We performed four-digit HLA sequencing for HLA-A, -B, -C, -DRB1, -DPB1, and -DQB1 using long-read 454 FLX sequencing. All samples were genotyped using both the Illumina Human Exome BeadChip platform and a GWAS platform. Call rates and concordance rates were compared by platform, allele frequency, and race/ethnicity. Overall concordance rates were similar between programs in European Americans (EAs) (0.975 [SNP2HLA]; 0.939 [HLA*IMP:02]; 0.976 [HIBAG]). SNP2HLA provided a significant advantage in terms of call rate and the number of alleles imputed. Concordance rates were lower overall for African Americans (AAs). These observations were consistent when accuracy was compared across HLA loci. All imputation programs performed similarly for low-frequency HLA alleles. Higher concordance rates were observed when HLA alleles were imputed from GWAS platforms versus the Human Exome BeadChip, suggesting that high genomic coverage is preferred as input for HLA allelic imputation. These findings provide guidance on the best use of HLA imputation methods and elucidate their limitations.
UR - http://www.scopus.com/inward/record.url?scp=85013074787&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85013074787&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0172444
DO - 10.1371/journal.pone.0172444
M3 - Article
C2 - 28207879
AN - SCOPUS:85013074787
VL - 12
JO - PLoS One
JF - PLoS One
SN - 1932-6203
IS - 2
M1 - e0172444
ER -