Document Type
Article
Publication Date
3-14-2019
Department
Computing
School
Computing Sciences and Computer Engineering
Abstract
Background: Reference genome selection is a prerequisite for successful analysis of next-generation sequencing (NGS) data. Current practice employs one of the two most recent human reference genome versions: HG19 or HG38. To date, the impact of genome version on SNV identification has not been rigorously assessed.
Results: We compared the SNVs identified using HG19 versus HG38, leveraging whole genome sequencing (WGS) data from the genome-in-a-bottle (GIAB) project. First, SNVs were called using 26 different bioinformatics pipelines with either HG19 or HG38. Next, two tools were used to convert the called SNVs between HG19 and HG38. Lastly, we calculated conversion rates, analyzed discordant rates between SNVs called with HG19 and those called with HG38, and characterized the discordant SNVs.
Conclusions: The conversion rates from HG38 to HG19 (average 95%) were lower than those from HG19 to HG38 (average 99%), and they varied slightly among the calling pipelines. Around 1.5% of SNVs were discordantly converted between HG19 and HG38. Conversions from HG38 to HG19 produced more SNVs that failed conversion and more discordant SNVs than the opposite conversion (HG19 to HG38). Most of the discordant SNVs had low read depth, were low-confidence SNVs as defined by GIAB, and/or were dominated by G/C alleles (52% observed versus 42% expected).
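The two summary statistics named in the abstract can be illustrated with a minimal sketch. The definitions below are assumptions made for illustration, not the authors' exact method: the conversion rate is taken as the fraction of called SNVs that a conversion tool maps to the target build, and the discordant rate as the fraction of successfully converted SNVs that are absent from the set called directly on the target build. The `SNV` type and both function names are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence, Set


class SNV(NamedTuple):
    """Hypothetical minimal SNV record: chromosome, 1-based position, alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def conversion_rate(converted: Sequence[Optional[SNV]]) -> float:
    """Fraction of input SNVs that were mapped to the target build.

    `converted` holds one entry per input SNV; None marks a failed conversion.
    (Assumed definition, for illustration only.)
    """
    if not converted:
        return 0.0
    mapped = sum(1 for v in converted if v is not None)
    return mapped / len(converted)


def discordant_rate(converted: Sequence[Optional[SNV]],
                    called_on_target: Set[SNV]) -> float:
    """Fraction of successfully converted SNVs not found among the SNVs
    called directly on the target build. (Assumed definition.)
    """
    mapped = {v for v in converted if v is not None}
    if not mapped:
        return 0.0
    return len(mapped - called_on_target) / len(mapped)


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    a = SNV("chr1", 100, "A", "G")
    b = SNV("chr1", 200, "C", "T")
    c = SNV("chr2", 300, "G", "C")
    lifted = [a, b, None, c]          # third SNV failed conversion
    target_calls = {a, b}             # SNVs called directly on the target build
    print(f"conversion rate: {conversion_rate(lifted):.2f}")
    print(f"discordant rate: {discordant_rate(lifted, target_calls):.2f}")
```

In practice the `converted` list would come from whichever coordinate conversion tool is used, and matching on chromosome, position, and both alleles is only one possible way to decide that two calls agree.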
Publication Title
BMC Bioinformatics
Volume
20
Issue
S2
First Page
1
Last Page
13
Recommended Citation
Pan, B., Kusko, R., Xiao, W., Zheng, Y., Liu, Z., Xiao, C., Sakkiah, S., Guo, W., Gong, P., Zhang, C., Ge, W., Shi, L., Tong, W., Hong, H. (2019). Similarities and Differences Between Variants Called With Human Reference Genome HG19 or HG38. BMC Bioinformatics, 20(S2), 1-13.
Available at: https://aquila.usm.edu/fac_pubs/15986
Comments
Published by 'BMC Bioinformatics' at 10.1186/s12859-019-2620-0.