ABSTRACT

However, imputing from large reference panels is time-consuming and imposes a high computational burden. Howie et al. studied a pre-phasing strategy for fast and accurate genotype imputation; the method maintains the accuracy of leading methods while reducing computational costs[13]. Meanwhile, Zheng et al. showed that variants with lower minor allele frequency (MAF) are more difficult to impute, especially very rare variants, when HapMap2 and the 1000 Genomes pilot are used as reference panels[14]. The 1000 Genomes Project data contain genetic variation for 1092 individuals. Using the entire dataset may not be a suitable strategy for particular imputation tasks, such as imputing results from whole-exome sequencing (WES), especially given that allele frequencies diverge across populations. In this case study, the reference panel was therefore reconstructed according to the test data set before imputation. The results indicate that this strategy could decrease computing time and could also improve the performance of the analysis protocol when identifying variants correlated with phenotypes.