In this project we developed statistical-genetic methodology to increase the accuracy of genomic prediction for complex phenotypic traits through optimal use of whole-genome sequence (WGS) information. We aimed for the variance in genomic estimated breeding values (GEBVs), computed from WGS information, to approximate the total genetic variation (i.e., the between-genotype variation) as closely as possible.
First, we investigated the benefit of WGS data for genomic prediction across different populations. We showed that identifying functionally relevant or predictive variants in WGS data and weighting them differentially significantly improves prediction accuracy, whereas simply increasing marker density up to WGS level does not. We also developed a deterministic prediction equation for the accuracy of genomic prediction involving multiple populations and using WGS information. The equation can be used to optimize the design of genomic breeding programs, especially for numerically small populations. Subsequently, we developed an approach by which publicly available GWAS data on human complex traits, where experimental sample sizes are very large, can be used to help identify relevant genes and associated variants affecting equivalent traits in livestock species. Appropriate modelling of variants near the identified genes can potentially improve the accuracy of genomic prediction.
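As an illustration of what differential weighting of markers can look like, the sketch below fits a ridge-regression (SNP-BLUP type) model in which each marker has its own shrinkage. It is a minimal sketch under stated assumptions, not the model used in the publications listed below: the function name `weighted_snp_blup`, the penalty `lam`, the weight values, and the simulated genotypes are all hypothetical and chosen only for the example.

```python
import numpy as np


def weighted_snp_blup(X, y, weights, lam):
    """Ridge regression of phenotypes on markers with marker-specific shrinkage.

    Assumes marker effect j has prior variance proportional to weights[j],
    so its ridge penalty is lam / weights[j]. Equal weights give ordinary
    ridge regression (hypothetical helper, for illustration only).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    Xc = X - X.mean(axis=0)  # centre genotypes per marker
    yc = y - y.mean()        # centre phenotypes
    lhs = Xc.T @ Xc + np.diag(lam / w)  # larger weight -> weaker shrinkage
    rhs = Xc.T @ yc
    return np.linalg.solve(lhs, rhs)


# Simulated example: 200 individuals, 20 markers, marker 0 is causal.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.5, size=(200, 20)).astype(float)
y = 2.0 * X[:, 0] + rng.normal(size=200)

equal = weighted_snp_blup(X, y, np.ones(20), lam=50.0)

weights = np.ones(20)
weights[0] = 10.0  # assume marker 0 was pre-selected as relevant
weighted = weighted_snp_blup(X, y, weights, lam=50.0)

print("effect of marker 0, equal weights:", equal[0])
print("effect of marker 0, upweighted:   ", weighted[0])
```

Giving a pre-selected marker a larger weight reduces its penalty, so its estimated effect is shrunk less towards zero while the remaining markers keep the default shrinkage. How the weights are obtained (for example from GWAS results or functional annotation) is outside the scope of this sketch.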
To improve prediction further, we investigated imputation to whole-genome sequence prior to genomic prediction, which increased the number of detected quantitative trait loci (QTL). We also investigated Bayesian regularization and Bayesian prior distributions; the results indicate that heavy-tailed prior distributions are favorable. Finally, we incorporated biological information, such as the location of markers on the genome, with the aim of improving genomic prediction across generations.
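One way to see why a heavy-tailed prior can help is to compare how two priors shrink a single marker-effect estimate. The sketch below contrasts a Gaussian prior with a Laplace prior, which is used here only as a convenient example of a heavier-tailed prior; the summary above does not state which priors were studied. The function names and parameter values are hypothetical, and the setting (one marker, known sampling variance `sigma2`) is a deliberate simplification.

```python
import numpy as np


def shrink_gaussian(b_hat, sigma2, tau2):
    """Posterior mean of an effect under a N(0, tau2) prior.

    Assumes the estimate b_hat is normally distributed around the true
    effect with sampling variance sigma2. Every estimate is shrunk by the
    same proportion, regardless of its size.
    """
    return np.asarray(b_hat, dtype=float) * tau2 / (tau2 + sigma2)


def shrink_laplace(b_hat, sigma2, scale):
    """Posterior mode of an effect under a Laplace(0, scale) prior.

    Same likelihood assumption as above. The mode is a soft threshold:
    every estimate is pulled towards zero by the constant sigma2 / scale,
    so small effects are set to zero and large effects are barely changed
    in relative terms.
    """
    b_hat = np.asarray(b_hat, dtype=float)
    threshold = sigma2 / scale
    return np.sign(b_hat) * np.maximum(np.abs(b_hat) - threshold, 0.0)


# One small and one large estimated effect (hypothetical values).
b_hat = np.array([0.1, 5.0])
print("Gaussian prior:", shrink_gaussian(b_hat, sigma2=1.0, tau2=1.0))
print("Laplace prior: ", shrink_laplace(b_hat, sigma2=1.0, scale=1.0))
```

With these values the Gaussian prior halves both estimates, whereas the Laplace prior removes the small effect entirely and retains most of the large one. This matches a genetic architecture in which most variants have negligible effects and a few have sizeable ones.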
Publications resulting from this project:
van den Berg, S., J. Vandenplas, F. A. van Eeuwijk, A. C. Bouwman, M. S. Lopes, and R. F. Veerkamp. 2019. Imputation to whole-genome sequence using multiple pig populations and its use in genome-wide association studies. Genet. Sel. Evol. 51:2. https://doi.org/10.1186/s12711-019-0445-y
van den Berg, S., J. Vandenplas, F. A. van Eeuwijk, M. S. Lopes, and R. F. Veerkamp. 2019. Significance testing and genomic inflation factor using high-density genotypes or whole-genome sequence data. J. Anim. Breed. Genet. 136:418-429. https://doi.org/10.1111/jbg.12419
Raymond, B., A. C. Bouwman, C. Schrooten, J. Houwing-Duistermaat, and R. F. Veerkamp. 2018. Utility of whole-genome sequence data for across-breed genomic prediction. Genet. Sel. Evol. 50:27. https://doi.org/10.1186/s12711-018-0396-8
Raymond, B., A. C. Bouwman, Y. C. J. Wientjes, C. Schrooten, J. Houwing-Duistermaat, and R. F. Veerkamp. 2018. Genomic prediction for numerically small breeds, using models with pre-selected and differentially weighted markers. Genet. Sel. Evol. 50:49. https://doi.org/10.1186/s12711-018-0419-5