April 15, 2015

10:00 159 Boyer Hall

Jae Hoon Sul
Department of Medicine, Division of Genetics, Brigham and Women's Hospital, Harvard Medical School

Large-scale genetics studies of human complex traits


In the past decade, genotyping and next-generation sequencing (NGS) technologies have generated an enormous amount of data to discover genetic variants present in human genomes and to find the genetic basis of diseases. These technologies have shifted the paradigm of genetic studies from analyzing fewer than a hundred individuals to analyzing much larger cohorts at millions of genetic variants. With the rapid decrease in sequencing costs and the emphasis on genomic medicine, studies will sequence hundreds of thousands of individuals in the near future. These large genetic datasets, however, have introduced two major challenges for genetic studies. The first is developing computational methods that can use this big data efficiently. The second is that analyzing large genetic data has become increasingly complicated. Hence, it is critical for genetic studies to address these two challenges in order to identify the genetic basis of human complex traits.

In this talk, I will describe my work to address the two challenges. As genome-wide association studies have discovered numerous non-coding genetic variants associated with traits, there has been increasing focus on interpreting these variants using functional genomics. Expression quantitative trait loci (eQTL) studies, which attempt to detect genetic variants associated with gene expression, may provide clues as to which variants are functional. I will discuss a method to perform multiple testing correction accurately and rapidly in eQTL studies for identification of genes whose expression is influenced by genetic variants. As eQTL studies have grown larger in sample size, multiple testing correction using the permutation test has become a major computational bottleneck. I developed a multivariate normal sampling approach (MVN) that is more than 100 times faster than the permutation test at a sample size of 2,000 while generating almost the same results. My approach will be adopted by the Genotype-Tissue Expression (GTEx) project, a large consortium aiming to obtain gene expression data from many human tissues.

Next, I will present a novel approach to detect rare variants associated with a disease in large families. NGS enables studies to evaluate the effects of rare variants on complex traits, and family-based studies have attracted great attention recently because of their higher power for rare variant testing compared with case-control studies. I developed a method called RareIBD that can be applied to large pedigrees, to both binary and quantitative traits, and to affected-only pedigrees. Using simulations, I will show that my method achieves higher power than previous approaches.

Lastly, I will discuss my work on analyzing high-coverage whole genome sequencing (WGS) data from 808 ADNI individuals. I will present a challenge in analyzing large WGS data and procedures to measure the quality of WGS.
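
For readers unfamiliar with the multiple testing correction problem in eQTL studies mentioned in the abstract, the contrast between a permutation test and multivariate normal sampling can be sketched roughly as follows. This is an illustrative toy example only, not the speaker's actual MVN method or software: the function names (`simulate_genotypes`, `permutation_pvalue`, `mvn_pvalue`), the simulated genotypes, and the choice of the maximum absolute z-score as the gene-level statistic are all assumptions made for this sketch. The general idea is that, under the null, the per-variant z-scores for one gene are approximately multivariate normal with covariance equal to the variant correlation matrix, so null statistics can be drawn directly instead of re-running the association test on permuted expression values.

```python
import numpy as np


def simulate_genotypes(n, m, rng, maf=0.3, copy_prob=0.5):
    """Hypothetical helper: n individuals x m variants with correlated neighbors."""
    G = rng.binomial(2, maf, size=(n, m)).astype(float)
    for j in range(1, m):
        # Copy the previous variant for a random subset to induce correlation.
        copy = rng.random(n) < copy_prob
        G[copy, j] = G[copy, j - 1]
    return G


def standardize(a):
    """Center and scale each column (or a vector) to mean 0, variance 1."""
    return (a - a.mean(axis=0)) / a.std(axis=0)


def z_scores(G_std, y_std):
    """Approximate association z-scores: sqrt(n) times the correlation."""
    n = G_std.shape[0]
    return G_std.T @ y_std / np.sqrt(n)


def permutation_pvalue(G_std, y_std, observed_max, n_perm, rng):
    """Gene-level p-value from permuting expression; cost grows with n."""
    count = 0
    for _ in range(n_perm):
        z = z_scores(G_std, rng.permutation(y_std))
        count += int(np.max(np.abs(z)) >= observed_max)
    return (count + 1) / (n_perm + 1)


def mvn_pvalue(G_std, observed_max, n_samples, rng):
    """Gene-level p-value from sampling null z-scores ~ N(0, R)."""
    n, m = G_std.shape
    R = G_std.T @ G_std / n  # variant correlation matrix
    z = rng.multivariate_normal(np.zeros(m), R, size=n_samples)
    null_max = np.max(np.abs(z), axis=1)
    return (int(np.sum(null_max >= observed_max)) + 1) / (n_samples + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = standardize(simulate_genotypes(500, 20, rng))
    y = standardize(rng.normal(size=500))  # expression simulated under the null
    observed = np.max(np.abs(z_scores(G, y)))
    print("permutation:", permutation_pvalue(G, y, observed, 2000, rng))
    print("MVN sampling:", mvn_pvalue(G, observed, 2000, rng))
```

In this sketch, each permutation touches all n individuals, whereas each MVN draw depends only on the number of variants once the correlation matrix has been computed, which is one plausible reason such an approach scales better as sample sizes grow.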














































































































































































































































































































































































































































































































