About
Brian Jo is interested in solving business problems by uncovering data-driven insights. His main motivations are to learn industry-leading technologies, tackle complex challenges, and collaborate with teams to continuously improve on the status quo.
For an overview of Brian's main projects, please visit his personal site. Some highlights include:
Data science projects
Project Menrva
Dashboard of U.S. adult health behaviors from 2000-2020:
https://projectmenrva.herokuapp.com
https://github.com/bjo/TDI_capstone_webapp
Ph.D. research directions
During his Ph.D. career, Brian conducted research in Statistical Human Genetics under the mentorship of Barbara Engelhardt, and was part of the Graduate Program in Quantitative and Computational Biology at Princeton University. Brian worked with the human genome and transcriptome, and was a contributor to the Genotype-Tissue Expression (GTEx) consortium. Some technical highlights:
- Served as the main Genotype-Tissue Expression (GTEx) Consortium analyst in a lab whose mission is to improve human health by analyzing “enormous reams of data generated by research labs, doctors, and hospitals”.
- Processed terabyte-scale datasets covering 900 individuals, including whole human genomes, gene expression, and associated metadata. Programmed in R, Python, and Bash and performed statistical analysis.
- Collaborated with leading scientists in a dynamic environment, reinforcing an academic background with hands-on experience in areas such as significance testing and software development environments.
- Distributed pre-processing, hypothesis testing, and post-processing across the Princeton cluster and Google Cloud Platform.
The primary research directions were:
- Characterizing the structure and statistical properties of genome-wide association studies at the gene level and tissue level for multiple testing correction and significance testing
- Developing and testing various methods for tissue-specific transcriptome causal inference with genetic instrumental variable analysis (Mendelian Randomization)
- Understanding the structure of the human transcriptome via various sparse and dense factor analysis approaches, and their biological implications
Highlighted publications
The GTEx Consortium atlas of genetic regulatory effects across human tissues (2020)
The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the version 8 data, examining 15,201 RNA-sequencing samples from 49 tissues of 838 postmortem donors. We comprehensively characterize genetic associations for gene expression and splicing in cis and trans, showing that regulatory associations are found for almost all genes, and describe the underlying molecular mechanisms and their contribution to allelic heterogeneity and pleiotropy of complex traits. Leveraging the large diversity of tissues, we provide insights into the tissue specificity of genetic effects and show that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues.
Genetic effects on gene expression across human tissues (2017)
The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body. This work describes the genetic effects on gene expression levels across 44 human tissues, and finds that local genetic variation affects gene expression levels for the majority of genes. This work further identifies inter-chromosomal genetic effects for 93 genes and 112 loci, and 673 distal eQTLs across 18 tissues (10% FDR). On the basis of the identified genetic effects, this work also characterizes patterns of tissue specificity, compares local and distal effects, and evaluates the functional properties of the genetic effects. It also demonstrates that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
Additional links
ORCID (QR code on the right)