Colocalization Pipeline

Colocalization

Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, notably cardiovascular diseases and lipid biomarkers. One of the next challenges is to assess whether two association signals are consistent with a shared causal variant. This can be done with statistical methods that need only single-variant summary statistics to test for colocalization of GWAS signals.

Methodology

The pipeline runs a Bayesian analysis that uses Bayes factors to compute the posterior probability that a causal variant in a given region is shared between two GWAS (trait 1 and trait 2), or between a GWAS and an eQTL study. Bayes factors are calculated from the regression coefficient and its variance for each SNP when these are available. If they are not available, the software uses the p-value and minor allele frequency (MAF) as an approximation.
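As an illustration of the two routes described above, the following is a minimal Python sketch of a per-SNP log approximate Bayes factor in the style of Wakefield, which is the approach described in the references listed at the end of this page. It is not the platform's own code. The function names are hypothetical, and the prior standard deviations (0.2 for case-control, 0.15 for quantitative traits with unit trait variance) and the variance approximation from MAF and sample size are assumptions that should be checked against the software version in use.

```python
import math
from statistics import NormalDist

# Assumed prior standard deviation of the effect size per trait type.
# The quantitative value further assumes the trait has unit variance.
PRIOR_SD = {"cc": 0.2, "quant": 0.15}


def log_abf(z: float, var_beta: float, trait_type: str = "cc") -> float:
    """Log approximate Bayes factor for one SNP from its z-score and Var(beta)."""
    prior_var = PRIOR_SD[trait_type] ** 2
    r = prior_var / (prior_var + var_beta)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def log_abf_from_beta(beta: float, var_beta: float, trait_type: str = "cc") -> float:
    """Route 1: regression coefficient and its variance are available."""
    z = beta / math.sqrt(var_beta)
    return log_abf(z, var_beta, trait_type)


def log_abf_from_pvalue(p: float, maf: float, n: int,
                        s: float = 0.5, trait_type: str = "cc") -> float:
    """Route 2: only the p-value and MAF are available.

    Var(beta) is approximated from MAF and sample size (an assumption of
    this sketch); for case-control data it also depends on the case
    proportion s.
    """
    var_beta = 1.0 / (2.0 * n * maf * (1.0 - maf))
    if trait_type == "cc":
        var_beta /= s * (1.0 - s)
    # Two-sided p-value converted to the magnitude of the z-score.
    z = -NormalDist().inv_cdf(p / 2.0)
    return log_abf(z, var_beta, trait_type)
```

The sign of the effect is lost on the p-value route, which does not matter here because only the squared z-score enters the Bayes factor.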

Workflow

User Journey

A few simple steps are needed to run this pipeline:

choose the dataset using a therapeutic tag of interest on the Search Data page;
select the dataset and open the Coloc tab;
provide the GWAS parameters;
select the Colocalization tab and upload a GWAS summary file or eQTL file;
provide the remaining colocalization parameters and the job name, then run the analysis;
you will then be redirected to the results page.

The design is the same as in the previous pipelines, since they share the same characteristics.

You can find an in-depth explanation of GWAS parameters here.

The first step is to provide a Summary Statistics file corresponding to the dataset you have already chosen. The input files must be:

VCF files selected on the Search Data page and subsequent GWAS summary statistics file;
Summary Statistics file uploaded on the Colocalization configuration page.

The input parameters for both traits from each Summary Statistics file should be (as a List):

N = Number of individuals in sample;
snp = SNP IDs;
p-Value = p-value for each SNP;
beta = regression coefficient per SNP;
var beta = Variance of beta;
type = “quant” for quantitative data or “cc” for case-control data (our platform always uses “cc”);
s = proportion of samples in “Trait 1” that are cases.
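The parameter list above can be pictured as one record per trait. The sketch below is a hypothetical Python container with basic consistency checks. The class name `ColocDataset`, the field spellings and the specific checks are assumptions for illustration and do not describe the platform's internal format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColocDataset:
    """Hypothetical per-trait record mirroring the parameter list above."""
    N: int                                 # number of individuals in the sample
    snp: List[str]                         # SNP IDs
    pvalues: List[float]                   # p-value for each SNP
    beta: Optional[List[float]] = None     # regression coefficient per SNP
    varbeta: Optional[List[float]] = None  # variance of beta per SNP
    type: str = "cc"                       # "quant" or "cc"
    s: Optional[float] = None              # proportion of samples that are cases

    def validate(self) -> None:
        """Raise ValueError if the record is internally inconsistent."""
        if self.N <= 0:
            raise ValueError("N must be a positive sample size")
        if self.type not in ("quant", "cc"):
            raise ValueError('type must be "quant" or "cc"')
        if self.type == "cc" and (self.s is None or not 0.0 < self.s < 1.0):
            raise ValueError("s must lie strictly between 0 and 1 for case-control data")
        n_snps = len(self.snp)
        if len(set(self.snp)) != n_snps:
            raise ValueError("SNP IDs must be unique")
        for name in ("pvalues", "beta", "varbeta"):
            values = getattr(self, name)
            if values is not None and len(values) != n_snps:
                raise ValueError(f"{name} must have one entry per SNP")
        if any(not 0.0 < p <= 1.0 for p in self.pvalues):
            raise ValueError("p-values must lie in (0, 1]")
        if self.varbeta is not None and any(v <= 0.0 for v in self.varbeta):
            raise ValueError("varbeta entries must be positive")
```

Leaving `beta` and `varbeta` optional reflects the Methodology section: when they are missing, the p-value and MAF are used instead.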

The platform parameters to be set in the user interface are the prior probabilities that a SNP is associated with:

trait 1, default value: 1e-4;
trait 2, default value: 1e-4;
both traits, default value: 1e-5.
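To show how these three priors enter the calculation, here is a minimal Python sketch that combines per-SNP log Bayes factors for the two traits into posterior probabilities for the five hypotheses of the Bayesian colocalisation test described in the references: H0 (no association), H1 (trait 1 only), H2 (trait 2 only), H3 (both traits, different causal variants) and H4 (both traits, shared causal variant). The function and key names are hypothetical, and the sketch assumes at most one causal variant per trait in the region.

```python
import math
from typing import Dict, List


def log_sum_exp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(x: float, y: float) -> float:
    """log(exp(x) - exp(y)); -inf when the difference is not positive."""
    if x <= y:
        return float("-inf")
    return x + math.log1p(-math.exp(y - x))


def coloc_posteriors(labf1: List[float], labf2: List[float],
                     p1: float = 1e-4, p2: float = 1e-4,
                     p12: float = 1e-5) -> Dict[str, float]:
    """Posterior probabilities of H0..H4 from per-SNP log Bayes factors."""
    if len(labf1) != len(labf2):
        raise ValueError("both traits must cover the same SNPs")
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum_exp(labf1), log_sum_exp(labf2), log_sum_exp(both)
    log_h = [
        0.0,                                                       # H0
        math.log(p1) + s1,                                         # H1
        math.log(p2) + s2,                                         # H2
        math.log(p1) + math.log(p2) + log_diff_exp(s1 + s2, s12),  # H3
        math.log(p12) + s12,                                       # H4
    ]
    total = log_sum_exp(log_h)
    return {f"PP.H{i}": math.exp(v - total) for i, v in enumerate(log_h)}
```

Working on the log scale avoids overflow when a strongly associated SNP has a very large Bayes factor. A high value of `PP.H4` supports a shared causal variant.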

Results

Once you have chosen the pipeline, uploaded the data file and set all the parameters, you can start your analysis using the Run Analysis box. You will then be redirected to a page where you can see which jobs are In Progress and which are Completed, and where you can choose to start a new analysis.

Clicking on your JobName opens a page where you can monitor all the processes involved in your analysis.

Now, selecting the Results box on the right, let's take a look at the demo results obtained using the Default Parameters Set:

Scatter Plot

By clicking on the Interactive Graphs option, you can also explore your results interactively.

Finally, using the Export box, you can download the results of your analysis as a .pdf file.

Reference

Statistical Independence of the Colocalized Association Signals for Type 1 Diabetes and RPS26 Gene Expression on Chromosome 12q13, Plagnol V et al, 2009
Statistical Testing of Shared Genetic Control for Potentially Related Traits, Wallace C, 2013
Bayesian Test for Colocalisation between Pairs of Genetic Association Studies Using Summary Statistics, Giambartolomei C et al, 2014
Coloc: a package for colocalisation analyses, Wallace C, 2019
