Colocalization Pipeline

Colocalization

Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, notably cardiovascular diseases and lipid biomarkers. One of the next challenges is to assess whether two association signals are consistent with a shared causal variant. This can be done with statistical methods that need only single-variant summary statistics to test for colocalization of GWAS signals.

Methodology

The pipeline runs a statistical analysis that uses Bayes factors to compute the posterior probability that a causal variant colocalizes, in a given region, between two GWAS traits (trait 1 and trait 2) or between a GWAS and an eQTL dataset. The Bayes factor is calculated directly only if the regression coefficient and its variance are available for each SNP. If they are not available, the software uses the p-value and MAF (minor allele frequency) as an approximation.
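As an illustration of the two routes described above, the following minimal sketch computes a per-SNP log Bayes factor using Wakefield's approximate Bayes factor, a common choice for summary-statistics colocalization. Whether the platform uses exactly this formula is an assumption. The function names, the prior standard deviation of 0.2 and the case-control variance approximation are also assumptions made for the example, not values taken from the platform.

```python
import math
from statistics import NormalDist


def log_abf(z, v, sd_prior):
    """Wakefield's log approximate Bayes factor for one SNP.

    z is the z-score, v the variance of the effect estimate and
    sd_prior the assumed prior standard deviation of the true effect.
    """
    r = sd_prior ** 2 / (sd_prior ** 2 + v)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z ** 2)


def log_abf_from_estimates(beta, var_beta, sd_prior=0.2):
    """Route 1: regression coefficient and its variance are available."""
    z = beta / math.sqrt(var_beta)
    return log_abf(z, var_beta, sd_prior)


def log_abf_from_pvalue(p, maf, n, s, sd_prior=0.2):
    """Route 2: only the p-value and MAF are available (case-control).

    The variance is approximated from the sample size n, the minor
    allele frequency maf and the proportion of cases s (assumed form).
    """
    v = 1.0 / (2.0 * n * maf * (1.0 - maf) * s * (1.0 - s))
    z = -NormalDist().inv_cdf(p / 2.0)  # two-sided p-value to |z|
    return log_abf(z, v, sd_prior)
```

A larger value means stronger evidence of association at that SNP. Both routes feed the same formula; they differ only in how the z-score and the variance are obtained.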

Workflow

User Journey

A few simple steps are needed to run this pipeline:

  1. choose the dataset using a therapeutic tag of interest on the Search Data page;

  2. select the dataset and open the Coloc tab;

  3. provide the GWAS parameters;

  4. select the Colocalization tab and upload a GWAS summary file or eQTL file;

  5. provide the remaining colocalization parameters and the job name, then run the analysis.

You will then be redirected to the results page.

The design is the same as in the previous pipelines, since they all share the same characteristics.

You can find an in-depth explanation of GWAS parameters here.

The first step is to provide a Summary Statistics file corresponding to the dataset you have already chosen. The input files must be:

  1. VCF files selected on the Search Data page and subsequent GWAS summary statistics file;

  2. Summary Statistics file uploaded on the Colocalization configuration page.

The input parameters for both traits from each Summary Statistics file should be (as a List):

  • N = Number of individuals in sample;

  • snp = SNP IDs;

  • p-value = p-value for each SNP;

  • beta = regression coefficient per SNP;

  • var beta = Variance of beta;

  • type = “quant” for quantitative data or “cc” for case-control data (our platform always uses “cc”);

  • s = proportion of samples in “Trait 1” that are cases.
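To make the expected shape of these inputs concrete, here is a small sketch of one trait's dataset as a Python dictionary, with a basic consistency check. The key names (`N`, `snp`, `p_value`, `beta`, `var_beta`, `type`, `s`) and the `validate_dataset` helper are hypothetical and chosen only to mirror the list above. They are not the platform's file format or API.

```python
def validate_dataset(ds):
    """Return a list of problems found in one trait's dataset (empty if none).

    The key names are illustrative assumptions mirroring the parameter list.
    """
    problems = []
    n_snps = len(ds.get("snp", []))
    if n_snps == 0:
        problems.append("snp: no SNP IDs given")
    if not ds.get("N", 0) > 0:
        problems.append("N: sample size must be positive")
    # Per-SNP vectors must line up with the SNP IDs.
    for key in ("p_value", "beta", "var_beta"):
        if key in ds and len(ds[key]) != n_snps:
            problems.append(f"{key}: length differs from snp")
    # Either effect estimates with variances, or p-values, are needed.
    has_estimates = "beta" in ds and "var_beta" in ds
    if not has_estimates and "p_value" not in ds:
        problems.append("need beta and var_beta, or p_value")
    if ds.get("type") not in ("quant", "cc"):
        problems.append("type: must be 'quant' or 'cc'")
    if ds.get("type") == "cc" and not 0.0 < ds.get("s", 0.0) < 1.0:
        problems.append("s: proportion of cases must be between 0 and 1")
    return problems


# Example dataset for a case-control trait with three SNPs (made-up values).
trait1 = {
    "N": 5000,
    "snp": ["rs1", "rs2", "rs3"],
    "beta": [0.02, 0.31, -0.01],
    "var_beta": [0.004, 0.003, 0.005],
    "type": "cc",
    "s": 0.4,
}
```

Calling `validate_dataset(trait1)` on the example returns an empty list. Removing `beta` without supplying `p_value` would be reported as a problem.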

The platform parameters to be entered in the user interface are the prior probabilities that a SNP is associated with:

  • trait 1, default value: 1e-4;

  • trait 2, default value: 1e-4;

  • both traits, default value: 1e-5.
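The sketch below shows how these three priors could enter the calculation, assuming per-SNP log Bayes factors are already available for each trait over the same SNPs. It follows the widely used five-hypothesis formulation (H0: no association, H1: trait 1 only, H2: trait 2 only, H3: both traits with different causal variants, H4: both traits with a shared causal variant). That the platform uses this exact formulation is an assumption, and the function names are illustrative.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def logdiffexp(a, b):
    """Numerically stable log(exp(a) - exp(b)); requires a > b."""
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log Bayes factors.

    labf1 and labf2 must cover the same SNPs in the same order, and the
    region must contain at least two SNPs for H3 to be defined.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = logsumexp(labf1), logsumexp(labf2), logsumexp(both)
    log_h = [
        0.0,                                                     # H0
        math.log(p1) + s1,                                       # H1
        math.log(p2) + s2,                                       # H2
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),  # H3
        math.log(p12) + s12,                                     # H4
    ]
    denom = logsumexp(log_h)
    return [math.exp(h - denom) for h in log_h]
```

With the defaults above, a region where the same SNP carries strong evidence for both traits yields a posterior for H4 close to 1. Raising the prior for both traits shifts weight from H3 towards H4.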

Results

Once you have chosen the pipeline, uploaded the data file and set all the parameters, you can start your analysis using the Run Analysis box. You will then be redirected to this page, where you can keep an eye on which jobs are In Progress and which are Completed, and choose to carry out a new analysis.

By clicking on your JobName, you will have access to this page, where you can monitor all the processes involved in your analysis:

Now, selecting the Results box on the right, let's take a look at the demo results obtained using the Default Parameters Set:

Scatter Plot

By clicking on the Interactive Graphs option, you can also view your results like this:

Finally, using the Export box, you can download the results of your analysis as a .pdf file.

