Genetic Correlation

Genetic correlation using LDSC (linkage disequilibrium score regression).

Genetic correlation and heritability

A genetic correlation is the proportion of variance that two traits share due to genetic causes: the correlation between the genetic influences on one trait and the genetic influences on another, which estimates the degree of pleiotropy or causal overlap between them. Heritability is a measure of how well differences in people’s genes account for differences in their traits. This pipeline uses LDSC (https://github.com/bulik/ldsc), a command-line tool for estimating heritability and genetic correlation from GWAS summary statistics.
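For reference, the genetic correlation estimated from summary statistics is usually written as the genetic covariance scaled by the two SNP-heritabilities. The symbols below are notation introduced here for illustration and do not appear in the pipeline itself:

```latex
% r_g      : genetic correlation between trait 1 and trait 2
% \rho_g   : genetic covariance between the two traits
% h_1^2, h_2^2 : SNP-heritability of trait 1 and trait 2
r_g = \frac{\rho_g}{\sqrt{h_1^2 \, h_2^2}}
```

Under this definition, the estimate is only meaningful when both heritability estimates are positive.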

Workflow

The pipeline estimates genetic correlation in European-ancestry GWAS using pre-computed LD scores.

LDSC performs:

  1. LD Score regression intercept for the first GWAS summary statistics file;

  2. SNP-heritability for the first summary statistics file;

  3. Genetic correlation between the first summary statistics file and each subsequent one.
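The three steps above map onto two kinds of LDSC command-line calls. The following is a minimal sketch that only assembles the argument lists, assuming `ldsc.py` is on the PATH and that the summary statistics are already in LDSC's `.sumstats.gz` format. The helper names, file names and the LD score directory are hypothetical, and the platform may invoke LDSC differently.

```python
from typing import List


def build_h2_command(sumstats: str, ld_dir: str, out_prefix: str) -> List[str]:
    """Steps 1 and 2: a single --h2 run reports both the LD Score
    regression intercept and the SNP-heritability for one file."""
    return [
        "ldsc.py",
        "--h2", sumstats,
        "--ref-ld-chr", ld_dir,   # pre-computed LD scores (regression predictor)
        "--w-ld-chr", ld_dir,     # LD scores used as regression weights
        "--out", out_prefix,
    ]


def build_rg_command(sumstats: List[str], ld_dir: str, out_prefix: str) -> List[str]:
    """Step 3: --rg correlates the first file with each later file."""
    if len(sumstats) < 2:
        raise ValueError("need at least two summary statistics files")
    return [
        "ldsc.py",
        "--rg", ",".join(sumstats),
        "--ref-ld-chr", ld_dir,
        "--w-ld-chr", ld_dir,
        "--out", out_prefix,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = ["trait1.sumstats.gz", "trait2.sumstats.gz", "trait3.sumstats.gz"]
    print(" ".join(build_h2_command(files[0], "eur_w_ld_chr/", "trait1_h2")))
    print(" ".join(build_rg_command(files, "eur_w_ld_chr/", "trait1_rg")))
```

Returning argument lists rather than a shell string keeps file names with spaces safe if the lists are later passed to `subprocess.run`.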

User Journey

A few simple steps are needed to run this pipeline:

  1. select Genetic Correlation pipeline from Pipelines section;

  2. upload your input files;

  3. provide the other parameters for the genetic correlation and the job name, and finally run the analysis;

then, you will be redirected to the results page.

The first thing to do is to provide your inputs, which can be:

  1. GWAS Summary Statistics files;

  2. European GWAS pre-computed LD scores;

  3. SNP-list of alleles;

  4. Sample size for each summary statistics file;

  5. Sample and population prevalence for each summary statistics file, along with the genetic covariance and heritability intercepts for the regression calculations.
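Inputs 1, 3 and 4 match what LDSC's `munge_sumstats.py` script takes when converting a raw GWAS file into LDSC's own format. Whether the platform runs this step internally is not stated here, so the sketch below is an assumption; the helper name and the file names are hypothetical.

```python
from typing import List


def build_munge_command(raw_file: str, n_samples: int,
                        snplist: str, out_prefix: str) -> List[str]:
    """Assemble a munge_sumstats.py call for one raw summary statistics file."""
    if n_samples <= 0:
        raise ValueError("sample size must be a positive integer")
    return [
        "munge_sumstats.py",
        "--sumstats", raw_file,
        "--N", str(n_samples),          # total sample size (cases + controls)
        "--merge-alleles", snplist,     # reference SNP list with alleles
        "--out", out_prefix,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(" ".join(build_munge_command("trait1.txt", 10000, "snplist.txt", "trait1")))
```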

Let's take a look at the parameters:

  • Number of Samples: Total sample size (cases + controls) for each summary statistics file provided;

  • Phenotype: Name of the trait for each summary statistics file;

  • Sample Prevalence: Proportion of cases in the sample for each summary statistics file, used to convert estimates for case/control traits to the liability scale; optional, default: NULL;

  • Population Prevalence: Proportion of cases in the population for each trait, used together with Sample Prevalence for the liability-scale conversion; optional, default: NULL;

  • Intercept: Value used to constrain the intercept of the genetic covariance regression for each summary statistics file; optional, default: NULL;

  • h2_intercept: Value used to constrain the intercept of the heritability regression for each summary statistics file; optional, default: NULL;

  • LDScore: Pre-computed LD scores for European samples from the 1000 Genomes Project; chosen from a drop-down menu of default values;

  • SNP: Whole Genome reference SNPs;

  • Use-Intercept: Switch that controls whether LDSC's no-intercept option is applied when calculating genetic correlation; optional, default: on.
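In LDSC, sample and population prevalence are used to move heritability estimates for case/control traits from the observed scale to the liability scale. The sketch below shows that conversion, assuming the commonly used liability-threshold transformation; the function names are hypothetical and the platform's exact implementation may differ.

```python
from statistics import NormalDist


def liability_scale_factor(sample_prev: float, population_prev: float) -> float:
    """Multiplier that converts observed-scale h2 to liability-scale h2.

    factor = K^2 (1 - K)^2 / (z^2 * P * (1 - P))
    K: population prevalence, P: sample prevalence,
    z: standard normal density at the liability threshold for K.
    """
    for name, value in (("sample_prev", sample_prev),
                        ("population_prev", population_prev)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - population_prev)
    z = std_normal.pdf(threshold)
    k, p = population_prev, sample_prev
    return (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


def to_liability_scale(h2_observed: float, sample_prev: float,
                       population_prev: float) -> float:
    """Apply the conversion factor to an observed-scale heritability."""
    return h2_observed * liability_scale_factor(sample_prev, population_prev)
```

For a quantitative trait there is no case/control threshold, which is consistent with both prevalence parameters being optional.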

Results

Once you have chosen the pipeline, uploaded the data files and set all the parameters, you can start your analysis using the Run Analysis box. You will then be redirected to the Dashboard, where you can see which jobs are In Progress and which are Completed, and choose to start a new analysis.

By clicking on your JobName, you will have access to a page where you can monitor all the processes involved in your analysis.

Now, selecting the Results box on the right, let's take a look at the demo results obtained using the Default Parameters Set:

A correlation plot is generated showing the genetic correlation between the first summary statistics file and each of the others.

This example examines the evidence for genetic correlation between the psychological traits Neuroticism, Depressive Symptoms and Subjective Well-Being, using the full and 10K samples.

Input files: summary statistics folder consisting of the following files:

  1. Neuroticism_Full.txt

  2. DS_Full.txt

  3. SWB_Full.txt

  4. SWB_10K.txt

Whole-genome reference SNP list for European ancestry and pre-computed LD scores.

Genetic correlation results

The Interactive Graph option also provides an alternative visualization method:

Finally, using the Export box, you will be able to download the results of your analysis as a .pdf file.

Reference