# MetaGWAS Pipeline

### MetaGWAS - Meta-analysis of Genome-Wide Association Studies

Meta-analysis of GWAS is a common approach for improving the power of gene-mapping studies for complex traits. The basic principle of meta-analysis is to combine the evidence for association from individual studies, using appropriate weights.

### Workflow

![](/files/-M8zbYNlO6MKHGECXiCg)

#### User Journey

A few simple steps are needed to run this pipeline:

1. go to the *Pipelines* section of the platform;
2. select the Meta-GWAS pipeline card;
3. upload your data file;
4. choose the parameters on the configuration page;
5. provide the job name and run the analysis.

You will then be redirected to the results page.

![](/files/-MDyPQpU84rIb0uioRSK)

The page layout is the same as for the other pipelines, since they all share the same design.

![](/files/-MDyPlvL1Z50h-5AaDYb)

The first thing to do is to load your **Summary statistics Files** in .txt format.

This file must contain information regarding *marker name*, *chromosome*, *position*, *allele*, *allele frequency*, *effect size*, *standard error* and *p-value*.

Once the file has been uploaded, you can specify which columns contain each of these values:

![](/files/-MDyPtTN3VuTF3EUgWUn)

#### List of parameters

* **Pheno:** phenotype associated with the input data;
* **Separator:** choose the column separator used in the input file from the drop-down menu (Whitespace, Comma, Tab);
* **Name of 'Allele Frequency' Column:** column name for the allele frequency, which refers to the first of the two alleles given in the 'Allele' columns;
* **Name of 'Allele' Column:** column names for the **effect**/**tested** allele and the other allele (e.g., the two allele labels are stored in columns labelled **EFFECT\_ALLELE** and **NON\_EFFECT\_ALLELE**);
* **Name of 'Marker/SNP' Column:** column name for the Marker/SNP;
* **Name of 'Weight' Column:** column name for the number of individuals analyzed for each row (e.g., a column labelled **N**), which is used to weight the contribution of each study in the sample size and p-value based meta-analysis;
* **Name of 'Standard Error' Column:** column name for the standard error;
* **Name of 'Effect' Column:** column name for the effect size; **Log transform button:** switch *on* to log-transform the effect column values;
* **Name of 'P Value' Column:** column name for the p-value; **Log transform button:** switch *on* to log-transform the p-value column values;
* **Name of 'Strand' Column:** column name for the strand, if present.
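As an illustration of how these parameters relate to the input file, the following is a minimal sketch, not the pipeline's actual code. The column labels in `COLUMNS`, the function name `read_summary_stats` and the use of the natural logarithm for the effect transform are all assumptions made for this example.

```python
import math

# Hypothetical column labels; on the platform these come from the parameter form.
COLUMNS = {
    "marker": "SNP",
    "effect_allele": "EFFECT_ALLELE",
    "other_allele": "NON_EFFECT_ALLELE",
    "frequency": "FREQ",
    "effect": "OR",
    "std_err": "SE",
    "p_value": "P",
    "weight": "N",
}
NUMERIC = ("frequency", "effect", "std_err", "p_value", "weight")
# The three separator options offered in the drop-down menu.
SEPARATORS = {"Whitespace": None, "Comma": ",", "Tab": "\t"}


def read_summary_stats(text, separator="Whitespace", log_effect=False):
    """Parse summary statistics into dicts keyed by role (marker, effect, ...)."""
    sep = SEPARATORS[separator]  # None makes str.split() split on any whitespace
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split(sep)
    missing = [label for label in COLUMNS.values() if label not in header]
    if missing:
        raise ValueError(f"columns not found in header: {missing}")
    rows = []
    for line in lines[1:]:
        record = dict(zip(header, line.split(sep)))
        row = {role: record[label] for role, label in COLUMNS.items()}
        for role in NUMERIC:
            row[role] = float(row[role])
        if log_effect:
            # Assumption: natural log, e.g. odds ratio -> log odds ratio.
            row["effect"] = math.log(row["effect"])
        rows.append(row)
    return rows
```

Checking the header against the configured labels before parsing mirrors the purpose of the form: a mislabelled column is reported immediately instead of producing a failed job.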

![](/files/-MDyQ0S_NqMoyLK_AFVy)

Now select the *analysis scheme*.

The tool chosen for the analysis is **METAL**, which implements two approaches:

1. **Sample Size** based method: converts the direction of effect and P-value observed in each study into a signed Z-score, such that very negative Z-scores indicate a small P-value and an allele associated with lower disease risk or quantitative trait levels, while large positive Z-scores indicate a small P-value and an allele associated with higher disease risk or quantitative trait levels;
   * **Minimum sample size for all Markers:** default input value 10000;
   * **Default weight:** default input value 1000;
2. **Inverse variance** based method: weights the effect size estimates, or β-coefficients, by the inverse of their squared standard errors;
   * **Use sample size & p-value for analysis:** switch *on* to enable; default: *on*;
   * **Use effect size and standard error for analysis:** switch *on* to enable; default: *on*.
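The two schemes can be sketched as follows, using the formulas given in the METAL paper (see Reference). This is an illustrative sketch for a single marker, not METAL's own code; the function names and input tuples are assumptions for the example.

```python
from math import copysign, sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def sample_size_meta(studies):
    """studies: list of (effect_direction, p_value, n) for one marker.

    Each study contributes a signed Z-score weighted by sqrt(n).
    """
    numerator = 0.0
    sum_sq_weights = 0.0
    for direction, p_value, n in studies:
        # |Z| from the two-sided p-value, signed by the direction of effect.
        z = copysign(-_norm.inv_cdf(p_value / 2), direction)
        w = sqrt(n)
        numerator += w * z
        sum_sq_weights += w * w
    z_meta = numerator / sqrt(sum_sq_weights)
    return z_meta, 2 * _norm.cdf(-abs(z_meta))


def inverse_variance_meta(studies):
    """studies: list of (beta, standard_error) for one marker.

    Each study is weighted by 1 / SE^2.
    """
    weights = [1 / se ** 2 for _, se in studies]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, studies)) / total
    se = sqrt(1 / total)
    return beta, se, 2 * _norm.cdf(-abs(beta / se))


# Two studies of the same marker with consistent direction of effect.
z, p = sample_size_meta([(+1, 0.05, 4000), (+1, 0.05, 4000)])
beta, se, p_iv = inverse_variance_meta([(0.2, 0.1), (0.4, 0.1)])
```

The sample size scheme needs only a p-value, a direction and a sample size per study, so it can be used when effect sizes are not on a common scale. The inverse variance scheme additionally yields a combined effect size estimate.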

**Genomic Control Correction**

If **genomic control** is switched *off*, test statistics are not adjusted; if it is switched *on*, test statistics are automatically corrected to account for small amounts of population stratification or unaccounted-for relatedness. The default value for **Inflation factor** is *0.97*.
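Genomic control is conventionally computed by comparing the median observed test statistic with its expectation under the null hypothesis, giving an inflation factor λ that is then used to rescale the statistics. The following is a minimal sketch of that standard calculation on Z-scores. Skipping the correction when λ ≤ 1 is a common convention assumed here, not a confirmed behaviour of the pipeline, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549),
# obtained as the squared 75th percentile of the standard normal distribution.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def inflation_factor(z_scores):
    """Estimate lambda as median(Z^2) / expected median under the null."""
    return median(z * z for z in z_scores) / CHI2_1DF_MEDIAN


def genomic_control(z_scores, lam=None):
    """Return (corrected Z-scores, lambda used).

    If `lam` is not supplied it is estimated from the data.
    Assumption: values of lambda <= 1 leave the statistics unchanged.
    """
    if lam is None:
        lam = inflation_factor(z_scores)
    if lam <= 1.0:
        return list(z_scores), lam
    scale = sqrt(lam)  # dividing Z by sqrt(lambda) divides Z^2 by lambda
    return [z / scale for z in z_scores], lam
```

In the inverse variance scheme the equivalent correction multiplies each standard error by √λ.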

![](/files/-MDyQ9OsibdHK59pQXoc)

**Strand Overlap Correction**

If you want to perform *overlap correction*, switch it *on* and enter a threshold value for the Z-statistics; by default, this value is 1.

If the **Strand Column** is present in the input file, switch *on* the **Strand Details** button.

To track the following statistics, switch the corresponding option *on*:

* **Mean Allele Frequency**
* **Minimum and Maximum Allele Frequency**

You can also request a **Detailed SNP Report** by switching that option *on*.

### Results

Once you have chosen the pipeline, uploaded the data file and set all the parameters, you can start your analysis using the *Run Analysis* box. You will then be redirected to this page, where you can see which jobs are *In Progress* and which are *Completed*, and choose to start a new analysis.

![](/files/-ME3S3HQbJfzDdiKFdRV)

By clicking on your *JobName*, you will have access to this page, where you can monitor all the processes involved in your analysis:

![](/files/-ME3Rj_M3uekxa18wcRx)

Now, selecting the *Results* box on the right, let's take a look at the demo results obtained using the Default Parameters Set:

![](/files/-M8thfAIkP63QkOMESE7)

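A **Manhattan Plot** is a type of *scatter plot* commonly used to display GWAS results: each point is a marker, plotted by genomic position (grouped by chromosome) on the x-axis and by the negative base-10 logarithm of its p-value on the y-axis, so the strongest associations appear as the tallest peaks.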
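The point coordinates of such a plot can be derived from the summary statistics. The sketch below is an illustration only: it assumes integer chromosome numbers, lays chromosomes end to end using the largest position seen on each one, and uses the conventional y value of -log10(p). The function name is hypothetical and this is not the platform's plotting code.

```python
from math import log10


def manhattan_coordinates(records):
    """records: iterable of (chromosome, position, p_value) tuples.

    Returns (x, y) pairs ordered by chromosome and position, where x is a
    cumulative genomic coordinate and y is -log10(p_value).
    """
    records = sorted(records)
    # Largest observed position per chromosome, used as its plotted length.
    length = {}
    for chrom, pos, _ in records:
        length[chrom] = max(length.get(chrom, 0), pos)
    # Each chromosome starts where the previous one ended.
    offset, running = {}, 0
    for chrom in sorted(length):
        offset[chrom] = running
        running += length[chrom]
    return [(offset[chrom] + pos, -log10(p)) for chrom, pos, p in records]


points = manhattan_coordinates([(2, 50, 1e-8), (1, 100, 0.01), (1, 200, 0.5)])
```

The resulting pairs can be passed to any scatter-plot function, typically with alternating colours per chromosome.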

![](/files/-M8zdAzJCK7RCNqLpcma)

**Normal Quantile Plot**

![](/files/-M8zdZo4f8sJ80H0rHph)

By clicking on the Interactive Graphs option, you can also view your results like this:

![](/files/-M8zjEQ6aPlAZqXJCD7P)

![](/files/-M8zlGJAgKiqVAD2jUsJ)

Finally, using the *Export* box, you can download the results of your analysis as a *.pdf* file.

#### Reference

1. [**METAL: Fast and Efficient Meta-Analysis of Genomewide Association Scans**](https://pubmed.ncbi.nlm.nih.gov/20616382/), CJ Willer, Y Li, GR Abecasis, 2010

