# kleEpistasis
Brute-force 3-way interaction calculation for genetic studies

Welcome to kleEpistasis, a tool for performing 3-way genetic interaction analysis.

Contact: stefan.kleeberger@gmail.com (Stefan Kleeberger, Programmer), bmm@psych.mpg.de (Prof. Dr. Bertram Mueller-Myhsok, Supervisor), Max Planck Institute of Psychiatry, Munich, 2015

To perform brute-force statistical 3-way interaction tests on SNP data, provide your genotype data in PLINK binary format and a phenotype in PLINK alternate phenotype format. See "http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#bed" and "http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#pheno" for more information on the file formats.

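Before starting a long run it can help to confirm that all input files are where the tool will look for them. The following is a minimal sketch, not part of kleEpistasis: the function name `missing_plink_inputs` is hypothetical, and it only checks that the three PLINK binary files and the phenotype file exist, not that their contents are valid.

```python
from pathlib import Path


def missing_plink_inputs(prefix, pheno_path):
    """Return the expected input files that do not exist (hypothetical helper)."""
    # PLINK binary data consists of three files sharing one prefix.
    expected = [Path(f"{prefix}{ext}") for ext in (".bed", ".bim", ".fam")]
    # The phenotype file is given by its full name, including the extension.
    expected.append(Path(pheno_path))
    return [str(p) for p in expected if not p.is_file()]
```

For example, `missing_plink_inputs("/home/testuser/testdata/plink", "/home/testuser/testdata/pheno.txt")` returns an empty list when all four files are present.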
Flags:

REQUIRED

'-path [path]' Absolute or relative path to the PLINK files in binary format (.bed .bim .fam) !>> without <<! file extension

'-pathPheno [path]' Absolute or relative path to the PLINK alternate phenotype file !>> with <<! file extension

'-outPath [path]' Absolute or relative path to the file where results will be written !>> with <<! file extension (.csv)

'-device [0,1,...]' GPU identifier, starting at 0 for the first graphics card and increasing in whole numbers

'-blockSize [2,3,...]' Parameter for optimizing runtime. See the '-blockSize' paragraph below

'-threads [1,2,...]' Number of threads used to process results. See the '-threads' paragraph below

'-alphaPercent ]0.0;50.0[' Significance level in percent for the two-sided test. Allowed values: the open interval ]0.0;50.0[ (both bounds excluded)

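The value ranges of the required flags can be checked before the tool is launched. The sketch below is an illustration only: `build_command` and its parameter names are hypothetical, the range checks are taken from the flag descriptions above, and the default binary location `./bin/kleEpistasis` is the one used in the example calls.

```python
def build_command(path, pheno_path, out_path, device, block_size, threads,
                  alpha_percent, binary="./bin/kleEpistasis"):
    """Validate the required flags and return the argument list (hypothetical helper)."""
    if device < 0:
        raise ValueError("-device must be 0 or greater")
    if block_size < 2:
        raise ValueError("-blockSize must be 2 or greater")
    if threads < 1:
        raise ValueError("-threads must be 1 or greater")
    # ]0.0;50.0[ is an open interval: both bounds are excluded.
    if not 0.0 < alpha_percent < 50.0:
        raise ValueError("-alphaPercent must lie strictly between 0 and 50")
    return [binary,
            "-path", str(path),
            "-pathPheno", str(pheno_path),
            "-outPath", str(out_path),
            "-device", str(device),
            "-blockSize", str(block_size),
            "-threads", str(threads),
            "-alphaPercent", str(alpha_percent)]
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues with paths that contain spaces.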
OPTIONAL

'-pheno [0,1,...]' If your phenotype file contains multiple phenotypes, use this flag to select which one is used.
If you omit the '-pheno' flag, the first phenotype is used (equivalent to '-pheno 0').
The second phenotype has index 1, the third index 2, and so forth.

'-testBlockSize 1' Starts a test run in which only one sub-matrix is calculated.
Use this flag to reduce runtime while searching for the best value for '-blockSize'.
A test run does not write ANY results!

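To see which column a given '-pheno' index refers to, recall that a PLINK alternate phenotype file is whitespace-delimited, with the family ID and individual ID in the first two columns and the phenotypes after them. The sketch below maps the 0-based index onto that layout. It is an assumption-laden illustration: `read_phenotype` is a hypothetical helper, and how kleEpistasis itself treats header lines or missing values is not stated here.

```python
def read_phenotype(lines, pheno_index=0):
    """Return {(FID, IID): value} for the phenotype at the 0-based pheno_index."""
    column = 2 + pheno_index  # columns 0 and 1 hold FID and IID
    values = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        if fields[:2] == ["FID", "IID"]:
            continue  # optional PLINK header line
        values[(fields[0], fields[1])] = fields[column]
    return values
```

With a file containing two phenotypes per individual, `read_phenotype(open_file, 1)` returns the values that '-pheno 1' would select under this reading of the format.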
-blockSize:
If you are familiar with CUDA, you know what this is about. Otherwise you need not go too deep into it.
All you need to know is that the best value for this parameter has to be found by trial and error.
Use '-testBlockSize 1' to reduce runtime dramatically and try different values for '-blockSize'.
We achieved the best results with '-blockSize 4' on an NVIDIA Tesla K40.

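The trial-and-error search can be automated by timing one test run per candidate value. The following sketch assumes that the wall-clock time of a '-testBlockSize 1' run is a usable proxy for the performance of a full run, which is not guaranteed since it also includes data loading. `time_block_sizes` is a hypothetical helper, and `base_args` is expected to hold the binary plus all required flags except '-blockSize'.

```python
import subprocess
import time


def time_block_sizes(base_args, block_sizes, run=subprocess.run):
    """Time one test run per candidate and return {block_size: seconds}."""
    timings = {}
    for size in block_sizes:
        # Append the candidate block size and request a single sub-matrix test run.
        args = list(base_args) + ["-blockSize", str(size), "-testBlockSize", "1"]
        start = time.perf_counter()
        run(args, check=True)  # raises CalledProcessError on a non-zero exit code
        timings[size] = time.perf_counter() - start
    return timings
```

Given the result, `min(timings, key=timings.get)` picks the fastest candidate. The `run` parameter exists only so the function can be exercised without the real binary.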
-threads:
Results are processed in the background by the CPU while the GPU computes new results.
Depending on your CPU and graphics card, the GPU may have to wait for the CPU to finish before copying the next results.
To avoid this, the task is split into chunks, each processed by its own thread.
We used an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz combined with an NVIDIA Tesla K40 and 8 threads ('-threads 8') without encountering wait time.
In total, this program creates n+2 threads, where n is the number passed via the '-threads' flag.

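The chunking scheme described above can be pictured as follows. This is a conceptual sketch only and not the tool's actual implementation: the function names and the `keep` predicate are hypothetical, and Python threads merely illustrate the one-thread-per-chunk structure rather than its real-world speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n):
    """Split items into n contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_batch(results, n_threads, keep):
    """Filter one batch of results, using one worker thread per chunk."""
    chunks = split_into_chunks(results, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() preserves chunk order, so the output order matches the input.
        parts = list(pool.map(lambda chunk: [r for r in chunk if keep(r)], chunks))
    return [r for part in parts for r in part]
```

Each batch is fully processed before the function returns, which mirrors the requirement that the CPU side be finished before the next batch of results is copied over.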
Memory:
A run on 5k SNPs and 1k individuals requires at least 20GB of RAM.
Execution time:
We were able to perform a run on 5k SNPs and 1k individuals on an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz combined with an NVIDIA Tesla K40 in approx. 2 hours.

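To put these figures in perspective, the number of SNP triples grows roughly with the cube of the number of SNPs. The short calculation below assumes that each unordered triple of distinct SNPs is evaluated exactly once. The description above only says "brute-force", so treat this as an estimate rather than a statement about the tool's internals.

```python
import math


def triple_count(n_snps):
    """Number of unordered triples of distinct SNPs (assumed search space)."""
    return math.comb(n_snps, 3)


triples = triple_count(5000)
print(triples)                        # 20820835000
print(triples / (2 * 3600))           # triples per second for a 2-hour run
print(triple_count(10000) / triples)  # growth factor when doubling the SNP count
```

Under this assumption the 2-hour run corresponds to roughly 2.9 million triples per second, and doubling the SNP count multiplies the work by a factor of about 8.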
Result file:
The result file has 4 columns:
Position_SNP1,Position_SNP2,Position_SNP3,Calculated_Value
and one row per significant result found.

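The result file can be loaded for further analysis with a few lines of code. The sketch below assumes that the three position columns hold integers and the fourth column a floating-point value, and it skips any row that does not parse that way (for instance a header row, should the file contain one). `read_results` is a hypothetical helper, not part of kleEpistasis.

```python
import csv


def read_results(lines):
    """Parse result rows into (snp1, snp2, snp3, value) tuples."""
    rows = []
    for record in csv.reader(lines):
        if len(record) != 4:
            continue  # not a result row
        try:
            rows.append((int(record[0]), int(record[1]), int(record[2]),
                         float(record[3])))
        except ValueError:
            continue  # e.g. a header row, if the file has one
    return rows
```

A typical use is `rows = read_results(open(path, newline=""))` followed by `rows.sort(key=lambda r: abs(r[3]), reverse=True)` to list the strongest interactions first.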
Example calls:
For performing a runtime test:
./bin/kleEpistasis -path /home/testuser/testdata/plink -pathPheno /home/testuser/testdata/pheno.txt -outPath /home/testuser/results/testDataRes.csv -device 0 -blockSize 4 -threads 8 -alphaPercent 5 -testBlockSize 1
For performing a complete run:
./bin/kleEpistasis -path /home/testuser/testdata/plink -pathPheno /home/testuser/testdata/pheno.txt -outPath /home/testuser/results/testDataRes.csv -device 0 -blockSize 4 -threads 8 -alphaPercent 5