Hello,
I am using FarmCPUpp with my own genotype (GD), genotype map (GM), and phenotype (Y) files.
When I run farmcpu(), I notice that some SNPs in the GWAS output have NA for all results.
From my observation, these NA results happen whenever the corresponding SNP in the GD file contains at least one missing value (NA).
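To make the observation concrete, here is a minimal sketch of how SNPs containing missing values can be listed. It uses a small in-memory toy matrix with made-up taxon and SNP names rather than my real data, and base R only; on a file-backed big.matrix the same check would probably need to be run on blocks of columns.

```r
# Toy genotype matrix (hypothetical data): rows are taxa, columns are SNPs,
# coded 0/1/2, with NA marking a missing genotype call.
gd <- matrix(c(0, 1, 2, 0,    # snp1: complete
               2, NA, 1, 1,   # snp2: one missing call
               0, 0, NA, 2),  # snp3: one missing call
             nrow = 4,
             dimnames = list(paste0("taxon", 1:4),
                             c("snp1", "snp2", "snp3")))

# Count missing calls per SNP (per column).
na_per_snp <- colSums(is.na(gd))

# Names of the SNPs that contain at least one missing value.
snps_with_na <- names(na_per_snp)[na_per_snp > 0]
print(snps_with_na)
```

In my data, the SNPs flagged this way appear to be the same ones that come back as NA in the GWAS output.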
My questions
- Does FarmCPUpp currently support handling SNPs with missing values?
- Should missing values in the numeric GD file be represented as NA, -9, or something else?
- If missing data are not supported, what is the recommended pre-processing approach (e.g. imputation, filtering)?
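For context, this is the kind of pre-processing I have in mind: a minimal sketch, assuming a plain in-memory matrix, that drops SNPs above a missing-rate cutoff and mean-imputes the rest. The toy data, the 0.25 cutoff, and the variable names are my own illustrative choices, not anything taken from the FarmCPUpp documentation, and I do not know whether this is the approach the package expects.

```r
# Toy genotype matrix (hypothetical data): rows are taxa, columns are SNPs.
gd <- matrix(c(0, 1, 2, 0,      # snp1: complete
               2, NA, 1, 1,     # snp2: 25% missing
               NA, NA, NA, 2),  # snp3: 75% missing
             nrow = 4,
             dimnames = list(paste0("taxon", 1:4),
                             c("snp1", "snp2", "snp3")))

# Step 1: filter out SNPs whose missing rate exceeds an arbitrary cutoff.
max_missing <- 0.25  # assumed threshold, chosen only for this example
missing_rate <- colMeans(is.na(gd))
gd_filtered <- gd[, missing_rate <= max_missing, drop = FALSE]

# Step 2: replace each remaining NA with the mean of the observed
# genotypes for that SNP (simple mean imputation).
gd_imputed <- gd_filtered
for (j in seq_len(ncol(gd_imputed))) {
  snp <- gd_imputed[, j]
  snp[is.na(snp)] <- mean(snp, na.rm = TRUE)
  gd_imputed[, j] <- snp
}

print(gd_imputed)
```

Mean imputation is only the simplest option here; a dedicated genotype imputation tool may be more appropriate if missingness is high.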
Example of my workflow
library(bigmemory)
library(FarmCPUpp)

# Phenotypes and SNP map
myY <- read.table("taxa.txt", header = TRUE, stringsAsFactors = FALSE)
myGM <- read.table("mdp_SNP_information.txt", header = TRUE, stringsAsFactors = FALSE)

# Numeric genotypes as a file-backed big.matrix (taxa in rows, SNPs in columns)
myGD <- read.big.matrix("mdp_numeric.txt",
                        type = "double", sep = "\t", header = TRUE,
                        col.names = myGM$SNP, ignore.row.names = FALSE,
                        has.row.names = TRUE,
                        backingfile = "mdp_numeric.bin",
                        descriptorfile = "mdp_numeric.desc")

myResults <- farmcpu(Y = myY, GD = myGD, GM = myGM)
Could you please clarify:
What is the correct way to represent missing genotypes in the numeric GD file?
If missing values are not supported, is imputation required before running FarmCPUpp?
Thank you for your time and for maintaining this package!