Hi,
First of all, thank you for creating this amazing resource for learning about the GWAS pipeline!
I'm following along with the tutorial and I stumbled upon this formula in the 04_Data_QC section. The textbox that explains how MAF is calculated states that, in a population of N samples (2N alleles), $N = N_{AA} + 2 N_{AB} + N_{BB}$. I'm wondering why the number of heterozygous samples is multiplied by 2. I would have thought the formula would simply be $N = N_{AA} + N_{AB} + N_{BB}$. Maybe I'm misunderstanding something, as I'm new to the genomics field.
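To make the question concrete, here is a minimal sketch of how I currently understand the counting. The genotype counts and function names are made up by me for illustration and are not taken from the tutorial; I'm also assuming a single biallelic SNP in diploid samples.

```python
# Hypothetical genotype counts for one biallelic SNP (made-up numbers).
N_AA, N_AB, N_BB = 60, 30, 10


def count_samples(n_aa, n_ab, n_bb):
    """Each individual has exactly one genotype, so samples are a plain sum."""
    return n_aa + n_ab + n_bb


def count_alleles(n_aa, n_ab, n_bb):
    """Each diploid individual carries two alleles; returns (A count, B count)."""
    a_count = 2 * n_aa + n_ab  # AA contributes two A alleles, AB contributes one
    b_count = 2 * n_bb + n_ab  # BB contributes two B alleles, AB contributes one
    return a_count, b_count


def quoted_expression(n_aa, n_ab, n_bb):
    """The expression from the textbox, evaluated as written."""
    return n_aa + 2 * n_ab + n_bb


n = count_samples(N_AA, N_AB, N_BB)
a_count, b_count = count_alleles(N_AA, N_AB, N_BB)

print("samples N:", n)
print("alleles (A, B):", a_count, b_count, "total:", a_count + b_count)
print("MAF:", min(a_count, b_count) / (2 * n))
print("textbox expression:", quoted_expression(N_AA, N_AB, N_BB))
```

With these counts, the plain sum gives the number of samples and the two allele counts add up to twice that, but the expression with $2 N_{AB}$ matches neither, which is why I'm unsure how to read it.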