It is not yet clear what the best way of processing pedigree information is. Currently, the user is required to provide information on everyone in the VCF, so, e.g., duos you don't trust or parents without children should be listed as singletons. This way, the mother and father lists always correspond to a subset of the children.
Notably, if the families are not all of the same type (all trios or all duos), the produced GRMs will have different numbers of individuals. GCTA will mostly take care of that, as it always extracts the intersection of individuals across all GRMs listed in --mgrm. However, analyses restricted to only some haplotypes or genotypes will then be based on different numbers of individuals.
For example, given 1000 maternal duos and 1000 trios, listed as follows (child, father, mother; 0 marks a missing parent):
f1 0 m1
f2 0 m2
...
f1000 0 m1000
f1001 p1 m1001
f1002 p2 m1002
...
f2000 p1000 m2000
the produced GRMs will have 2000 individuals for M1, P1, M2, FG, and MG, but only 1000 for PG and P2. The GCTA analyses will then be based on 1000 individuals for PG+FG, M1+M2+P2, and M1+M2+P1+P2, but on 2000 individuals for all other combinations. The current solution is for the user to provide the intersection lists and edit split_haplotypes_launchers.sh to use them.
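
The intersection lists mentioned above could be derived from the pedigree file with something like the following sketch. It is not part of the pipeline. The function names and the rule for which component needs which parent are assumptions inferred from the example counts above (FG needs no parent, M1/P1 need at least one parent, MG/M2 need the mother, PG/P2 need the father); check them against what the scripts actually produce before relying on the output.

```python
def parse_pedigree(lines):
    """Parse 'child father mother' lines; '0' marks a missing parent."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        child, father, mother = fields
        rows.append((child,
                     None if father == "0" else father,
                     None if mother == "0" else mother))
    return rows


def has_component(component, father, mother):
    """Assumed availability rule, inferred from the example counts."""
    if component == "FG":
        return True
    if component in ("M1", "P1"):
        return father is not None or mother is not None
    if component in ("MG", "M2"):
        return mother is not None
    if component in ("PG", "P2"):
        return father is not None
    raise ValueError(f"unknown component: {component}")


def shared_children(rows, components):
    """Children for whom every requested component can be produced."""
    return [child for child, father, mother in rows
            if all(has_component(c, father, mother) for c in components)]


# The example from the text: 1000 maternal duos followed by 1000 trios.
example = [f"f{i} 0 m{i}" for i in range(1, 1001)]
example += [f"f{i} p{i - 1000} m{i}" for i in range(1001, 2001)]
rows = parse_pedigree(example)

for combo in (["PG", "FG"], ["M1", "M2", "P2"], ["M1", "M2", "P1"]):
    print("+".join(combo), len(shared_children(rows, combo)))
```

The returned IDs are child IDs only; how they need to be formatted for the launcher script depends on how the GRM IDs are written, which this sketch does not cover.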