Discussion R scripts NIPT

Currently the data are not divided into a training set and a test set. Therefore, the dispersion of the z-scores is underestimated, because the models are evaluated on the same samples they were fitted to.
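A minimal sketch of such a split, written in Python for illustration only (it is not taken from the R scripts). The function names, the 30% test fraction and the seed are assumptions. The point is that the mean and SD come from the training samples alone, and the z-scores are computed for samples that were held out.

```python
import random
import statistics

def split_train_test(values, test_fraction=0.3, seed=1):
    """Randomly split a list of samples into (train, test).

    test_fraction and seed are illustrative defaults, not tuned values.
    """
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def held_out_z_scores(train, test):
    """Z-scores of the test samples, using mean and SD of the training set only."""
    mu = statistics.mean(train)
    sd = statistics.stdev(train)  # sample SD (n - 1 in the denominator)
    return [(x - mu) / sd for x in test]
```

The same training/test partition could then be reused to tune the fixed values listed below, instead of choosing them by hand.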
Several values look arbitrary. If we had a train/test set, we could optimise them based on the data:
bin size (50k)
chi^2 (3.5)
1.15 * CV
4 models
each model has 4 predictors (could use adjusted R^2 or Bayesian variable selection)
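For the adjusted R^2 option mentioned in the last item, a small sketch of the standard formula, 1 - (1 - R^2) * (n - 1) / (n - p - 1). It is written in Python for illustration; the names n (number of samples) and p (number of predictors) are assumptions, and nothing here reflects how the scripts currently pick their 4 predictors.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R^2 for a linear model with p predictors fitted on n samples.

    Penalises extra predictors, so models with different numbers of
    predictors can be compared on the same data.
    """
    if n <= p + 1:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)
```

With R^2 = 0.9, n = 25 and p = 4 this gives 0.88; a candidate with more predictors would need a clearly higher R^2 to score better.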
CV = sigma / mu is a biased estimator; according to http://en.wikipedia.org/wiki/Coefficient_of_variation, for normally distributed data it should be (1 + 1/(4n)) * sigma / mu, where n is the sample size.
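The correction is simple to apply. A sketch in Python for illustration (function names are assumptions; the correction factor assumes normally distributed data, as stated on the Wikipedia page):

```python
import statistics

def coefficient_of_variation(values):
    """Plain sample CV: sample SD divided by the sample mean (mean must be non-zero)."""
    return statistics.stdev(values) / statistics.mean(values)

def corrected_cv(values):
    """CV with the small-sample correction factor (1 + 1/(4n))."""
    n = len(values)
    return (1.0 + 1.0 / (4.0 * n)) * coefficient_of_variation(values)
```

For large n the factor approaches 1, so the correction only matters when the number of control samples is small.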
CV_observed is based on ratios of observed vs. predicted fractions of reads. CV_theoretical is based on the absolute number of reads in the sample. These seem to be in different units?!
According to http://en.wikipedia.org/wiki/Standard_score, the Z-score has the SD in the denominator, not the CV! So I guess you mean SD_theoretical and SD_observed?
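To make the difference concrete, a sketch in Python for illustration (function names are assumptions). Dividing by the CV instead of the SD rescales the score by the mean, so the two only coincide numerically when the reference mean is 1, as it would be for a ratio centred on 1. Whether that is the situation in the scripts is not something these notes can confirm.

```python
import statistics

def z_score(x, reference):
    """Standard score: deviation from the reference mean, in units of SD."""
    mu = statistics.mean(reference)
    sd = statistics.stdev(reference)
    return (x - mu) / sd

def score_with_cv(x, reference):
    """Same numerator, but divided by CV = SD / mean (not a standard z-score)."""
    mu = statistics.mean(reference)
    cv = statistics.stdev(reference) / mu
    return (x - mu) / cv
```

For a reference of 9, 10, 11 and x = 12, the first function gives 2 and the second gives 20; for a reference of 0.9, 1.0, 1.1 and x = 1.2, both give about 2.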
Z-scores seem to be bi-modal or even tri-modal; i.e., not normally distributed! How come?
The different prediction models should (1) agree and (2) be weighted to determine the end result.
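One possible way to do both, sketched in Python for illustration. Inverse-variance weighting and the max-minus-min agreement check are assumptions chosen for simplicity, not the scripts' method; the function names and the tolerance parameter are hypothetical.

```python
def combine_predictions(predictions, variances):
    """Inverse-variance weighted mean: more precise models get more weight."""
    if len(predictions) != len(variances) or not predictions:
        raise ValueError("need one positive variance per prediction")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total

def models_agree(predictions, tolerance):
    """True if the spread between the model predictions is within tolerance."""
    return max(predictions) - min(predictions) <= tolerance
```

A sample for which models_agree returns False could be flagged for inspection rather than reported with a single combined score.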
