forked from hail-is/hail
spec
jbloom22 edited this page Jan 11, 2017 · 2 revisions
- passback: String. Optional. Default: None. Returned as is in the response JSON to facilitate debugging.
- md_version: String. Required. Indicates the metadata version, including the dataset.
- api_version: Float. Required. Indicates the API version, used to synchronize clients.
- phenotype: String. Optional. Default: None. Name of the phenotype to be used as the response in regression. The list of available phenotypes is published through the metadata API.
- sample_covariates: Array of String. Optional. Default: None. phenotype must be present. Names of sample covariates to be projected out of the phenotype. The list of available sample covariates is published through the metadata API.
- variant_covariates: Array of Variant Objects. Optional. Default: None. phenotype must be present. Variants to be projected out of the phenotype. Each variant object consists of four key-value pairs:
  - chrom: String. Chromosome.
  - pos: Integer. Position.
  - ref: String. Reference allele.
  - alt: String. Alternate allele.
- variant_filters: Array of Filter Objects. Required. Current maximum window width is 600k. Filters are combined with logical AND. Each filter object consists of key-value pairs that define the filter operation, with the following keys:
  - operand: String. Property used in the formula (chrom, pos, mac).
  - operator: String. Formula operator (eq, gte, gt, lte, lt). eq is not allowed for mac.
  - value: String. Formula value.
  - operand_type: String. Property type (String, Integer).
- variant_list: Array of Variant Objects. Optional. Default: None. Each variant object consists of the same four key-value pairs as in variant_covariates (chrom, pos, ref, alt).
- variant_ld: Variant Object. Optional. Default: None. The variant against which LD is computed. It must be an active variant.
- compute_linreg: Boolean. Optional. Default: false. phenotype must be present. If true, linear regression statistics are computed.
- compute_ld_r: Boolean. Optional. Default: false. variant_ld must be present. If true, r is computed against variant_ld.
- compute_ld_d: Boolean. Optional. Default: false. variant_ld must be present. If true, D' is computed against variant_ld.
- compute_scores: Boolean. Optional. Default: false. phenotype must be present. If true, the vector u of scores is computed.
- compute_covariance: Boolean. Optional. Default: false. If true, the unscaled covariance matrix C is computed.
- compute_sigma_sq: Boolean. Optional. Default: false. phenotype must be present. If true, sigma_sq is computed.
- limit: Integer. Optional. Default: the current hard limit of 100k. Maximum number of variants returned.
- count: Boolean. Optional. Default: false. If true, only the number of active variants is returned, with no statistics.
Active variants are those variants in the dataset that satisfy all variant filters and, if a variant list is present, are in the variant list.
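The definition of active variants can be sketched in a few lines of Python. This is only an illustration of the filter semantics, not the server's implementation: the in-memory variant dictionaries, the `mac` key on each variant, and the helper names `passes` and `active_variants` are assumptions introduced here.

```python
import operator

# Comparison operators named in the spec, mapped to Python functions.
OPS = {
    "eq": operator.eq,
    "gte": operator.ge,
    "gt": operator.gt,
    "lte": operator.le,
    "lt": operator.lt,
}


def variant_key(v):
    """Identify a variant by its four spec fields."""
    return (v["chrom"], v["pos"], v["ref"], v["alt"])


def passes(variant, flt):
    """Apply one filter object to a variant dict (hypothetical layout)."""
    if flt["operand"] == "mac" and flt["operator"] == "eq":
        raise ValueError("eq is not allowed for mac")
    # Cast both sides according to operand_type (matched case-insensitively).
    cast = int if flt["operand_type"].lower() == "integer" else str
    lhs = cast(variant[flt["operand"]])
    rhs = cast(flt["value"])
    return OPS[flt["operator"]](lhs, rhs)


def active_variants(variants, variant_filters, variant_list=None):
    """Keep variants that satisfy every filter and, if given, are in variant_list."""
    allowed = None
    if variant_list is not None:
        allowed = {variant_key(v) for v in variant_list}
    return [
        v for v in variants
        if all(passes(v, f) for f in variant_filters)
        and (allowed is None or variant_key(v) in allowed)
    ]
```

Because the filters are combined with AND, a variant is dropped as soon as one filter fails; the variant list, when present, narrows the result further.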
When compute_covariance is true, we may impose harder limits (TBD) on the width of the window and size of variant_list.
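The required fields and the "must be present" dependencies in the request field list lend themselves to a simple client-side check before a request is sent. The sketch below is one possible way to encode them; `validate_request` and the two lookup tables are hypothetical names, and the real service may report these problems differently (for example through is_error and error_message).

```python
# Fields that the spec marks as Required.
REQUIRED_FIELDS = ("md_version", "api_version", "variant_filters")

# Field -> field it depends on, as stated in the request field list.
REQUIRES = {
    "sample_covariates": "phenotype",
    "variant_covariates": "phenotype",
    "compute_linreg": "phenotype",
    "compute_scores": "phenotype",
    "compute_sigma_sq": "phenotype",
    "compute_ld_r": "variant_ld",
    "compute_ld_d": "variant_ld",
}


def validate_request(req):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in req:
            errors.append(f"{field} is required")
    for field, needed in REQUIRES.items():
        # A false flag or an empty/absent list counts as "not requested".
        if req.get(field) and req.get(needed) is None:
            errors.append(f"{field} requires {needed}")
    return errors
```

For instance, a request that sets compute_ld_r to true without variant_ld yields the single message "compute_ld_r requires variant_ld".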
Example input:
{
"passback" : "example",
"md_version" : "mdv1",
"api_version" : 1,
"phenotype" : "t2d",
"sample_covariates" : [ "BMI", "PC1" ],
"variant_covariates" : [
{"chrom": "20", "pos": 2000, "ref": "T", "alt": "G"}
],
"variant_filters" : [
{"operand": "chrom", "operator": "eq", "value": "20", "operand_type": "string"},
{"operand": "pos", "operator": "gte", "value": 1000, "operand_type": "integer"},
{"operand": "pos", "operator": "lte", "value": 4000, "operand_type": "integer"},
{"operand": "mac", "operator": "gte", "value": 4, "operand_type": "integer"}
],
"variant_list" : [
{"chrom": "20", "pos": 1234, "ref": "G", "alt": "A"},
{"chrom": "20", "pos": 2900, "ref": "C", "alt": "T"}
],
"variant_ld" : {"chrom": "20", "pos": 1234, "ref": "G", "alt": "A"},
"compute_linreg" : true,
"compute_ld_r" : true,
"compute_ld_d" : true,
"compute_scores" : true,
"compute_covariance" : true,
"compute_sigma_sq" : true,
"limit" : 50,
"count" : false
}
- count: Integer. Number of active variants.
- active_variants: Array of Variant Objects. Active variants sorted by pos, ref, alt.
- betas: Array of Float. Betas indexed by active_variants. Present if compute_linreg is true.
- stderrs: Array of Float. Standard errors indexed by active_variants. Present if compute_linreg is true.
- zstats: Array of Float. Test statistics indexed by active_variants. Present if compute_linreg is true.
- pvals: Array of Float. p-values indexed by active_variants. Present if compute_linreg is true.
- ld_r: Array of Float. r-values with variant_ld indexed by active_variants. Present if compute_ld_r is true.
- ld_d: Array of Float. D' values with variant_ld indexed by active_variants. Present if compute_ld_d is true.
- scores: Array of Float. Scores u = (X - Xbar)^T y indexed by active_variants, where y is the residual phenotype. Present if compute_scores is true.
- covariance: Array of Float. Unscaled covariance matrix C = (X - Xbar)^T (X - Xbar) as an array of length n * (n + 1) / 2, where n = count. C is indexed by active_variants and encoded as an array via the upper triangle (row-major) or, equivalently, the lower triangle (column-major). For n = 3, the array index of each matrix entry is:
  [ 0 1 2 ]
  [ 1 3 4 ]
  [ 2 4 5 ]
  which corresponds to the following (row, column) orders:
  [ (0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2) ] (upper triangle, row-major)
  [ (0, 0), (1, 0), (2, 0), (1, 1), (2, 1), (2, 2) ] (lower triangle, column-major)
  The variance-covariance matrix V is given by sigma_sq * C. Present if compute_covariance is true.
- sigma_sq: Float. Variance of the residual phenotype y. Present if compute_sigma_sq is true.
- nsamples: Integer. Number of samples used (e.g., with phenotype and all covariates present). Present if phenotype is present.
- passback: String. Contains the passback value given in the original request.
- is_error: Boolean. True if the operation failed because of bad input or an internal error.
- error_message: String. Indicates the cause of failure. Present if is_error is true.
See the Methods section of Meta-Analysis of Gene Level Tests for Rare Variant Association for details on u, V, and sigma_sq.
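A client usually needs the full symmetric matrix rather than the packed array. The following sketch expands the packed upper triangle (row-major, diagonal included, so n * (n + 1) / 2 values, as in the six-entry 3 x 3 illustration) and scales it by sigma_sq to obtain V. The function names are hypothetical and plain Python lists are used only to keep the example dependency-free.

```python
def unpack_covariance(packed, n):
    """Expand the packed upper triangle (row-major) into a full n x n matrix."""
    if len(packed) != n * (n + 1) // 2:
        raise ValueError("expected n * (n + 1) / 2 packed entries")
    full = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            # C is symmetric, so each packed value fills (i, j) and (j, i).
            full[i][j] = full[j][i] = packed[k]
            k += 1
    return full


def variance_covariance(packed, n, sigma_sq):
    """V = sigma_sq * C, with C unpacked from the response array."""
    return [[sigma_sq * c for c in row] for row in unpack_covariance(packed, n)]
```

Passing the index array [0, 1, 2, 3, 4, 5] with n = 3 reproduces the 3 x 3 index matrix shown in the covariance field description.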
Example output:
{
"is_error" : false,
"passback" : "example",
"count" : 2,
"active_variants" : [
{"chrom": "20", "pos": 1234, "ref": "G", "alt": "A"},
{"chrom": "20", "pos": 2900, "ref": "C", "alt": "T"}
],
"betas" : [ 0.1, 2.0],
"stderrs" : [ 0.2, 1.0],
"zstats" : [ 0.5, 2.0],
"pvals" : [ 0.6171, 0.0455],
"ld_r" : [ 1.0, -0.1],
"ld_d" : [ 1.0, -0.2],
"scores" : [ 1.2, 0.4 ],
"covariance" : [ 1.4, -1.2, 0.9 ],
"sigma_sq" : 12.2,
"nsamples" : 2104
}
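The spec does not state how zstats and pvals are derived from betas and stderrs. The numbers in the example output are, however, consistent with z = beta / stderr and a two-sided p-value under a standard normal distribution, so a client could sanity-check a response along those lines. The sketch below rests on that assumption; the server may in fact use a t distribution or another test.

```python
import math


def zstat(beta, stderr):
    """Test statistic under the assumption z = beta / stderr."""
    return beta / stderr


def two_sided_normal_pval(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values taken from the example output.
betas = [0.1, 2.0]
stderrs = [0.2, 1.0]
zs = [zstat(b, s) for b, s in zip(betas, stderrs)]
ps = [two_sided_normal_pval(z) for z in zs]
```

With the example values, `zs` matches the reported zstats and `ps` agrees with the reported pvals to four decimal places.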