Additions to experimental module #37

@tsrobinson

Thanks to @antndlcrx we now have the basic treatment shock function. I will work up a demo for this at some point.

Looking forward, there are two further types of treatment shock we should model.

Interactive effects

First, what if we assume that, beyond the main effect, there is an interaction effect with another variable in the data? Suppose:

  • $\tilde{X}$ -- a sample from the ESS-trained SyGNet model with $n$ observations
  • $\tilde{y} \in \tilde{X}$ -- some outcome of interest from the synthetic data
  • $\tilde{z} \in \tilde{X}$ -- some variable already present in the synthetic data
  • $\mu, \sigma$ -- hypothesized treatment effect and noise parameters for the main effect
  • $\mu_\text{Int.}, \sigma_\text{Int.}$ -- hypothesized treatment effect and noise parameters for an interaction effect

We can then simulate a scenario where:

  • $d \sim \text{Binom.}(1, 0.5)$, drawn independently for each of the $n$ observations
  • $y' \sim \mathcal{N}\bigg(d \times \mu + \tilde{z} + d \times \tilde{z} \times \mathcal{N}(\mu_\text{Int.},\sigma_\text{Int.}), \ \sigma\bigg)$
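The interaction scenario above could be sketched in NumPy roughly as follows. This is only an illustration, not the package's implementation: the function name `interaction_shock` and its arguments are hypothetical, $\sigma$ and $\sigma_\text{Int.}$ are assumed to be standard deviations, and the inner $\mathcal{N}(\mu_\text{Int.},\sigma_\text{Int.})$ is assumed to be drawn once per observation.

```python
import numpy as np


def interaction_shock(z, mu, sigma, mu_int, sigma_int, rng=None):
    """Hypothetical helper: simulate y' with a main and an interaction effect.

    Assumes sigma and sigma_int are standard deviations and that the
    interaction coefficient is drawn independently for each observation.
    """
    rng = np.random.default_rng(rng)
    z = np.asarray(z, dtype=float)
    n = z.shape[0]

    # Treatment indicator: one fair coin flip per observation.
    d = rng.binomial(1, 0.5, size=n)

    # Per-observation interaction coefficient ~ N(mu_int, sigma_int).
    beta_int = rng.normal(mu_int, sigma_int, size=n)

    # Mean of the outcome: main effect + z + interaction term.
    mean = d * mu + z + d * z * beta_int

    # Outcome with noise of standard deviation sigma around that mean.
    y_prime = rng.normal(mean, sigma)
    return d, y_prime


# Example call on a small stand-in for the synthetic column z.
z_tilde = np.array([0.0, 1.0, 2.0, 3.0])
d, y_prime = interaction_shock(z_tilde, mu=2.0, sigma=1.0,
                               mu_int=0.5, sigma_int=0.1, rng=42)
```

Setting both noise parameters to zero makes the output deterministic given `d`, which is a convenient way to check the mean structure.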

Heterogeneous Treatment Effects (HTE)

Unlike in the interaction case, we might want to preserve the ATE by simulating an HTE where the main effect is a function of some third variable, centred on $\mu$. So:

  • $\tilde{X}$ -- a sample from the ESS-trained SyGNet model with $n$ observations
  • $\tilde{y} \in \tilde{X}$ -- some outcome of interest from the synthetic data
  • $\tilde{z} \in \tilde{X}$ -- some variable already present in the synthetic data
  • $\mu, \sigma$ -- hypothesized treatment effect and noise parameters for the main effect

We can simulate a scenario where:

  • $d \sim \text{Binom.}(1, 0.5)$, drawn independently for each of the $n$ observations
  • $\tilde{z}_\text{Z-score} = \frac{\tilde{z} - \text{Mean}(\tilde{z})}{\text{StdDev}(\tilde{z})}$
  • $y' \sim \mathcal{N}\bigg(d \times \mu \times (1 + \tilde{z}_\text{Z-score}), \ \sigma\bigg)$
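To see why this setup keeps the ATE at $\mu$, write $\tau_i$ for the unit-level effect (notation introduced here only for illustration). Because the z-score has sample mean zero, averaging over the sample gives:

$$
\tau_i = \mu \times \left(1 + \tilde{z}_{\text{Z-score},i}\right),
\qquad
\frac{1}{n}\sum_{i=1}^{n} \tau_i
= \mu \times \left(1 + \frac{1}{n}\sum_{i=1}^{n} \tilde{z}_{\text{Z-score},i}\right)
= \mu \times (1 + 0)
= \mu
$$

So units with above-average $\tilde{z}$ receive a larger effect and units with below-average $\tilde{z}$ a smaller one, while the average effect is unchanged.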

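One further complication we could add is a parameter $\psi$ to control the amount of heterogeneity:
$\tilde{z}_\text{Z-score} = \psi \times \frac{\tilde{z} - \text{Mean}(\tilde{z})}{\text{StdDev}(\tilde{z})}$,
and then we keep the same outcome equation. Setting $\psi = 0$ recovers a homogeneous effect of $\mu$, while $\psi = 1$ gives the unscaled case above; the ATE stays at $\mu$ for any $\psi$ because the scaled z-score still has mean zero.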
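A minimal NumPy sketch of the HTE scenario, including the $\psi$ scaling, might look like this. The function name `hte_shock` and its signature are hypothetical, $\sigma$ is assumed to be a standard deviation, and the z-score uses the population standard deviation (`ddof=0`); none of these choices are fixed by the proposal.

```python
import numpy as np


def hte_shock(z, mu, sigma, psi=1.0, rng=None):
    """Hypothetical helper: simulate y' with a heterogeneous treatment effect.

    Assumes sigma is a standard deviation and that z is not constant
    (otherwise the z-score is undefined).
    """
    rng = np.random.default_rng(rng)
    z = np.asarray(z, dtype=float)

    # Scaled z-score of the moderator; psi controls the heterogeneity.
    z_score = psi * (z - z.mean()) / z.std()

    # Treatment indicator: one fair coin flip per observation.
    d = rng.binomial(1, 0.5, size=z.shape[0])

    # Unit-level effect, centred on mu because z_score has mean zero.
    effect = mu * (1.0 + z_score)

    # Outcome drawn around d * effect with standard deviation sigma.
    y_prime = rng.normal(d * effect, sigma)
    return d, y_prime, effect


# Example call on a small stand-in for the synthetic column z.
z_tilde = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
d, y_prime, effect = hte_shock(z_tilde, mu=2.0, sigma=1.0, psi=0.5, rng=7)
```

Returning the unit-level `effect` alongside `d` and `y_prime` is a design choice for this sketch: it makes it easy to confirm that the effects average to $\mu$ and that $\psi = 0$ collapses them to a constant.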

Labels: new feature (New feature or request), refine (Improvements to code short of a bug fix)