Commit 5e31300

Merge branch 'master' of github.com:DoubleML/doubleml-serverless into 0.0.X
2 parents 2881ba4 + 7a95d0d commit 5e31300

11 files changed, with 115 additions and 99 deletions

.github/workflows/pytest.yml

Lines changed: 20 additions & 5 deletions
@@ -19,20 +19,35 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ['3.6', '3.7', '3.8', '3.9']
+        config:
+          - {python-version: '3.6', doubleml-version: 'release'}
+          - {python-version: '3.7', doubleml-version: 'release'}
+          - {python-version: '3.8', doubleml-version: 'release'}
+          - {python-version: '3.8', doubleml-version: 'dev'}
+          - {python-version: '3.9', doubleml-version: 'release'}
 
     steps:
     - uses: actions/checkout@v2
-    - name: Set up Python ${{ matrix.python-version }}
+    - name: Set up Python ${{ matrix.config.python-version }}
       uses: actions/setup-python@v2
       with:
-        python-version: ${{ matrix.python-version }}
+        python-version: ${{ matrix.config.python-version }}
+    - uses: actions/checkout@v2
+      if: matrix.config.doubleml-version == 'dev'
+      with:
+        repository: DoubleML/doubleml-for-py
+        path: doubleml-for-py
+    - name: DoubleML dev version
+      if: matrix.config.doubleml-version == 'dev'
+      run: |
+        cd doubleml-for-py
+        pip install --editable .
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
-        python -m pip install pytest
+        python -m pip install pytest xgboost
         pip install -r requirements.txt
         pip install .
     - name: Test with pytest
       run: |
-        pytest
+        pytest doubleml_serverless/

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-Copyright (c) 2020 Malte S. Kurz
+Copyright (c) 2020-2021 Malte S. Kurz
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 5 additions & 4 deletions
@@ -7,7 +7,7 @@ DoubleML-Serverless is an extension for serverless cloud computing of the Python
 DoubleML is available via PyPI [https://pypi.org/project/DoubleML](https://pypi.org/project/DoubleML) and on GitHub [https://github.com/DoubleML/doubleml-for-py](https://github.com/DoubleML/doubleml-for-py).
 The Python package DoubleML was introduced in
 "DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python"
-([Bach et al., 2021](https://arxiv.org/abs/2104.03220))
+([Bach et al., 2022](https://www.jmlr.org/papers/v23/21-0862.html))
 and a detailed documentation \& user guide for the package is available at
 [https://docs.doubleml.org](https://docs.doubleml.org).
 
@@ -149,9 +149,10 @@ Bibtex-entry:
 
 ## References
 
-Bach, P., Chernozhukov, V., Kurz, M. S., and Spindler, M. (2021).
-DoubleML - An Object-Oriented Implementation of Double Machine Learning in Python.
-arXiv:[2104.03220](https://arxiv.org/abs/2104.03220).
+Bach, P., Chernozhukov, V., Kurz, M. S., and Spindler, M. (2022), DoubleML - An
+Object-Oriented Implementation of Double Machine Learning in Python,
+Journal of Machine Learning Research, 23(53): 1-6,
+[https://www.jmlr.org/papers/v23/21-0862.html](https://www.jmlr.org/papers/v23/21-0862.html).
 
 Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W. and Robins, J. (2018).
 Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21: C1-C68.

doubleml_serverless/double_ml_iivm_aws_lambda.py

Lines changed: 15 additions & 15 deletions
@@ -26,22 +26,22 @@ def __init__(self,
                  draw_sample_splitting=True,
                  apply_cross_fitting=True):
         DoubleMLIIVM.__init__(self,
-                              obj_dml_data,
-                              ml_g,
-                              ml_m,
-                              ml_r,
-                              n_folds,
-                              n_rep,
-                              score,
-                              subgroups,
-                              dml_procedure,
-                              trimming_rule,
-                              trimming_threshold,
-                              draw_sample_splitting,
-                              apply_cross_fitting)
+                              obj_dml_data=obj_dml_data,
+                              ml_g=ml_g,
+                              ml_m=ml_m,
+                              ml_r=ml_r,
+                              n_folds=n_folds,
+                              n_rep=n_rep,
+                              score=score,
+                              subgroups=subgroups,
+                              dml_procedure=dml_procedure,
+                              trimming_rule=trimming_rule,
+                              trimming_threshold=trimming_threshold,
+                              draw_sample_splitting=draw_sample_splitting,
+                              apply_cross_fitting=apply_cross_fitting)
         DoubleMLLambda.__init__(self,
-                                lambda_function_name,
-                                aws_region)
+                                lambda_function_name=lambda_function_name,
+                                aws_region=aws_region)
 
     def _ml_nuisance_aws_lambda(self, cv_params):
         assert self._dml_data.n_treat == 1

doubleml_serverless/double_ml_irm_aws_lambda.py

Lines changed: 13 additions & 14 deletions
@@ -1,5 +1,4 @@
 from doubleml import DoubleMLIRM
-import numpy as np
 from sklearn.utils import check_X_y
 
 from ._helper import _get_cond_smpls
@@ -24,20 +23,20 @@ def __init__(self,
                  draw_sample_splitting=True,
                  apply_cross_fitting=True):
         DoubleMLIRM.__init__(self,
-                             obj_dml_data,
-                             ml_g,
-                             ml_m,
-                             n_folds,
-                             n_rep,
-                             score,
-                             dml_procedure,
-                             trimming_rule,
-                             trimming_threshold,
-                             draw_sample_splitting,
-                             apply_cross_fitting)
+                             obj_dml_data=obj_dml_data,
+                             ml_g=ml_g,
+                             ml_m=ml_m,
+                             n_folds=n_folds,
+                             n_rep=n_rep,
+                             score=score,
+                             dml_procedure=dml_procedure,
+                             trimming_rule=trimming_rule,
+                             trimming_threshold=trimming_threshold,
+                             draw_sample_splitting=draw_sample_splitting,
+                             apply_cross_fitting=apply_cross_fitting)
         DoubleMLLambda.__init__(self,
-                                lambda_function_name,
-                                aws_region)
+                                lambda_function_name=lambda_function_name,
+                                aws_region=aws_region)
 
     def _ml_nuisance_aws_lambda(self, cv_params):
         assert self._dml_data.n_treat == 1

doubleml_serverless/double_ml_pliv_aws_lambda.py

Lines changed: 19 additions & 18 deletions
@@ -11,7 +11,7 @@ def __init__(self,
                  lambda_function_name,
                  aws_region,
                  obj_dml_data,
-                 ml_g,
+                 ml_l,
                  ml_m,
                  ml_r,
                  n_folds=5,
@@ -21,19 +21,19 @@ def __init__(self,
                  draw_sample_splitting=True,
                  apply_cross_fitting=True):
         DoubleMLPLIV.__init__(self,
-                              obj_dml_data,
-                              ml_g,
-                              ml_m,
-                              ml_r,
-                              n_folds,
-                              n_rep,
-                              score,
-                              dml_procedure,
-                              draw_sample_splitting,
-                              apply_cross_fitting)
+                              obj_dml_data=obj_dml_data,
+                              ml_l=ml_l,
+                              ml_m=ml_m,
+                              ml_r=ml_r,
+                              n_folds=n_folds,
+                              n_rep=n_rep,
+                              score=score,
+                              dml_procedure=dml_procedure,
+                              draw_sample_splitting=draw_sample_splitting,
+                              apply_cross_fitting=apply_cross_fitting)
         DoubleMLLambda.__init__(self,
-                                lambda_function_name,
-                                aws_region)
+                                lambda_function_name=lambda_function_name,
+                                aws_region=aws_region)
 
     def _ml_nuisance_aws_lambda(self, cv_params):
         assert self._dml_data.n_treat == 1
@@ -47,12 +47,12 @@ def _ml_nuisance_aws_lambda(self, cv_params):
 
         payload = self._dml_data.get_payload()
 
-        payload_ml_g = payload.copy()
+        payload_ml_l = payload.copy()
         payload_ml_m = payload.copy()
         payload_ml_r = payload.copy()
 
-        _attach_learner(payload_ml_g,
-                        'ml_g', self.learner['ml_g'],
+        _attach_learner(payload_ml_l,
+                        'ml_l', self.learner['ml_l'],
                         self._dml_data.y_col, self._dml_data.x_cols)
 
         _attach_learner(payload_ml_m,
@@ -63,7 +63,7 @@ def _ml_nuisance_aws_lambda(self, cv_params):
                         'ml_r', self.learner['ml_r'],
                         self._dml_data.d_cols[0], self._dml_data.x_cols)
 
-        payloads = _attach_smpls([payload_ml_g, payload_ml_m, payload_ml_r],
+        payloads = _attach_smpls([payload_ml_l, payload_ml_m, payload_ml_r],
                                  [self.smpls, self.smpls, self.smpls],
                                  self.n_folds,
                                  self.n_rep,
@@ -80,9 +80,10 @@ def _ml_nuisance_aws_lambda(self, cv_params):
             # compute score elements
             self._psi_a[:, i_rep, self._i_treat], self._psi_b[:, i_rep, self._i_treat] = \
                 self._score_elements(y, z, d,
-                                     preds['ml_g'][:, i_rep],
+                                     preds['ml_l'][:, i_rep],
                                      preds['ml_m'][:, i_rep],
                                      preds['ml_r'][:, i_rep],
+                                     None,
                                      self.smpls[i_rep])
 
         return
Lines changed: 19 additions & 19 deletions
@@ -1,17 +1,16 @@
 from doubleml import DoubleMLPLR
-import numpy as np
 from sklearn.utils import check_X_y
 
 from .double_ml_aws_lambda import DoubleMLLambda
-from ._helper import _attach_learner, _attach_smpls, _extract_preds
+from ._helper import _attach_learner, _attach_smpls
 
 
 class DoubleMLPLRServerless(DoubleMLPLR, DoubleMLLambda):
     def __init__(self,
                  lambda_function_name,
                  aws_region,
                  obj_dml_data,
-                 ml_g,
+                 ml_l,
                  ml_m,
                  n_folds=5,
                  n_rep=1,
@@ -20,18 +19,18 @@ def __init__(self,
                  draw_sample_splitting=True,
                  apply_cross_fitting=True):
         DoubleMLPLR.__init__(self,
-                             obj_dml_data,
-                             ml_g,
-                             ml_m,
-                             n_folds,
-                             n_rep,
-                             score,
-                             dml_procedure,
-                             draw_sample_splitting,
-                             apply_cross_fitting)
+                             obj_dml_data=obj_dml_data,
+                             ml_l=ml_l,
+                             ml_m=ml_m,
+                             n_folds=n_folds,
+                             n_rep=n_rep,
+                             score=score,
+                             dml_procedure=dml_procedure,
+                             draw_sample_splitting=draw_sample_splitting,
+                             apply_cross_fitting=apply_cross_fitting)
         DoubleMLLambda.__init__(self,
-                                lambda_function_name,
-                                aws_region)
+                                lambda_function_name=lambda_function_name,
+                                aws_region=aws_region)
 
     def _ml_nuisance_aws_lambda(self, cv_params):
         assert self._dml_data.n_treat == 1
@@ -42,18 +41,18 @@ def _ml_nuisance_aws_lambda(self, cv_params):
 
         payload = self._dml_data.get_payload()
 
-        payload_ml_g = payload.copy()
+        payload_ml_l = payload.copy()
         payload_ml_m = payload.copy()
 
-        _attach_learner(payload_ml_g,
-                        'ml_g', self.learner['ml_g'],
+        _attach_learner(payload_ml_l,
+                        'ml_l', self.learner['ml_l'],
                         self._dml_data.y_col, self._dml_data.x_cols)
 
         _attach_learner(payload_ml_m,
                         'ml_m', self.learner['ml_m'],
                         self._dml_data.d_cols[0], self._dml_data.x_cols)
 
-        payloads = _attach_smpls([payload_ml_g, payload_ml_m],
+        payloads = _attach_smpls([payload_ml_l, payload_ml_m],
                                  [self.smpls, self.smpls],
                                  self.n_folds,
                                  self.n_rep,
@@ -70,8 +69,9 @@ def _ml_nuisance_aws_lambda(self, cv_params):
             # compute score elements
             self._psi_a[:, i_rep, self._i_treat], self._psi_b[:, i_rep, self._i_treat] = \
                 self._score_elements(y, d,
-                                     preds['ml_g'][:, i_rep],
+                                     preds['ml_l'][:, i_rep],
                                      preds['ml_m'][:, i_rep],
+                                     None,
                                      self.smpls[i_rep])
 
         return

doubleml_serverless/tests/test_pliv.py

Lines changed: 10 additions & 10 deletions
@@ -58,16 +58,16 @@ def dml_pliv_fixture(generate_data_pliv, idx, learner, score, dml_procedure):
     x_cols = data.columns[data.columns.str.startswith('X')].tolist()
 
     # Set machine learning methods for m & g
-    ml_g = clone(learner)
+    ml_l = clone(learner)
     ml_m = clone(learner)
     ml_r = clone(learner)
 
     np.random.seed(3141)
     dml_data_json = dml_lambda.DoubleMLDataJson(data, 'y', ['d'], x_cols, 'Z1')
     dml_pliv_lambda = DoubleMLPLIVServerlessLocal('local', 'local',
                                                   dml_data_json,
-                                                  ml_g, ml_m, ml_r,
-                                                  n_folds,
+                                                  ml_l, ml_m, ml_r,
+                                                  n_folds=n_folds,
                                                   score=score,
                                                   dml_procedure=dml_procedure)
 
@@ -76,8 +76,8 @@ def dml_pliv_fixture(generate_data_pliv, idx, learner, score, dml_procedure):
     np.random.seed(3141)
     dml_data = dml.DoubleMLData(data, 'y', ['d'], x_cols, 'Z1')
     dml_pliv = dml.DoubleMLPLIV(dml_data,
-                                ml_g, ml_m, ml_r,
-                                n_folds,
+                                ml_l, ml_m, ml_r,
+                                n_folds=n_folds,
                                 score=score,
                                 dml_procedure=dml_procedure)
 
@@ -140,7 +140,7 @@ def dml_pliv_scaling_fixture(generate_data_pliv, idx, learner, score, dml_proced
     x_cols = data.columns[data.columns.str.startswith('X')].tolist()
 
     # Set machine learning methods for m & g
-    ml_g = clone(learner)
+    ml_l = clone(learner)
     ml_m = clone(learner)
     ml_r = clone(learner)
 
@@ -149,8 +149,8 @@ def dml_pliv_scaling_fixture(generate_data_pliv, idx, learner, score, dml_proced
     np.random.seed(3141)
     dml_pliv_folds = DoubleMLPLIVServerlessLocal('local', 'local',
                                                  dml_data_json,
-                                                 ml_g, ml_m, ml_r,
-                                                 n_folds,
+                                                 ml_l, ml_m, ml_r,
+                                                 n_folds=n_folds,
                                                  score=score,
                                                  dml_procedure=dml_procedure)
 
@@ -159,8 +159,8 @@ def dml_pliv_scaling_fixture(generate_data_pliv, idx, learner, score, dml_proced
     np.random.seed(3141)
     dml_pliv_reps = DoubleMLPLIVServerlessLocal('local', 'local',
                                                 dml_data_json,
-                                                ml_g, ml_m, ml_r,
-                                                n_folds,
+                                                ml_l, ml_m, ml_r,
+                                                n_folds=n_folds,
                                                 score=score,
                                                 dml_procedure=dml_procedure)
 
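
Note on the positional-to-keyword change: the recurring edit in this commit replaces positional arguments with keyword arguments in the explicit parent `__init__` calls. The sketch below illustrates why that can matter. It is a minimal illustration under stated assumptions, not code from DoubleML or doubleml-serverless: `Parent`, `Remote`, `PositionalChild` and `KeywordChild` are hypothetical stand-ins, and the swapped parameter order in `Parent` is invented for the demonstration.

```python
class Parent:
    """Hypothetical upstream class whose latest release lists
    n_rep before n_folds (assumed for illustration only)."""

    def __init__(self, data, ml_l, ml_m, n_rep=1, n_folds=5):
        self.data = data
        self.ml_l = ml_l
        self.ml_m = ml_m
        self.n_rep = n_rep
        self.n_folds = n_folds


class Remote:
    """Hypothetical mixin holding the remote-execution settings."""

    def __init__(self, function_name, region):
        self.function_name = function_name
        self.region = region


class PositionalChild(Parent, Remote):
    def __init__(self, function_name, region, data, ml_l, ml_m,
                 n_folds=5, n_rep=1):
        # Written against an older parameter order: the values are
        # bound by position, so n_folds and n_rep are silently swapped.
        Parent.__init__(self, data, ml_l, ml_m, n_folds, n_rep)
        Remote.__init__(self, function_name, region)


class KeywordChild(Parent, Remote):
    def __init__(self, function_name, region, data, ml_l, ml_m,
                 n_folds=5, n_rep=1):
        # Values are bound by name, so the parent's parameter order
        # does not matter.
        Parent.__init__(self, data=data, ml_l=ml_l, ml_m=ml_m,
                        n_folds=n_folds, n_rep=n_rep)
        Remote.__init__(self, function_name=function_name, region=region)


pos = PositionalChild("fn", "region", "data", "l", "m", n_folds=3, n_rep=2)
kw = KeywordChild("fn", "region", "data", "l", "m", n_folds=3, n_rep=2)

print(pos.n_folds, pos.n_rep)  # 2 3 (swapped without any error)
print(kw.n_folds, kw.n_rep)    # 3 2 (as requested)
```

With keyword forwarding, a parameter that is renamed upstream (as `ml_g` becomes `ml_l` in this diff) raises a `TypeError` at the call site instead of being bound to whatever parameter now occupies that position.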