Skip to content

Conversation

@benclifford
Copy link
Contributor

This PR:

  • uses auto-sklearn instead of NNI (although it does not remove the NNI code)
  • adds a testing script
  • makes the code run from the root of a blindml checkout, rather than /Users/maksim

This probably means that the code will only work run from the blindml source directory.

but it's better than not running.
… least

tooling as installed by this is broken right now for me in a clean venv
- pip-compile is broken because pip-tools breaks with recent pip, and apparently
the requirements.txt doesnt pin that stuff hard enough.

- the breakage only happens when I try to add in auto-sktools to requirements.txt
and rebuild...

this is an upgrade of pip-tools and a rerun of the regenerate/reinstall until
a fixpoint is reached in my env

first manually edit requirements.txt because pip-compile cant do this itself to
constrain pip-tools>=5.5

then:
run this over and over until no more git diff:

(looks like one iteration is ok for me actually)

rm -rfv ~/cqx/c/datastation/virtualenv/ && virtualenv /home/benc/cqx/c/datastation/virtualenv/ --python=python3 && pip install -r requirements.txt --extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple && pip-compile --extra-index-url https://pypi.anaconda.org/scipy-wheels-nightly/simple
…roken

This gives basic linting for style errors and some basic static semantic checking.

Future commits could tidy up the broken flake8 tests and re-enable them.

Run with:

  $ flake8
Remove existing test that fails for me

Adjust path of other existing test to run from blindml repo root for when it is invoked by pytest there
Comment on lines 80 to 81
# ^ should this drop the y_col even when there isn't a drop_cols
# configuration?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes because otherwise we're training on the y_col to predict y_col (i.e. poisoning the training)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess if the user specifies the ycol in their explicitly specified X_cols, that's their own fault, though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup major party foul (but there should also probably be a check)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like the explicit X_cols code path was broken anyway - I've put in a fix and added a test case.

I've also added in an assert in the explicit X_cols path for "no ycol in xcols"

@makslevental
Copy link
Contributor

makslevental commented Feb 26, 2021

Should've put this in the review whoops: please remove references to nni (including helper/util code)

Prior to this, auto-sklearn was only called in regression mode

can i get a classification test/demo?

does the previous NNI implementation in blindml support classification, to test my changes against?
I couldn't get it to work on a simple csv that i made up
@benclifford benclifford force-pushed the benc-replace-nni-autosklearn branch from 92bada1 to d6ae253 Compare March 1, 2021 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants