Conversation
LastShekel
left a comment
Check out the NumPy documentation; it should help you improve the code's performance.
_RANDOM_STATE = 42

def __init__(self, measures, kappa, estimator, score):
    self._measures = measures
Functions need docstrings that explain each argument and its type.
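A sketch of what such a docstring could look like, in NumPy docstring style. The class name `ExampleFilter`, the attributes `_estimator` and `_score`, and every parameter description are assumptions for illustration, since the diff does not show what the arguments mean.

```python
class ExampleFilter:  # hypothetical name; the real class is not shown in the diff
    """One-line summary of what the class does (placeholder)."""

    _RANDOM_STATE = 42

    def __init__(self, measures, kappa, estimator, score):
        """Store the configuration.

        Parameters
        ----------
        measures : sequence of callable
            Each item is called as ``measure(X, y)``, as in the diff.
            Assumed to return one value per feature.
        kappa : int
            Assumed to be a positive rank used when selecting planes.
        estimator : object
            Assumed to be the model that gets evaluated.
        score : callable
            Assumed to be the function that rates the estimator.
        """
        self._measures = measures
        self._kappa = kappa
        self._estimator = estimator  # assumed attribute name
        self._score = score          # assumed attribute name
```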
for j in range(n_measures):
    score = self._measures[j](X, y)
    for i in range(n_features):
Why don't you use NumPy array assignment instead of the inner loop?
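A minimal sketch of the idea, assuming the inner loop copies `score[i]` into column `j` of a `(n_features, n_measures)` array; the body of the loop is not visible in the diff, so that layout and the helper name `build_planes` are guesses.

```python
import numpy as np

def build_planes(measures, X, y):
    """Fill one column per measure with a single slice assignment."""
    n_features = X.shape[1]
    planes = np.empty((n_features, len(measures)))
    for j, measure in enumerate(measures):
        # Replaces `for i in range(n_features): planes[i][j] = score[i]`
        planes[:, j] = measure(X, y)
    return planes
```

Each measure is assumed to return one value per feature, so the slice assignment writes the whole column at once.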
maximum = planes.max()
min_max_diff = maximum - minimum

for i in range(shape[0]):
Again, you could use array arithmetic here, for example
(planes - minimum) / min_max_diff
or use scikit-learn's MinMaxScaler.
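For illustration, the whole loop can collapse into one broadcast expression. This sketch assumes `minimum` is `planes.min()` over the entire array, mirroring `planes.max()` in the diff; the zero-range guard and the function name `normalize` are additions that the original code may or may not need.

```python
import numpy as np

def normalize(planes):
    """Scale all values into [0, 1] using the global min and max."""
    planes = np.asarray(planes, dtype=float)
    minimum = planes.min()
    min_max_diff = planes.max() - minimum
    if min_max_diff == 0:
        # Assumption: a constant input maps to all zeros instead of dividing by zero.
        return np.zeros_like(planes)
    return (planes - minimum) / min_max_diff
```

Note that scikit-learn's `MinMaxScaler` scales each column separately, so it is only a drop-in replacement if per-column scaling is what you want.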
    return normalized

def _kappa_filter(self, planes):
    n_measures = len(self._measures)
You are repeating this line; it would be better to store the value in a class field.
n_measures = len(self._measures)

indexed = []
for i, plane in enumerate(planes):
It seems you could just assign the result of enumerate to indexed, without the for loop.
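Something like the following should be equivalent, assuming the loop body only appends `(i, plane)` pairs (the body is not shown in the diff; the sample data is made up).

```python
planes = [[0.2, 0.9], [0.5, 0.1], [0.7, 0.4]]  # made-up sample data

# One expression instead of an explicit append loop.
indexed = list(enumerate(planes))
```

The `(index, plane)` layout matches the later accesses `p[1][i]` and `planes[...][0]` in the diff.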
kappa_indices = set()
for i in range(n_measures):
    planes = sorted(planes, key=lambda p: p[1][i])
    kappa_indices.add(planes[-self._kappa][0])
What if -self._kappa falls outside the list's index bounds?
It would be better to check this as early as possible.
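One possible early check, as a sketch. It assumes the valid range is `1 <= kappa <= len(planes)`: a value of 0 would not raise, because `planes[-0]` silently returns the first element, and anything above the length raises `IndexError`. The helper name `validate_kappa` is hypothetical.

```python
def validate_kappa(kappa, n_planes):
    """Raise early instead of failing (or misbehaving) at planes[-kappa]."""
    if not 1 <= kappa <= n_planes:
        raise ValueError(
            f"kappa must be between 1 and {n_planes}, got {kappa}"
        )
    return kappa
```

The check could run at the top of `_kappa_filter`, once the number of planes is known.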
filtered_indices = set()
for i in range(n_measures):
    planes.sort(key=lambda p: p[1][i])
It seems you have already sorted this list earlier.
planes.sort(key=lambda p: p[1][i])

left = 0
while planes[left][0] not in kappa_indices:
NumPy where or count should be faster here.
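A sketch of how the scan could look with `np.isin` and `np.where`, assuming the `while` loop only advances `left` until it reaches the first plane whose index is in `kappa_indices`. The function name and the `order` argument (the plane indices in sorted order) are illustrative.

```python
import numpy as np

def first_kappa_position(order, kappa_indices):
    """Return the first position in `order` whose value is in `kappa_indices`."""
    # np.isin expects an array-like, so the set is converted to a list first.
    mask = np.isin(order, list(kappa_indices))
    hits = np.where(mask)[0]
    if hits.size == 0:
        raise ValueError("no element of `order` is in `kappa_indices`")
    return int(hits[0])
```

Unlike the `while` loop, this version raises a clear error instead of running past the end of the list when nothing matches.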
intersection = SortedSet(key=cmp_to_key(_double_list_cmp))
for k in range(dim):
    for l in range(k + 1, dim):
This could definitely be optimised with NumPy.
Four nested loops is too many.
for j in range(dim):
    point = np.zeros(dim)
    point[j] = planes[i][j]
This could be done with a NumPy diagonal array.
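As a sketch: `np.diag` applied to a 1-D array builds a square matrix with that array on the diagonal, so row `j` is exactly the `point` the loop constructs in iteration `j`. The helper name `axis_points` is made up, and how the points are used afterwards is not visible in the diff.

```python
import numpy as np

def axis_points(plane):
    """Return a (dim, dim) array whose row j is zero except plane[j] at position j."""
    return np.diag(np.asarray(plane, dtype=float))
```

With this, `axis_points(planes[i])` would replace the whole `for j in range(dim)` loop, provided the rest of the code can consume all points at once.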
Pull Request Template
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes, or link to them. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
Checklist: