While working on #116, I noticed that the sub_sampling function of feta_network is broken. It's not exercised in our standard test suite, since it's only needed when the number of objects is higher than the 5 objects our test suite uses.
The function is implemented as follows:
def sub_sampling(self, X, Y):
    if self.n_objects_fit_ > self.max_number_of_objects:
        bucket_size = int(self.n_objects_fit_ / self.max_number_of_objects)
        idx = self.random_state_.randint(
            bucket_size, size=(len(X), self.n_objects_fit_)
        )
        # TODO: subsampling multiple rankings
        idx += np.arange(start=0, stop=self.n_objects_fit_, step=bucket_size)[
            : self.n_objects_fit_
        ]
        X = X[np.arange(len(X))[:, None], idx]
        Y = Y[np.arange(len(X))[:, None], idx]
        tmp_sort = Y.argsort(axis=-1)
        Y = np.empty_like(Y)
        Y[np.arange(len(X))[:, None], tmp_sort] = np.arange(self.n_objects_fit_)
    return X, Y
and breaks at the idx += line because of a dimension mismatch. It's trying to add (in place, with broadcasting) arrays like
[[0 1 0 0 0]
[0 0 1 1 0]]
and
i.e. a 2d array with a 1d array. I'm not sure how this sampling is supposed to work. Is the intention documented somewhere @kiudee @prithagupta?
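
For reference, here is a minimal standalone sketch that reproduces the shape mismatch outside the class. The sizes (2 instances, 10 objects, a maximum of 5) are made up just to trigger the subsampling branch, and the second half is only my guess at the intent (draw one random object per bucket), not something I found documented anywhere.

```python
import numpy as np

# Hypothetical sizes, chosen only so that the subsampling branch is taken.
n_instances = 2
n_objects_fit = 10
max_number_of_objects = 5

random_state = np.random.RandomState(0)
bucket_size = int(n_objects_fit / max_number_of_objects)

# Same shapes as in sub_sampling: one random offset per original object ...
idx = random_state.randint(bucket_size, size=(n_instances, n_objects_fit))
# ... but only one bucket start per bucket.
bucket_starts = np.arange(start=0, stop=n_objects_fit, step=bucket_size)[
    :n_objects_fit
]
print(idx.shape, bucket_starts.shape)

error = None
try:
    idx += bucket_starts  # 2d array with 10 columns vs. 1d array of length 5
except ValueError as exc:
    error = exc
    print("broadcast failed:", exc)

# Guess at the intent (an assumption): one random offset per bucket, so the
# index array has max_number_of_objects columns and broadcasts row-wise.
idx_guess = random_state.randint(
    bucket_size, size=(n_instances, max_number_of_objects)
)
idx_guess = idx_guess + bucket_starts[:max_number_of_objects]
print(idx_guess)
```

If that guess is right, each column of idx_guess picks one object from its own bucket, and the final np.arange(self.n_objects_fit_) in the function would presumably have to shrink to the subsampled size as well.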