This repository was archived by the owner on Jan 31, 2024. It is now read-only.

K-means Clustering #2

@kaiyanl

Hi Imad,

Your article, K-means Clustering: Algorithm, Applications, Evaluation Methods, and Drawbacks, is great!

I would like to point out a couple of issues:

  • In the Kmeans class implementation

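    1. In def initializ_centroids, calling np.random.RandomState(...) on a line by itself has no effect, because the generator it returns is discarded and the following np.random.permutation call still uses the global state (FYR). You could do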
      $ r = np.random.RandomState(self.random_state)
      $ random_idx = r.permutation(X.shape[0])
    2. In def predict, old_centroids is out of scope, since it is never defined inside that method. You could use the fitted centroids stored on the instance:
      $ distance = self.compute_distance(X, self.centroids)
    3. In def compute_distance, squaring the calculated distance is unnecessary, although it doesn't hurt. I would do
      $ distance[:, k] = norm(X - centroids[k, :], axis=1)
  • In the Image Compression example, the following description doesn't make sense to me:
    "The original image size was 396 x 396 x 24 = 3,763,584 bits; however, the new compressed image would be 30 x 24 + 396 x 396 x 4 = 627,984 bits. The huge difference comes from the fact that we’ll be using centroids as a lookup for pixels’ colors and that would reduce the size of each pixel location to 4-bit instead of 8-bit."

    1. The original size of the image is 396 x 396 x 24 bits because the image has 396 x 396 pixels in total and each pixel has a 24-bit color representation. After compression, however, each pixel takes one of 30 colors, which needs at least 5 bits to index (4 bits can only represent 16 colors). Adding the overhead of storing the 30 colors themselves, the number of bits should be 30 x 24 + 396 x 396 x 5 = 784,800.
    2. The number of bits at each pixel location is therefore reduced from 24 to 5, not from 8 to 4.
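
For reference, the three Kmeans suggestions above could be put together roughly as follows. This is only a trimmed-down sketch, not the article's full class: the constructor signature and the random_state default are assumptions, fit is left out entirely, and the centroids are set by hand in the usage lines.

```python
import numpy as np
from numpy.linalg import norm


class Kmeans:
    """Sketch containing only the three methods discussed in this issue."""

    def __init__(self, n_clusters, random_state=123):  # assumed signature
        self.n_clusters = n_clusters
        self.random_state = random_state
        self.centroids = None  # would normally be set by fit()

    def initializ_centroids(self, X):
        # Keep the generator and draw from it; a bare RandomState(...) call
        # creates a generator that is immediately thrown away.
        r = np.random.RandomState(self.random_state)
        random_idx = r.permutation(X.shape[0])
        return X[random_idx[:self.n_clusters]]

    def compute_distance(self, X, centroids):
        distance = np.zeros((X.shape[0], self.n_clusters))
        for k in range(self.n_clusters):
            # Plain Euclidean distance; squaring would not change the argmin.
            distance[:, k] = norm(X - centroids[k, :], axis=1)
        return distance

    def predict(self, X):
        # Use the centroids stored on the instance, not a local from fit().
        distance = self.compute_distance(X, self.centroids)
        return np.argmin(distance, axis=1)


X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
km = Kmeans(n_clusters=2, random_state=0)
km.centroids = np.array([[0.0, 0.0], [5.0, 5.0]])  # hand-picked for the demo
labels = km.predict(X)
```

With the same random_state, repeated calls to initializ_centroids return the same rows, which is the reproducibility the original line was presumably aiming for.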
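
The image-size arithmetic above can be checked with a few lines of Python. The helper name compressed_bits is made up for illustration, and it assumes every pixel stores a fixed-width index into the color table with no further encoding.

```python
def compressed_bits(height, width, n_colors, bits_per_color=24):
    """Bits for a palette image: color table plus one fixed-width index per pixel."""
    # Smallest number of bits that can index n_colors distinct values.
    bits_per_index = (n_colors - 1).bit_length()
    palette_bits = n_colors * bits_per_color
    return palette_bits + height * width * bits_per_index


original = 396 * 396 * 24                   # 3,763,584 bits
compressed = compressed_bits(396, 396, 30)  # 30 x 24 + 396 x 396 x 5 = 784,800 bits
print(original, compressed)
```

With 16 colors or fewer the index would fit in 4 bits, which is where the article's 627,984 figure would apply.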

Thanks!
