KMeans Clustering Implementation in Scala

Introduction

This project implements the KMeans clustering algorithm in Scala, showcasing an efficient approach to partition a set of points in three-dimensional space into clusters. This algorithm is crucial for data analysis, pattern recognition, and image processing. It demonstrates the power of parallel collections in Scala for improving performance.

Features

Parallel Collections: Utilizes Scala's parallel collections for efficient data processing, enabling faster computation by taking advantage of multicore processors.

Custom Point Class: Defines a Point class to represent points in three-dimensional space, providing functionalities to calculate distances between points.

Modular Interface: Implements a trait KMeansInterface to define the core functionalities of the KMeans algorithm, ensuring a clear and modular structure.

Efficient Clustering: Includes methods for classifying points into clusters based on their proximity to the centroids, updating centroids based on classified points, and checking for convergence.

Performance Measurement: Incorporates performance measurement using ScalaMeter, allowing for benchmarking of sequential vs. parallel execution times, showcasing the speedup achieved through parallelism. Utility Functions: Provides utility functions for generating random points and initializing means, facilitating the testing and demonstration of the algorithm.

Installation and Setup

To utilize this KMeans clustering project:

Clone the repository: git clone https://github.com/Emre-Akgul/K-Means-Clustering/

Ensure Scala and sbt are installed on your system. In the project directory, use the following command to run sbt: sbt

Usage

After setting up the project, you can run the main application to execute the KMeans algorithm and measure its performance. Use the following command in sbt: run

For testing and benchmarking, use the following command in sbt: test

Performance Insights

The project includes a performance measurement setup that compares the execution time of the algorithm using sequential and parallel collections. It demonstrates the effectiveness of parallelism in Scala for computational tasks like KMeans clustering.

Contact Information

For any additional questions or collaboration opportunities, feel free to contact me at emreakgulcs@gmail.com or visit my GitHub Profile.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
project		project
src		src
.gitignore		.gitignore
README.md		README.md
build.sbt		build.sbt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

KMeans Clustering Implementation in Scala

Introduction

Features

Installation and Setup

Usage

Performance Insights

Contact Information

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Emre-Akgul/K-Means-Clustering

Folders and files

Latest commit

History

Repository files navigation

KMeans Clustering Implementation in Scala

Introduction

Features

Installation and Setup

Usage

Performance Insights

Contact Information

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages