Implement three clustering algorithms to find clusters of genes that exhibit similar expression profiles: K-means, hierarchical agglomerative clustering with single link (min), and one algorithm of your choice from the following families: density-based, mixture model, or spectral clustering.
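As a starting point, the first two algorithms might be sketched as follows. This is a minimal illustration, not a reference solution: it assumes each gene is a list of numeric expression values and that Euclidean distance is an acceptable similarity measure. The function names (euclidean, kmeans, single_link) and the parameters k, max_iter, and seed are hypothetical choices made for this example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points drawn at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


def single_link(points, k):
    """Naive single-link (min) agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single link: cluster distance is the closest pair of members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]
```

The single-link routine recomputes all pairwise cluster distances on every merge, which keeps the sketch short but is slow for large gene sets; caching a distance matrix is the usual remedy.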
Set up a single-node Hadoop cluster on your machine and implement K-means with MapReduce. Compare it with the non-parallel K-means on the given data sets, and try to improve the running time of the MapReduce implementation.
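One way to think about a single K-means iteration as a MapReduce job is sketched below. It runs entirely in-process and only simulates the map, combine, shuffle, and reduce phases; it does not call any Hadoop API. All names (nearest, mapper, combine, reducer, kmeans_iteration) are hypothetical, and the list of splits merely stands in for HDFS input splits. The combiner step is shown as one possible way to reduce running time, since it shrinks the data sent from mappers to reducers.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((x - y) ** 2
                                 for x, y in zip(point, centroids[c])))


def mapper(point, centroids):
    """Map: emit (cluster id, (coordinates, count of 1)) for one point."""
    yield nearest(point, centroids), (list(point), 1)


def combine(values):
    """Add up (vector, count) pairs for one key into a single pair."""
    total, count = None, 0
    for vec, n in values:
        total = list(vec) if total is None else [a + b for a, b in zip(total, vec)]
        count += n
    return total, count


def reducer(cluster_id, values):
    """Reduce: turn the summed vectors into the new centroid (the mean)."""
    total, count = combine(values)
    return cluster_id, [t / count for t in total]


def kmeans_iteration(splits, centroids):
    """One simulated MapReduce job over a list of input splits."""
    shuffled = defaultdict(list)
    for split in splits:
        local = defaultdict(list)
        for point in split:
            for key, value in mapper(point, centroids):
                local[key].append(value)
        # Combiner: send one partial (sum, count) per key per split.
        for key, values in local.items():
            shuffled[key].append(combine(values))
    # Clusters that received no points keep their previous centroid.
    new_centroids = list(centroids)
    for key, values in shuffled.items():
        _, new_centroids[key] = reducer(key, values)
    return new_centroids
```

A driver would call kmeans_iteration repeatedly, feeding each result back in as the next set of centroids until they stop changing. On a real cluster each call corresponds to one job, so the number of iterations and the per-job startup cost are both worth measuring when comparing against the non-parallel version.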