Hey! Welcome to HackOS 4. This repository contains technical information on the challenge, including guides and a brief dataset visualization. With this repository, we hope to give you a starting point for building your machine learning models and hacking on the Single-Cell Perturbations dataset!
To get started with this repository:
- Fork the repository by visiting https://github.com/aniketsrinivasan/hackos-4 and clicking the "Fork" button in the top-right corner
- Clone your forked repository:
git clone https://github.com/YOUR_USERNAME/hackos-4.git
cd hackos-4
Our main dataset is from the NeurIPS 2023 Competition, and the main training set can be found in dataset/de_train_split.parquet.
The Jupyter Notebook dataset.ipynb demonstrates how to load this dataset, convert and save it as a CSV file, and convert it into common ML formats (such as tensors).
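If you'd rather work from a script than the notebook, here is a minimal sketch of the same load, save, and convert steps. It is not the notebook's actual code. It assumes pandas and NumPy are installed (plus pyarrow or fastparquet for Parquet support), and that the numeric columns are the values you want to model. The helper names (load_table, save_csv, to_feature_matrix) are made up for illustration.

```python
from __future__ import annotations

from pathlib import Path

import numpy as np
import pandas as pd

# Path taken from this README; adjust it if your checkout differs.
TRAIN_PATH = Path("dataset/de_train_split.parquet")


def load_table(path: Path = TRAIN_PATH) -> pd.DataFrame:
    """Read the Parquet file into a DataFrame (needs pyarrow or fastparquet)."""
    return pd.read_parquet(path)


def save_csv(df: pd.DataFrame, out_path: Path) -> None:
    """Write the table as CSV without the pandas row index."""
    df.to_csv(out_path, index=False)


def to_feature_matrix(df: pd.DataFrame) -> tuple[np.ndarray, list[str]]:
    """Return the numeric columns as a float32 array plus their names.

    Assumption: numeric columns hold the values to model, and the
    remaining columns are metadata. Check this against the real file.
    """
    numeric = df.select_dtypes(include="number")
    return numeric.to_numpy(dtype=np.float32), list(numeric.columns)
```

To try it from the repository root, call df = load_table(), then save_csv(df, Path("de_train_split.csv")), then X, names = to_feature_matrix(df). If you use PyTorch, torch.from_numpy(X) turns the array into a tensor.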
All the relevant datasets can be found on the official Kaggle site for the NeurIPS 2023 Competition. Feel free to use any dataset(s) that follow the competition guidelines!
https://www.kaggle.com/competitions/open-problems-single-cell-perturbations/overview
We've compiled a brief guide for the hackathon containing information about past approaches to the challenge that worked (NeurIPS 2023), as well as a collection of ideas you might want to build. By no means do you have to follow this guide, but if you're stuck at any point, we feel this is a good place to look!
https://docs.google.com/document/d/1i9fo4z8QdXA9L17yZ34uPA2vfjbW8P8T5cG5gRZQuAc/edit?usp=sharing
You can find more information about the event in general, as well as links to our Discord server, here: https://lu.ma/xrt0iiqx.
Feel free to reach out to me (@anixus on Discord) or Laurence Liang (@larryl4643 on Discord) if you have any questions!