pelancha/galaxyHackers
galaxyHackers

The project aims to find and analyse galaxy clusters using AI in the microwave and IR (infrared) ranges.

Installing Dependencies (UNIX)

We assume Python 3.10+ is installed. Dependencies can be managed via Poetry or pip.

Set Up Virtual Environment

  1. Navigate to the project directory:
    cd galaxyHackers

Installing Dependencies

Option 1: using Poetry 2.0.0+

  1. Enter the Poetry environment:

    poetry env activate
  2. Install the project dependencies:

    poetry install

Option 2: using pip

If Poetry doesn't suit your needs or fails to fetch all required libraries, you can use pip to install the dependencies instead.

  1. (Optional) Set up and activate the virtual environment:

    python3.10 -m venv venv
    source ./venv/bin/activate
  2. Install the necessary packages using pip:

    pip install torch torchvision timm torch_optimizer tqdm
    pip install numpy pandas matplotlib scikit-learn Pillow
    pip install astropy astroquery pixell dynaconf wget
    pip install comet_ml h5py ultralytics mlflow
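After installation, a quick sanity check can confirm that the core packages resolved. This snippet is illustrative and not part of the repository; the package names mirror the pip commands above (note that scikit-learn imports as sklearn and Pillow as PIL):

```python
# check_deps.py -- verify that the core dependencies can be imported.
import importlib.util

REQUIRED = [
    "torch", "torchvision", "timm", "numpy", "pandas",
    "matplotlib", "sklearn", "PIL", "astropy", "astroquery",
]

def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found by the importer."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All core dependencies found.")
```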

Training Models

Once dependencies are installed, you can start training the models.

Train a model by running the main.py script:

python3 -m galaxy.main --models MODEL_NAME --epochs NUM_EPOCH --data DATASET

Available flags:

  --models        Model(s) to train. Options: Baseline, ResNet18, EfficientNet, DenseNet, SpinalNet_ResNet, SpinalNet_VGG, ViTL16, AlexNet_VGG, CNN_MLP, YOLO12n, YOLO12s, YOLO12m, YOLO12l, YOLO12x. Default: all except YOLO12s and YOLO12l.
  --epochs        Number of training epochs. Default: 5.
  --mm            Momentum (for SGD or RMSprop). Default: 0.9.
  --optimizer     Optimizer to use. Options: Adam, SGD, Rprop, NAdam, RAdam, AdamW, RMSprop, DiffGrad. Default: AdamW.
  --repoptimizer  Wrap the optimizer with RepOptimizer. Optional (see the paper "Re-parameterizing Your Optimizers rather than Architectures").
  --segment       Generate segmentation maps after training. Optional.
  --data          Dataset to use: WISE (W1/W2 bands) or ACT (90, 150, 220 GHz maps from LAMBDA). Default: WISE.
  --seed          Random seed (integer). Default: 1.
  --comet         Enable Comet.ml tracking. Optional.
  --mlflow        Enable MLflow tracking. Optional.

Important

Only one optimizer can be passed on the command line.

If the script fails to download a dataset, it falls back to the Legacy Survey website to fetch the infrared W1 and W2 band data (accessible from Russia only via VPN).

The learning rate is found automatically using an LR range test, so it is not passed on the command line.

For more information, see "A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay".
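The LR range test sweeps the learning rate exponentially from a small to a large value while training, then picks a value just below where the loss diverges. The sweep schedule can be sketched as follows; this is an illustration of the technique from the paper above, not the project's actual implementation, and the function name is hypothetical:

```python
# Exponentially spaced learning rates for an LR range test sweep.
def lr_schedule(lr_min, lr_max, num_steps):
    """Return num_steps learning rates growing geometrically
    from lr_min to lr_max (inclusive)."""
    ratio = (lr_max / lr_min) ** (1 / (num_steps - 1))
    return [lr_min * ratio**i for i in range(num_steps)]

# During the sweep, train one mini-batch at each LR and record the
# loss; a good working LR sits just below the point where the loss
# starts to rise sharply.
```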

Running Multiple Models Simultaneously

You can run several models simultaneously with the same optimizer and learning rate.


Example Commands

Single model training:

python3 -m galaxy.main --models YOLO12n --epochs 5 --data ACT --segment

Multiple models training:

python3 -m galaxy.main --models AlexNet_VGG ResNet18 --epochs 20 --data WISE --segment

Default model set training (omit the --models flag):

python3 -m galaxy.main --epochs 20 --data WISE --segment

Logging

The script supports Comet.ml and MLflow for experiment tracking.

Comet.ml

  1. Rename .example.secrets.toml to .secrets.toml
  2. In .secrets.toml, set your Comet API key in the variable COMET_API_KEY and your workspace name in COMET_WORKSPACE
  3. Pass the --comet flag

Important

The script will not work without renaming .example.secrets.toml to .secrets.toml; otherwise an empty key is passed.
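For reference, a minimal .secrets.toml might look like the fragment below. The values are placeholders; only the variable names COMET_API_KEY and COMET_WORKSPACE come from the steps above, so check .example.secrets.toml for the exact layout:

```toml
# .secrets.toml -- placeholder values, never commit real keys
COMET_API_KEY = "your-comet-api-key"
COMET_WORKSPACE = "your-workspace-name"
```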

MLflow

  1. Enable --mlflow flag

Full documentation can be viewed here.

About

Algorithm to classify galaxy clusters using various architectures: CNN, MLP and Transformer - and to compare their efficiencies on the infrared (IR) data of the WISE survey (W1, W2 bands) and the microwave data of ACT+Planck (f90, f150, f220 frequencies).
