Standalone point cloud semantic segmentation with PointCNN (PyTorch).
This repository provides an end-to-end pipeline for classifying LAS/LAZ point clouds:
1) convert LAS to H5 training blocks, 2) train a PointCNN model, and 3) run inference on new point clouds. No proprietary dependencies are required.
- Full pipeline: LAS → H5 → training → inference
- PointCNN segmentation model with hierarchical X-Conv layers
- Clean, configurable CLIs for preprocessing, training, and inference
- Works with standard LAS classes and optional features (intensity, num_returns)
- JSON-config support for reproducible runs
```
BathyNet/
├── 01_prepare_data.py              # Convert LAS to H5 blocks
├── 02_train_model.py               # Train PointCNN on H5 data
├── 03_run_inference.py             # Inference on LAS/LAZ with trained model
├── config_example.json             # Example configuration
├── pointcnn_default_config.json    # Default hyperparameters (reference)
├── requirements.txt                # Python dependencies
├── models/
│   ├── pointcnn_core.py            # PointCNN network + dataset + inference utils
│   ├── pointcnn_segmentation_trainer.py
│   └── trainer.py                  # Training entry used by 02_train_model.py
└── utilities/
    └── data_converter.py           # LAS → H5 converter
```
- Python 3.8+ recommended
- PyTorch 1.8+ (with CUDA if using GPU)
- See `requirements.txt` for the full list:
  - torch, torchvision, torchaudio
  - numpy, h5py, laspy
  - scikit-learn, scipy, tqdm, matplotlib, seaborn
Install dependencies:
```bash
pip install -r requirements.txt
```

Script: 01_prepare_data.py
This script converts raw LAS files into H5 blocks suitable for training. It creates `train/` and `val/` subfolders under the output path and writes a `meta.json` file describing classes and settings.
Basic usage:
```bash
python 01_prepare_data.py --input_dir ./data/las_files --output_dir ./data/h5_files
```

Advanced options (see `--help` for all):
```bash
python 01_prepare_data.py \
    --input_dir ./data/las_files \
    --output_dir ./data/h5_files \
    --block_size 50 \
    --max_points 8192 \
    --train_split 0.8 \
    --intensity_range 0 5000 \
    --returns_range 1 5
```

Config-file driven (reproducible):
```bash
python 01_prepare_data.py --config preprocessing_config.json
```

Example `preprocessing_config.json`:
```json
{
  "input_dir": "./data/las_files",
  "output_dir": "./data/h5_files",
  "block_size": 50.0,
  "max_points": 8192,
  "train_split": 0.8,
  "augment": false,
  "workers": 1,
  "intensity_range": [0, 5000],
  "returns_range": [1, 5]
}
```
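The `intensity_range` and `returns_range` entries describe the expected raw value ranges of the optional features. One plausible interpretation, shown below purely as an assumption (check `utilities/data_converter.py` for the actual behavior), is clipping and scaling each feature into [0, 1]:

```python
import numpy as np

def normalize_feature(values, vmin, vmax):
    """Clip a raw LAS feature (e.g. intensity) to [vmin, vmax] and scale to [0, 1].
    This mirrors one plausible reading of --intensity_range / --returns_range;
    the real converter may handle these ranges differently."""
    values = np.clip(values.astype(np.float32), vmin, vmax)
    return (values - vmin) / max(vmax - vmin, 1e-6)

intensity = np.array([0, 1200, 4800, 9000])   # raw LAS intensities
print(normalize_feature(intensity, 0, 5000))  # approximately [0.0, 0.24, 0.96, 1.0]
```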
Notes
- The converter reads XYZ, labels (Classification), and optional features (intensity, num_returns) from LAS.
- Coordinates are normalized per block; XZY layout is used internally and handled consistently in the model.
- `meta.json` records the class IDs discovered in your dataset.
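To sanity-check the converted data, you can open one of the generated blocks with `h5py` and read `meta.json`. The file name and dataset keys below are hypothetical; inspect your own output to see the actual layout:

```python
import json
import h5py

# Inspect one converted block; the file name and dataset keys here are
# assumptions, not guaranteed to match utilities/data_converter.py output.
with h5py.File("./data/h5_files/train/block_0000.h5", "r") as f:
    for name, dset in f.items():
        print(name, dset.shape, dset.dtype)

# meta.json records the class IDs discovered during conversion.
with open("./data/h5_files/meta.json") as f:
    meta = json.load(f)
print(meta)
```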
Script: 02_train_model.py
Train a PointCNN segmentation model using the preprocessed H5 dataset.
Basic usage:
```bash
python 02_train_model.py --data_dir ./data/h5_files --output_dir ./models/output
```

Common options:
```bash
python 02_train_model.py \
    --data_dir ./data/h5_files \
    --output_dir ./models/output \
    --epochs 100 \
    --batch_size 8 \
    --learning_rate 0.001 \
    --num_classes 4 \
    --num_points 8192
```

Resume training from a checkpoint:
```bash
python 02_train_model.py --data_dir ./data/h5_files --output_dir ./models/output \
    --resume ./models/output/checkpoint_epoch_50.pth
```

You can also provide a JSON config:
```bash
python 02_train_model.py --config training_config.json
```

Example `training_config.json`:
```json
{
  "data_dir": "./data/h5_files",
  "output_dir": "./models/output",
  "epochs": 100,
  "batch_size": 8,
  "learning_rate": 0.001,
  "weight_decay": 0.0001,
  "num_classes": 4,
  "num_points": 8192,
  "feature_dim": 5,
  "validate_every": 5,
  "save_every": 10,
  "early_stopping": 20,
  "workers": 4,
  "device": "auto"
}
```

Outputs
- Model checkpoints and the final model under `--output_dir`
- A copy of the training configuration for reproducibility
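If a resume fails or you want to confirm what a checkpoint contains before reusing it, you can inspect the file directly. The structure assumed below (a dict bundling the model weights with metadata such as the epoch) is a guess about how the trainer saves checkpoints, not a documented format:

```python
import torch

# Load a checkpoint on CPU just to inspect its contents.
ckpt = torch.load("./models/output/checkpoint_epoch_50.pth", map_location="cpu")

if isinstance(ckpt, dict):
    # Many trainers store entries like "epoch" or "model_state_dict";
    # the exact key names here are assumptions.
    for key in ckpt:
        print(key)
else:
    # Some scripts save the bare state_dict or a pickled module instead.
    print(type(ckpt))
```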
Script: 03_run_inference.py
Use a trained model to classify new point clouds. The script expects the dataset `meta.json` (for class mapping) from your training data directory.
Single-file inference:
```bash
python 03_run_inference.py \
    --model_path ./models/output/pointcnn_best_model.pth \
    --data_path ./data/h5_files \
    --las_file ./data/test/sample.las \
    --output_dir ./results
```

Batch inference:
```bash
python 03_run_inference.py \
    --model_path ./models/output/pointcnn_best_model.pth \
    --data_path ./data/h5_files \
    --las_dir ./data/test \
    --output_dir ./results/batch
```

Useful options:
- `--block_size` (default 50.0), `--max_points` (default 8192)
- `--save_h5` to store intermediate H5 outputs
- `--detailed_metrics` to compute metrics if ground-truth labels are present
- `--selective_classify <ids...>`: only reclassify specific input classes
- `--preserve_classes <ids...>`: never change these classes
- `--remap_classes old:new old2:new2`: remap classes before writing
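As a rough illustration of how the class-handling options interact, the sketch below mimics their semantics with NumPy. It is a conceptual example under an assumed order of operations, not the actual logic in `03_run_inference.py`:

```python
import numpy as np

def apply_class_options(original, predicted,
                        selective=None, preserve=None, remap=None):
    """Conceptual sketch of --selective_classify / --preserve_classes /
    --remap_classes semantics. Not the script's actual implementation."""
    out = predicted.copy()

    # --selective_classify: only points whose ORIGINAL class is listed
    # receive a new label; everything else keeps its original class.
    if selective:
        keep_original = ~np.isin(original, selective)
        out[keep_original] = original[keep_original]

    # --preserve_classes: never change points with these original classes.
    if preserve:
        locked = np.isin(original, preserve)
        out[locked] = original[locked]

    # --remap_classes old:new ...: remap labels before writing the LAS.
    if remap:
        for old, new in remap.items():
            out[out == old] = new

    return out

# Example: only reclassify classes 2 and 7, keep class 9 untouched,
# and remap predicted class 3 to 5 before writing.
original = np.array([2, 2, 7, 9, 5])
predicted = np.array([4, 3, 3, 4, 4])
print(apply_class_options(original, predicted,
                          selective=[2, 7], preserve=[9], remap={3: 5}))
# -> [4 5 5 9 5]
```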
- Network: PointCNN encoder–decoder with X-Conv layers and a per-point classification head.
- Dataset: H5 files contain normalized blocks; features include XYZ (XZY ordering internally), intensity, and num_returns when available.
- Classes: Derived from LAS Classification values present in your data (see `meta.json`).
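For intuition, a single X-Conv step can be sketched as follows: lift the local neighbor coordinates into point-wise features, learn a K×K transformation from the local geometry, apply it to the neighbor features, then aggregate. This is a simplified, self-contained illustration; the real layers live in `models/pointcnn_core.py` and differ in detail:

```python
import torch
import torch.nn as nn

class XConvSketch(nn.Module):
    """Conceptual single X-Conv step (simplified; not the repo's implementation).
    Input: neighbor coords (B, K, 3) relative to each representative point and
    neighbor features (B, K, C_in). Output: (B, C_out) per representative point."""

    def __init__(self, c_in, c_out, k, c_delta=16):
        super().__init__()
        self.k = k
        # Lift local coordinates into point-wise features.
        self.mlp_delta = nn.Sequential(nn.Linear(3, c_delta), nn.ReLU(),
                                       nn.Linear(c_delta, c_delta), nn.ReLU())
        # Learn a K x K transformation from the local geometry.
        self.mlp_x = nn.Sequential(nn.Linear(3 * k, k * k), nn.ReLU(),
                                   nn.Linear(k * k, k * k))
        # Final "convolution": aggregate the K transformed neighbors.
        self.conv = nn.Linear(k * (c_delta + c_in), c_out)

    def forward(self, local_xyz, feats):
        b, k, _ = local_xyz.shape
        f_delta = self.mlp_delta(local_xyz)              # (B, K, c_delta)
        f_star = torch.cat([f_delta, feats], dim=-1)     # (B, K, c_delta + C_in)
        x_mat = self.mlp_x(local_xyz.reshape(b, -1))     # (B, K*K)
        x_mat = x_mat.view(b, k, k)                      # (B, K, K)
        f_x = torch.bmm(x_mat, f_star)                   # weight/permute neighbors
        return self.conv(f_x.reshape(b, -1))             # (B, c_out)

# Toy usage: 4 representative points, 8 neighbors each, 2 extra features.
xconv = XConvSketch(c_in=2, c_out=32, k=8)
out = xconv(torch.randn(4, 8, 3), torch.randn(4, 8, 2))
print(out.shape)  # torch.Size([4, 32])
```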
- “No H5 files found” during training: Run the preprocessing step and verify that `train/` and `val/` contain `.h5` files and that `meta.json` exists under `--data_dir`.
- “Model file not found” during inference: Make sure you pass the correct `--model_path` to a `.pth` file saved by training.
- Large memory usage: Reduce `--num_points` (training) or `--max_points` (inference) and/or decrease `--batch_size`.