⚠️ Development Phase Notice: This library is in active development. APIs may change between releases.
bae-kai is a fork of bae with full Windows support, pre-built CUDA wheels, and cuDSS bundling. It provides PyTorch-based 2nd-order optimization for Bundle Adjustment (BA) and Pose Graph Optimization (PGO) using custom CUDA kernels and sparse block matrix operations.
- Sparse Block Matrix Operations: Optimized implementations of sparse matrix operations for large-scale optimization
- CUDA Acceleration: Custom CUDA kernels for high-performance sparse linear algebra
- Bundle Adjustment: Efficient implementation for camera pose and 3D structure optimization
- Pose Graph Optimization: Tools for optimizing robot trajectories using pose graph representations
- PyTorch Integration: Seamlessly integrates with PyTorch's automatic differentiation framework
- Levenberg-Marquardt Optimizer: Custom implementation of the LM algorithm for non-linear least squares problems
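For readers new to the Levenberg-Marquardt method named in the feature list, the following is a minimal textbook sketch on a two-parameter problem in pure Python. It is not bae-kai's implementation, which the description above says runs on sparse block matrices with custom CUDA kernels. The function names, the Rosenbrock test residuals, and the multiply-or-divide-by-10 damping update are all assumptions chosen for illustration.

```python
def rosenbrock_residuals(x, y):
    # Residual form of the Rosenbrock function; zero at (1, 1).
    return [10.0 * (y - x * x), 1.0 - x]


def rosenbrock_jacobian(x, y):
    # One row per residual: [d r_i / dx, d r_i / dy].
    return [[-20.0 * x, 10.0], [-1.0, 0.0]]


def levenberg_marquardt(residual_fn, jacobian_fn, params, iters=200, lam=1e-3):
    """Illustrative 2-parameter LM loop (hypothetical helper, not the bae API).

    Assumes every parameter influences at least one residual, so the
    damped 2x2 system below is never singular.
    """
    x, y = params
    r = residual_fn(x, y)
    loss = sum(ri * ri for ri in r)
    for _ in range(iters):
        if loss < 1e-24:
            break
        J = jacobian_fn(x, y)
        # Gauss-Newton approximation of the Hessian, H = J^T J, and gradient g = J^T r.
        h11 = sum(row[0] * row[0] for row in J)
        h12 = sum(row[0] * row[1] for row in J)
        h22 = sum(row[1] * row[1] for row in J)
        g1 = sum(row[0] * ri for row, ri in zip(J, r))
        g2 = sum(row[1] * ri for row, ri in zip(J, r))
        # Damped system (H + lam * diag(H)) * delta = -g, solved in closed form.
        a11 = h11 * (1.0 + lam)
        a22 = h22 * (1.0 + lam)
        det = a11 * a22 - h12 * h12
        dx = (-a22 * g1 + h12 * g2) / det
        dy = (h12 * g1 - a11 * g2) / det
        r_new = residual_fn(x + dx, y + dy)
        loss_new = sum(ri * ri for ri in r_new)
        if loss_new < loss:
            # Accept the step and trust the quadratic model more.
            x, y, r, loss = x + dx, y + dy, r_new, loss_new
            lam = max(lam / 10.0, 1e-12)
        else:
            # Reject the step and increase damping.
            lam *= 10.0
    return x, y, loss


x_opt, y_opt, final_loss = levenberg_marquardt(
    rosenbrock_residuals, rosenbrock_jacobian, (-1.2, 1.0)
)
print(x_opt, y_opt, final_loss)
```

Large damping makes the step behave like a short gradient-descent step, while small damping approaches a Gauss-Newton step, which is why the update shrinks `lam` after progress and grows it after a rejected step.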
- Python 3.12+
- PyTorch 2.0+ with CUDA (CPU-only PyTorch will not work)
- NVIDIA GPU with CUDA support
- CUDA Toolkit installed (for building from source)
You must install a CUDA-enabled PyTorch before installing bae-kai. The default PyTorch wheel on PyPI is CPU-only on some platforms (notably Windows), so specify the CUDA index URL explicitly:
# CUDA 12.8 (recommended for RTX 30/40/50 series)
pip install torch --index-url https://download.pytorch.org/whl/cu128
# Or CUDA 12.4
pip install torch --index-url https://download.pytorch.org/whl/cu124
Verify your PyTorch has CUDA:
python -c "import torch; print(torch.version.cuda)"
# Should print something like "12.8", NOT "None"
bae-kai is distributed as a source package on PyPI. It compiles CUDA extensions during installation, which requires the CUDA Toolkit to be installed on your system.
pip install bae-kai --no-build-isolation
--no-build-isolation is required so the build can use your installed CUDA-enabled PyTorch.
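Because a source build depends on several prerequisites at once (Python 3.12+, a CUDA-enabled PyTorch, and a CUDA Toolkit), a small pre-flight script can report them together before the compile starts. This is only a sketch: `check_prerequisites` is a hypothetical helper, not part of bae-kai, and finding `nvcc` on `PATH` is used here as a rough stand-in for "the toolkit is installed".

```python
import importlib.util
import os
import shutil
import sys


def check_prerequisites() -> dict:
    """Collect the facts a source build relies on, without raising."""
    report = {
        "python_ok": sys.version_info >= (3, 12),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Stays None for a CPU-only build or when torch is missing.
        "torch_cuda": None,
        "cuda_home": os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH"),
        "nvcc": shutil.which("nvcc"),
    }
    if report["torch_installed"]:
        try:
            import torch

            report["torch_cuda"] = torch.version.cuda
        except Exception:
            # A broken install is reported the same way as a missing CUDA build.
            report["torch_cuda"] = None
    return report


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```

If `torch_cuda` is `None`, or both `cuda_home` and `nvcc` are empty, the build is likely to fail, so it is worth fixing those first.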
Windows: The CUDA Toolkit is usually installed at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y and is detected automatically.
Linux: Set CUDA_HOME if the toolkit is not at /usr/local/cuda:
CUDA_HOME=/usr/local/cuda-12.8 pip install bae-kai --no-build-isolation
Pre-built wheels with bundled CUDA libraries are available as GitHub Actions artifacts (no CUDA Toolkit needed to install):
| Platform | CUDA | Architectures |
|---|---|---|
| Linux | 12.4, 12.8, 13.0 | sm_70 - sm_120 |
| Windows | 12.4, 12.6, 12.8 | sm_70 - sm_120 |
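To check whether a GPU falls inside the architecture range in the table, its compute capability can be compared against the listed bounds. The sketch below assumes the range `sm_70 - sm_120` is inclusive and contiguous, which the table does not state explicitly; `sm_name` and `is_arch_covered` are hypothetical helpers.

```python
def sm_name(capability):
    """Format a (major, minor) compute capability as an sm_XY string."""
    major, minor = capability
    return f"sm_{major}{minor}"


def is_arch_covered(capability, lowest=(7, 0), highest=(12, 0)):
    """Return True if the capability lies within [lowest, highest].

    Assumption: the wheels cover every architecture between the bounds.
    """
    return tuple(lowest) <= tuple(capability) <= tuple(highest)


# With a CUDA-enabled PyTorch installed, the capability of the current GPU
# is available as a (major, minor) tuple:
#     import torch
#     capability = torch.cuda.get_device_capability()
capability = (8, 6)  # example value, e.g. an RTX 30 series card
print(sm_name(capability), is_arch_covered(capability))
```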
To install a pre-built wheel, download the .whl file for your platform and CUDA version from the latest successful workflow run, then:
pip install bae-0.1.2+cu12.8-cp312-cp312-win_amd64.whl
git clone https://github.com/OpsiClear/bae-kai.git
cd bae-kai
uv sync
- CUDA_HOME / CUDA_PATH: Path to the CUDA Toolkit (auto-detected on Windows)
- BAE_SKIP_EXTENSIONS=1: Skip CUDA extensions entirely (for sdist builds only)
- USE_CUDSS: "1" (default) to enable cuDSS support, "0" to disable
- CUDSS_DIR: Path to the cuDSS installation if it is not in a standard location
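When a build is driven from a script, the environment variables above can be assembled in one place and passed to the install command. The helper below is a hypothetical sketch: it only sets the variables named in this README and makes no assumption about how the build script interprets them beyond what is stated here.

```python
import os


def build_env(cuda_home=None, use_cudss=True, cudss_dir=None, skip_extensions=False):
    """Return a copy of the current environment with build variables applied."""
    env = dict(os.environ)
    if cuda_home:
        env["CUDA_HOME"] = str(cuda_home)
    env["USE_CUDSS"] = "1" if use_cudss else "0"
    if cudss_dir:
        env["CUDSS_DIR"] = str(cudss_dir)
    if skip_extensions:
        env["BAE_SKIP_EXTENSIONS"] = "1"
    return env


env = build_env(cuda_home="/usr/local/cuda-12.8", use_cudss=False)
print(env["CUDA_HOME"], env["USE_CUDSS"])

# The environment can then be handed to the installer, for example:
#     import subprocess, sys
#     subprocess.run(
#         [sys.executable, "-m", "pip", "install", "bae-kai", "--no-build-isolation"],
#         env=env, check=True,
#     )
```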
from bae.optim import LM
# model: a torch.nn.Module whose forward() returns residuals
optimizer = LM(model, reject=30)
for idx in range(20):
    loss = optimizer.step(input)
    print(f'Iteration {idx}, loss: {loss.item()}')
The optimizer auto-selects the solver, damping strategy, and method. For explicit control:
# String-based
optimizer = LM(model, solver="pcg", strategy="trustregion", method="schur")
# Object-based
from bae.utils import PCG, TrustRegion
optimizer = LM(model, solver=PCG(tol=1e-4, maxiter=250), strategy=TrustRegion())
See ba_example.py for a complete Bundle Adjustment example using the BAL dataset.
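For intuition about what `tol` and `maxiter` usually control in an iterative solver such as PCG, here is a generic preconditioned conjugate gradient sketch in pure Python with a Jacobi (diagonal) preconditioner. It is a textbook version on a dense list-of-lists matrix, not bae-kai's `PCG` class; the exact stopping rule and preconditioner used by the library are not documented here and may differ.

```python
def pcg(A, b, tol=1e-4, maxiter=250):
    """Solve A x = b for symmetric positive definite A (illustrative only).

    Stops when ||r|| <= tol * ||b|| or after maxiter iterations.
    Returns the solution estimate and the number of iterations used.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)  # residual b - A x for the initial guess x = 0
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    for k in range(1, maxiter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_next = dot(r, z)
        beta = rz_next / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_next
    return x, maxiter


solution, iterations = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution, iterations)
```

A looser `tol` or smaller `maxiter` trades accuracy of each linear solve for speed, which is often acceptable inside an outer loop that re-linearizes the problem at every step.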
| Module | Exports | Description |
|---|---|---|
| bae.optim | LM, SchurLM | Levenberg-Marquardt optimizer with auto-selection |
| bae.autograd | TrackingTensor, map_transform, jacobian | Sparse Jacobian via operation tracing |
| bae.utils | PCG, PCG_, CuDSS, SciPySpSolver | Linear solvers |
| bae.utils | TrustRegion, Adaptive | Damping strategies |
bae-kai can be used as a Bundle Adjustment backend in VGGT (Visual Geometry Grounded Transformer) to refine camera poses, intrinsics, and 3D points before exporting a COLMAP reconstruction:
python demo_colmap.py --scene_dir /path/to/scene --use_ba --implementation bae
If you use bae-kai in your research, please cite the original paper:
@article{zhan2025bundle,
title = {Bundle Adjustment in the Eager Mode},
author = {Zhan, Zitong and Xu, Huan and Fang, Zihang and Wei, Xinpeng and Hu, Yaoyu and Wang, Chen},
journal = {arXiv preprint arXiv:2409.12190},
year = {2025},
url = {https://arxiv.org/abs/2409.12190}
}
This project is a fork of bae by Zitong Zhan et al.
The implementation draws inspiration from:
- bae (original) - Bundle Adjustment in the Eager Mode
- PyPose for SE(3) pose representations
- GTSAM for reprojection jacobian concepts
Original code by Zitong Zhan et al. is licensed under Apache 2.0. Additions in this fork are licensed under AGPL 3.0.