This repository provides tools to discover, consolidate, and visualize metadata from the open data STAC catalogs of major commercial SAR providers: Capella Space, ICEYE, and Umbra.
The primary goal is to create a harmonized GeoDataFrame for each provider, which is then saved in GeoParquet format. The entire process is automated to run weekly via GitHub Actions, ensuring the datasets remain up-to-date.
NOTE: As of October 15, 2025, Synspective appears to provide open data only upon request: https://synspective.com/gallery/
Inspired by @scottyhq's stac2geojson
Optimized for browser-based visualization with stac-map:
- Datetime fields parsed to `pd.Timestamp` for temporal sliders
- Bbox stored as a nested dict for spatial queries
- Assets compacted to essential fields (href, type, roles)
- GeoJSON geometry serialized for JavaScript compatibility
- Links resolved to absolute URLs
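The VIZ transformations above can be sketched with plain pandas. The item below is fabricated for illustration, and the column names are assumptions rather than the repo's actual schema:

```python
import json
import pandas as pd

# A single fabricated STAC-like item (illustrative values, not real catalog data)
items = [
    {
        "id": "item-001",
        "datetime": "2024-06-01T12:30:00Z",
        "assets": {
            "thumbnail": {"href": "https://example.com/t.png", "type": "image/png",
                          "roles": ["thumbnail"], "title": "Thumbnail", "size": 1234},
        },
        "geometry": {"type": "Point", "coordinates": [10.0, 50.0]},
        "bbox": [10.0, 50.0, 10.0, 50.0],
    },
]

df = pd.DataFrame(items)

# Datetime parsed to pd.Timestamp so a temporal slider can filter numerically
df["datetime"] = pd.to_datetime(df["datetime"])

# Assets compacted to the essential fields only
def compact_assets(assets):
    keep = ("href", "type", "roles")
    return {name: {k: v for k, v in asset.items() if k in keep}
            for name, asset in assets.items()}

df["assets"] = df["assets"].apply(compact_assets)

# Bbox stored as a nested dict for spatial queries
df["bbox"] = df["bbox"].apply(
    lambda b: {"xmin": b[0], "ymin": b[1], "xmax": b[2], "ymax": b[3]})

# GeoJSON geometry serialized to a string for JavaScript consumers
df["geometry_json"] = df["geometry"].apply(json.dumps)
```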
Note that Capella already has a great interactive web map for its open data https://felt.com/map/Capella-Space-Open-Data-bB24xsH3SuiUlpMdDbVRaA?loc=0,-20.5,1.83z and users should refer to this while it's still maintained.
Development Seed provides a great open-source tool called stac-map for visualizing these derived GeoParquets -- all you need is the raw GitHub endpoint to the GeoParquet file of interest, which should match a structure similar to `https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/<format>/<provider>/<file>.parquet`.
Below are hyperlinks to access the respective parquets on this repo:
- ICEYE: All ICEYE open data samples
- Umbra: All Umbra open data samples
- Capella: CPHD | CSI | GEC | GEO | SICD | SIDD | SLC
Optimized for programmatic analysis:
- Asset hrefs expanded as individual columns (e.g., `asset_thumbnail`, `asset_overview`)
- Full STAC properties preserved
- Minimal transformations (e.g. serializing cols with mixed dtypes) for easier filtering/analysis
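The mixed-dtype serialization mentioned above can be sketched with pandas. The rows and the `sar:polarizations` values below are fabricated to show the idea; the repo's actual columns and serialization logic may differ:

```python
import json
import pandas as pd

# Hypothetical ARD-style rows: one column holds mixed container types
# (a list in one row, a dict in another), which Parquet cannot store
# without a uniform type
df = pd.DataFrame({
    "id": ["a", "b"],
    "sar:polarizations": [["HH"], {"tx": "H", "rx": "H"}],
})

def serialize_mixed(col):
    """JSON-encode container values so the column becomes uniformly string-typed."""
    return col.apply(lambda v: json.dumps(v) if isinstance(v, (list, dict)) else v)

df["sar:polarizations"] = serialize_mixed(df["sar:polarizations"])
```

After this, the column round-trips through Parquet cleanly, and a consumer can `json.loads` the values back when needed.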
You can load any of the published GeoParquet files directly into Python using GeoPandas without downloading them first. Simply pass the raw GitHub URL to `gpd.read_file()`:
```python
import geopandas as gpd

# Example: Load Capella CPHD ARD parquet directly from GitHub
url = "https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/ard/capella/capella_CPHD.parquet"
gdf = gpd.read_file(url)
```

This works for any of the Parquet files; just replace the URL with the desired dataset.
NOTE: It is important to use the 'ARD' Parquet files for Python streaming and local GIS software, as they are serialized specifically for programmatic use, as opposed to the 'VIZ' files.
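Once loaded, the ARD tables filter like any pandas/GeoPandas frame. The sketch below uses a fabricated stand-in DataFrame so it runs without network access; the `datetime` column name is an assumption about the schema:

```python
import pandas as pd

# Fabricated rows standing in for an ARD GeoDataFrame; the same pandas
# filtering applies to the real `gdf` loaded from GitHub
gdf = pd.DataFrame({
    "id": ["s1", "s2", "s3"],
    "datetime": pd.to_datetime(
        ["2024-01-15T00:00:00Z", "2024-06-01T00:00:00Z", "2025-02-10T00:00:00Z"]),
})

# Keep only acquisitions from calendar year 2024
start = pd.Timestamp("2024-01-01", tz="UTC")
end = pd.Timestamp("2025-01-01", tz="UTC")
recent = gdf[(gdf["datetime"] >= start) & (gdf["datetime"] < end)]
```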
You can download the latest generated Parquet files directly using command-line tools like curl (for Linux/macOS) or Invoke-WebRequest (for Windows PowerShell).

```sh
# Download 'ARD' format (for analysis)
curl -L -o iceye_ard.parquet "https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/ard/iceye/iceye.parquet"
curl -L -o umbra_ard.parquet "https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/ard/umbra/umbra.parquet"
curl -L -o capella_GEC_ard.parquet "https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/ard/capella/capella_GEC.parquet"
```

```powershell
# Download 'ARD' format (for analysis)
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/ard/iceye/iceye.parquet" -OutFile "iceye_ard.parquet"
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/ard/umbra/umbra.parquet" -OutFile "umbra_ard.parquet"
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/Jack-Hayes/commerical-sar-stac/main/parquets/ard/capella/capella_GEC.parquet" -OutFile "capella_GEC_ard.parquet"
```

This repository contains open-source code for accessing and processing sample datasets provided by commercial companies including Capella Space, Umbra, and ICEYE.
All datasets and APIs are governed by their respective providers' terms of use. This repository does not redistribute or claim ownership of any proprietary or commercial data.
Users are responsible for ensuring their use of data and APIs complies with the terms set by:
- Capella Space: https://www.capellaspace.com/legal/
- Umbra: https://umbra.space/legal/
- ICEYE: https://www.iceye.com/sar-data/api
The ingestion process follows cloud-optimized best practices:
- Discovery: For nested catalogs (Umbra, Capella), the script uses `s3fs` or recursive `aiohttp` calls to efficiently discover all STAC Item URLs. For flat catalogs (ICEYE), it directly parses the collection.
- Fetching: All STAC Item JSON files are fetched concurrently using `aiohttp` for high performance.
- Processing: The raw JSONs are parsed into a uniform, flattened structure in memory using Pandas. This includes extracting asset URLs and ensuring correct geometry representation with Shapely.
- Creation: A GeoDataFrame is created from the processed records.
- Storage: The final, cleaned GeoDataFrame for each provider (or product type) is saved as a GeoParquet file.
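The fetch-and-flatten steps above can be sketched with stdlib `asyncio` alone. The canned catalog and function names below are illustrative stand-ins (the real pipeline uses `aiohttp` for actual network I/O and different internal names):

```python
import asyncio
import json

# Canned STAC-like Items standing in for remote JSON files
FAKE_CATALOG = {
    "https://example.com/items/1.json": {
        "id": "item-1", "geometry": {"type": "Point", "coordinates": [0, 0]},
        "properties": {"datetime": "2024-01-01T00:00:00Z"},
        "assets": {"data": {"href": "https://example.com/data/1.tif"}},
    },
    "https://example.com/items/2.json": {
        "id": "item-2", "geometry": {"type": "Point", "coordinates": [1, 1]},
        "properties": {"datetime": "2024-02-01T00:00:00Z"},
        "assets": {"data": {"href": "https://example.com/data/2.tif"}},
    },
}

async def fetch_item(url):
    """Stand-in for an aiohttp GET; yields control like a real network call."""
    await asyncio.sleep(0)
    return json.dumps(FAKE_CATALOG[url])

def flatten(item):
    """Flatten one STAC Item into a uniform record (properties + asset hrefs)."""
    rec = {"id": item["id"], **item["properties"]}
    for name, asset in item["assets"].items():
        rec[f"asset_{name}"] = asset["href"]
    return rec

async def run(urls):
    # gather() fetches all items concurrently and preserves input order
    raw = await asyncio.gather(*(fetch_item(u) for u in urls))
    return [flatten(json.loads(r)) for r in raw]

records = asyncio.run(run(list(FAKE_CATALOG)))
```

From records like these, a GeoDataFrame is one `pd.DataFrame(records)` plus a Shapely geometry column away.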
- `.github/workflows/`: Contains GitHub Actions for CI (testing, linting) and weekly data updates.
- `parquets/`: Stores the output GeoParquet files, organized by format (`/viz` or `/ard`) and provider.
- `scripts/`: The main Python source code for data ingestion and processing.
- `tests/`: `pytest` tests to validate endpoints and data structures.
- `environment.yml`: The Conda environment file to ensure reproducibility.
1. Clone the repository:

   ```sh
   git clone https://github.com/Jack-Hayes/commerical-sar-stac.git
   cd commerical-sar-stac
   ```

2. Create and activate the Conda environment:

   ```sh
   mamba env create -f environment.yml
   mamba activate commercial-sar
   ```

3. Run the script. You can process specific providers by passing their names as command-line arguments:

   ```sh
   # Process all providers in both formats (default)
   python -m scripts.main capella iceye umbra
   # Process only VIZ format
   python -m scripts.main capella iceye umbra --format viz
   # Process only ARD format
   python -m scripts.main capella iceye umbra --format ard
   # Process specific providers
   python -m scripts.main capella iceye
   ```
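For reference, a CLI matching the invocations above could be wired up with argparse along these lines. This is a sketch; the actual argument handling in `scripts.main` may differ:

```python
import argparse

# Minimal sketch of the CLI surface shown above (names are assumptions)
parser = argparse.ArgumentParser(prog="scripts.main")
parser.add_argument("providers", nargs="+",
                    choices=["capella", "iceye", "umbra"],
                    help="providers to process")
parser.add_argument("--format", choices=["viz", "ard", "both"], default="both",
                    help="output format(s) to generate")

# Example: parse the 'VIZ only' invocation
args = parser.parse_args(["capella", "iceye", "--format", "viz"])
```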
Warning: This tool is under active development and currently supports Capella datasets only.
The tools/get_kmz.py CLI tool exports a KMZ that visualizes acquisition geometry for a single STAC item taken from the local repository Parquet files. Given a provider, item id, and product dtype (SLC, GEO, CPHD, etc.), the tool points to the matching local Parquet (/ard) file, reads extended metadata for the item, and emits a KMZ with:
- a satellite track built from state vectors,
- look vectors (rays) drawn from every Nth state vector to the image, using satellite attitude quaternions,
- a thumbnail overlaid (draped) on the ground so the acquisition footprint and look geometry can be inspected in Google Earth or Google Earth Engine, and
- a popup table showing basic STAC fields plus waveform / sampling / pointing metadata (e.g., sampling frequency, PRF, pulse bandwidth, pulse duration, beamwidths, range/azimuth/ground resolutions, NESZ, and other available image geometry fields).
- Currently, the tool determines the input Parquet by mapping the supplied inputs to a local ARD file path: `parquets/ard/capella/capella_<DTYPE>.parquet` (DTYPE is the `--dtype` argument).
- The CLI expects the ARD file to contain a row with `id` matching the supplied `--id`. The script reads the row, resolves `asset_metadata` (STAC item JSON) and `asset_thumbnail` (thumbnail) from the row, fetches metadata, and generates the KMZ.
- The KMZ archive contains `doc.kml` and, if available, `preview.png`.
- You must have the Parquet files from this repo downloaded locally.
- Dependencies to build the KMZ (this will hopefully be fixed soon with an env handler; note that the version pins aren't absolute):
  - `pyproj==3.7.2` (for ECEF->LLA transforms)
  - `simplekml==1.3.2` (for KML/KMZ creation)
  - `scipy==1.16.3`
- If these optional packages are not installed, the CLI will exit with an actionable message explaining how to install them.
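The satellite track is built by converting ECEF state vectors to geodetic coordinates; the tool delegates this to `pyproj`, but a stdlib-only sketch of the underlying WGS84 transform (iterative, adequate away from the poles) looks like:

```python
import math

# WGS84 constants
A = 6378137.0                 # semi-major axis (m)
F = 1 / 298.257223563         # flattening
E2 = F * (2 - F)              # first eccentricity squared

def ecef_to_lla(x, y, z, iterations=10):
    """Convert ECEF coordinates (m) to geodetic lat/lon (deg) and altitude (m).

    A stand-in for the pyproj ECEF->LLA transform the tool actually uses.
    """
    lon = math.atan2(y, x)
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1 - E2))          # initial guess
    for _ in range(iterations):
        n = A / math.sqrt(1 - E2 * math.sin(lat) ** 2)   # prime vertical radius
        alt = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1 - E2 * n / (n + alt)))
    return math.degrees(lat), math.degrees(lon), alt

# A point 1 km above the equator on the x-axis
lat, lon, alt = ecef_to_lla(A + 1000.0, 0.0, 0.0)
```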
Run from the project root. This example writes the KMZ to /tmp.

```sh
python -m tools.get_kmz --provider capella \
    --id CAPELLA_C13_SP_SLC_HH_20251220124212_20251220124224 \
    --dtype SLC \
    --output-dir /tmp
```

You can view the full CLI options with:

```sh
python -m tools.get_kmz -h
```

Contributions are welcome!
This repository follows standard GitHub workflows with a protected main branch.
- Fork this repository to your own GitHub account.
- Create a feature branch from `main` in your fork (for example, `feature/my-improvement`).
- Commit your changes using clear, signed commits.
- Open a Pull Request (PR) against the `main` branch of this repository.
All pull requests:
- Must pass automated checks and code quality scans.
- Require at least one review approval (by a repository admin, me 😃).
- Cannot be force-pushed or merged directly into `main`.
Once reviewed and approved, your PR will be merged following a linear history (no merge commits).
This project is released under the MIT License.
By contributing, you agree that your contributions will be licensed under the same terms.