
Destine #196

Merged
koldunovn merged 6 commits into main from destine
Mar 3, 2026

@kuivi (Collaborator) commented Mar 2, 2026

Add DestinE Climate DT Data Retrieval Tool

Summary

Adds live data retrieval from the Destination Earth (DestinE) Climate Digital Twin to ClimSight's data analysis agent. This enables on-demand download of high-resolution climate projection time series (SSP3-7.0, IFS-NEMO model, 2020–2039) for any point location, with 82 available climate parameters discoverable via RAG semantic search.

What's new

DestinE retrieval tool (src/climsight/tools/destine_retrieval_tool.py)

  • Two-step workflow: search (RAG over 82 parameters via Chroma vector store) → retrieve (download via earthkit.data + polytope API)
  • Point time series extraction with hourly resolution (24 timesteps/day)
  • Automatic caching: repeated requests are served from the local Zarr store without a new download
  • Authentication via ~/.polytopeapirc token file (obtained by running desp-authentication.py)
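
The caching behaviour in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in destine_retrieval_tool.py: the function names, the key fields, and the file naming scheme are all assumptions, and the actual download is passed in as a callable.

```python
import hashlib
import json
from pathlib import Path


def cache_path(cache_dir, param_id, lat, lon, start_date, end_date):
    """Build a deterministic Zarr store path for one request (hypothetical naming)."""
    key = json.dumps(
        {
            "param": str(param_id),
            "lat": round(float(lat), 4),
            "lon": round(float(lon), 4),
            "start": start_date,
            "end": end_date,
        },
        sort_keys=True,
    )
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"destine_{param_id}_{digest}.zarr"


def retrieve_cached(cache_dir, request_args, download):
    """Return (path, cache_hit); call download(path) only when the store is missing."""
    path = cache_path(cache_dir, **request_args)
    if path.exists():
        return path, True
    download(path)  # assumed to write a Zarr store at `path`
    return path, False
```

Keying the store on parameter, location, and date range means a second identical request never reaches the polytope API.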

Agent integration (data_analysis_agent.py)

  • Both tools registered when use_destine_data: true in config
  • Agent prompt describes the two-step workflow and available date range
  • Downloads default to full 2020–2039 period for maximum coverage
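
For reviewers, the conditional registration amounts to something like the sketch below. The helper name `select_tools` and the way tools are represented are assumptions for illustration; only the `use_destine_data` key and the two tool names come from this PR.

```python
def select_tools(config, base_tools, destine_tools):
    """Return the agent tool list, adding DestinE tools only when enabled in config."""
    tools = list(base_tools)
    if config.get("use_destine_data", False):
        # Both tools are added together: search finds a parameter, retrieve downloads it.
        tools.extend(destine_tools)
    return tools


# Usage with tool names standing in for the real tool objects
destine = ["search_destine_parameters", "retrieve_destine_data"]
enabled = select_tools({"use_destine_data": True}, ["some_other_tool"], destine)
disabled = select_tools({}, ["some_other_tool"], destine)
```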

UI integration (streamlit_interface.py)

  • Toggle to enable/disable DestinE data
  • Token file status indicator (found/not found)
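
The status indicator boils down to a file existence check. A minimal sketch, assuming a hypothetical helper `destine_token_status` whose return value the Streamlit code would render:

```python
from pathlib import Path


def destine_token_status(token_path=None):
    """Report whether the polytope token file exists (defaults to ~/.polytopeapirc)."""
    path = Path(token_path) if token_path else Path.home() / ".polytopeapirc"
    found = path.is_file()
    return {
        "path": str(path),
        "found": found,
        "label": "found" if found else "not found",
    }
```

The check only confirms that the file is present; it does not validate the token or whether it has expired.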

Configuration (config.yml)

  • use_destine_data toggle
  • destine_settings section (Chroma DB path, collection name)

State management (climsight_classes.py, sandbox_utils.py)

  • destine_data_dir and destine_tool_response fields in AgentState
  • Sandbox path for destine_data/ directory

Testing

  • test/test_destine_tool.py — dedicated test suite (RAG search, data retrieval, end-to-end workflow, caching, error handling)
  • Tests marked with destine marker, skipped by default in normal pytest runs
  • Run with: pytest -m destine -v
  • test/conftest.py — auto-skip logic for destine-marked tests
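
One common way to implement this kind of auto-skip in conftest.py is a `pytest_collection_modifyitems` hook, sketched below. This is an assumption about the approach, not a copy of the file in this PR; the skip reason text is illustrative.

```python
import pytest


def pytest_collection_modifyitems(config, items):
    """Skip destine-marked tests unless the -m expression mentions 'destine'."""
    markexpr = config.getoption("markexpr", default="") or ""
    if "destine" in markexpr:
        # The user selected (or explicitly deselected) destine tests with -m,
        # so pytest's own marker filtering decides what runs.
        return
    skip_destine = pytest.mark.skip(
        reason="DestinE tests need network access and a token; run with: pytest -m destine"
    )
    for item in items:
        if "destine" in item.keywords:
            item.add_marker(skip_destine)
```

The `destine` marker would also need to be registered (for example under `markers` in the pytest configuration) to avoid unknown-marker warnings.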

Utility scripts

  • src/climsight/scripts/download_destine_simple.py — single-request download example
  • src/climsight/scripts/download_destine_example.py — parallel yearly download with timing
  • src/climsight/scripts/era5_fetch.py — ERA5 data fetch utility
  • test/plot_destine_data.py — quick Zarr data inspection and plotting
  • test/era5_tool_manual.py — ERA5 tool manual test
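
The parallel yearly download idea can be sketched as below. This is not the content of download_destine_example.py: `yearly_ranges`, `download_all`, and the `fetch_year` callable are hypothetical names, and the worker count is an arbitrary default.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def yearly_ranges(start_year=2020, end_year=2039):
    """Split the period into (start, end) date strings, one pair per calendar year."""
    return [(f"{year}0101", f"{year}1231") for year in range(start_year, end_year + 1)]


def download_all(fetch_year, ranges, max_workers=4):
    """Run fetch_year(start, end) for every range in parallel and time the whole batch."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with `ranges`.
        results = list(pool.map(lambda r: fetch_year(*r), ranges))
    return results, time.perf_counter() - started
```

Threads suit this workload because each request spends most of its time waiting on the remote service rather than on local computation.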

Documentation

  • README.md — DestinE authentication instructions with links to destination-earth.eu and polytope-examples
  • CLIMATE_DATA_ARCHITECTURE.md — data architecture overview

Requirements

  • earthkit-data package (for polytope API access)
  • langchain-chroma, chromadb, langchain-openai (for RAG parameter search)
  • ~/.polytopeapirc token file (run desp-authentication.py to obtain)

- New tool: destine_retrieval_tool.py with two-step workflow:
  1. search_destine_parameters: RAG semantic search over 82 DestinE parameters via Chroma vector store
  2. retrieve_destine_data: download point time series via earthkit.data + polytope
- Authentication via ~/.polytopeapirc token (from desp-authentication.py)
- UI toggle for DestinE data with token file status check
- DestinE test suite (pytest -m destine), skipped by default
- Updated README with DestinE authentication instructions

Move os.chdir(REPO_ROOT) from module level to an autouse fixture that restores the original cwd after each test, preventing side effects on other test files that use relative paths.
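
The save-and-restore pattern behind such a fixture looks roughly like this. It is shown as a plain context manager so that it runs without pytest; the name `working_directory` is an assumption. In a test module the same body would sit inside a function decorated with `@pytest.fixture(autouse=True)`.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the working directory, restoring the original on exit."""
    original = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the test body raises, so later tests see the original cwd.
        os.chdir(original)
```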
… add utility scripts

- Fix lat/lon swap in polytope request (was [lon, lat], now [lat, lon])
- Remove "keep date ranges SHORT" limits — default to full 2020-2039 period
- Simplify intro_agent prompt
- Add standalone DestinE download scripts (simple + parallel yearly)
- Add ERA5 fetch script and test utilities
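
To make the first two fixes concrete, here is a small sketch of a request builder with the corrected coordinate order and the full-period default. The helper name, the range checks, and the exact request keys are assumptions for illustration and should be checked against the polytope-examples repository, not read as the tool's actual request.

```python
def build_point_request(param_id, lat, lon, start_date="20200101", end_date="20391231"):
    """Build a point time series request fragment with points given as [lat, lon]."""
    if not -90.0 <= lat <= 90.0:
        # An out-of-range latitude is a typical symptom of swapped arguments.
        raise ValueError(f"latitude {lat} out of range; were lat and lon swapped?")
    if not -180.0 <= lon <= 360.0:
        raise ValueError(f"longitude {lon} out of range")
    return {
        "param": str(param_id),
        "feature": {
            "type": "timeseries",
            "points": [[lat, lon]],  # latitude first, then longitude
            "range": {"start": start_date, "end": end_date},  # assumed key names
        },
    }
```

The latitude check only catches a swap when the longitude lies outside ±90, so it complements correct argument order and does not replace it.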
@kuivi requested review from dmpantiu and koldunovn on Mar 2, 2026, 12:05
…ring

- Guide data_analysis_agent to download ERA5/DestinE variables in parallel (all in one response)
- Relax intro_agent exclusion rules to allow analysis instructions (download data, plot time series, compute statistics)
@koldunovn merged commit ef6a3a3 into main on Mar 3, 2026
4 checks passed
@kuivi deleted the destine branch on Mar 3, 2026, 19:18