Commit 4c75386

fix readme

Author: Grzegorz Pustulka
1 parent 6ba27fd, commit 4c75386

2 files changed: +34 -317 lines changed
Dockerfile

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+FROM python:3.12-slim
+
+WORKDIR /app
+
+ENV APP_HOST=0.0.0.0 \
+APP_PORT=8080 \
+ENVIRONMENT=production \
+BACKEND=opensearch \
+STAC_FASTAPI_TITLE=stac-fastapi-opensearch \
+STAC_FASTAPI_DESCRIPTION="A STAC FastAPI with an OpenSearch backend" \
+STAC_FASTAPI_VERSION=2.1 \
+RELOAD=false \
+WEB_CONCURRENCY=1 \
+STAC_FASTAPI_RATE_LIMIT=null \
+ES_HTTP_COMPRESS=true \
+ENABLE_DIRECT_RESPONSE=true \
+ES_HOST=pgstac03.intra.cloudferro.com \
+ES_PORT=9200 \
+ES_USER=pgstac \
+ES_PASS=19rs98j2jhrd34h9hr99j \
+ES_USE_SSL=false \
+ES_VERIFY_CERTS=false \
+OPENSEARCH_VERSION=2.19.1
+
+COPY . .
+
+RUN pip install -e ./stac_fastapi/opensearch
+RUN pip install -e ./stac_fastapi/core
+RUN pip install -e ./stac_fastapi/sfeos_helpers
+
+EXPOSE 8080
+
+CMD ["python", "-m", "stac_fastapi.opensearch.app"]
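
Review note: the `ENV` block in this Dockerfile fixes the connection settings, including `ES_PASS`, at image build time. As a rough illustration of the alternative, the sketch below reads the same variable names from the environment at startup and refuses to start without a password, so the credential could be supplied at run time (for example with `docker run -e ES_PASS=...`). `EsSettings` and `load_es_settings` are hypothetical names invented for this note, not part of stac-fastapi, and the fallback defaults are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings such as "true" or "1"."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class EsSettings:
    """Hypothetical container for the ES_* variables named in the Dockerfile."""

    host: str
    port: int
    user: str
    password: str
    use_ssl: bool
    verify_certs: bool


def load_es_settings(env: Mapping[str, str] = os.environ) -> EsSettings:
    """Build settings from environment variables; defaults are illustrative."""
    password = env.get("ES_PASS")
    if not password:
        # Fail fast instead of falling back to a value baked into the image.
        raise RuntimeError("ES_PASS must be provided at run time")
    return EsSettings(
        host=env.get("ES_HOST", "localhost"),
        port=int(env.get("ES_PORT", "9200")),
        user=env.get("ES_USER", ""),
        password=password,
        use_ssl=_as_bool(env.get("ES_USE_SSL", "false")),
        verify_certs=_as_bool(env.get("ES_VERIFY_CERTS", "false")),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests without touching the real process environment.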

README.md

Lines changed: 1 addition & 317 deletions
@@ -841,320 +841,4 @@ The system uses a precise naming convention:
 - Helps prevent API abuse and maintains system stability
 - Ensures fair resource allocation among all clients
 
-- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
-
-sed index selection using collection IDs. Requires indexes in format: STAC_ITEMS_INDEX_PREFIX_collection-id_start_year-start_month-start_day-end_year-end_month-end_day, e.g. items_sentinel-2-l2a_2025-06-06-2025-09-22. | `false` | Optional |
-| `DATETIME_INDEX_MAX_SIZE_GB` | Maximum size limit in GB for datetime-based indexes. When an index exceeds this size, a new time-partitioned index will be created. Note: This value should account for ~25% overhead due to OS/ES caching of data structures and metadata. Only applies when`ENABLE_DATETIME_INDEX_FILTERING` is enabled. | `25` | Optional |
-
-> [!NOTE]
-> The variables `ES_HOST`, `ES_PORT`, `ES_USE_SSL`, `ES_VERIFY_CERTS` and `ES_TIMEOUT` apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to `OS_` even if you're using OpenSearch.
-
-## Interacting with the API
-
-- **Creating a Collection**:
-```shell
-curl -X "POST" "http://localhost:8080/collections" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"id": "my_collection"
-}'
-```
-
-- **Adding an Item to a Collection**:
-```shell
-curl -X "POST" "http://localhost:8080/collections/my_collection/items" \
--H 'Content-Type: application/json; charset=utf-8' \
--d @item.json
-```
-
-- **Searching for Items**:
-```shell
-curl -X "GET" "http://localhost:8080/search" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"collections": ["my_collection"],
-"limit": 10
-}'
-```
-
-- **Filtering by Bbox**:
-```shell
-curl -X "GET" "http://localhost:8080/search" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"collections": ["my_collection"],
-"bbox": [-180, -90, 180, 90]
-}'
-```
-
-- **Filtering by Datetime**:
-```shell
-curl -X "GET" "http://localhost:8080/search" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"collections": ["my_collection"],
-"datetime": "2020-01-01T00:00:00Z/2020-12-31T23:59:59Z"
-}'
-```
-
-## Configure the API
-
-- **API Title and Description**: By default set to `stac-fastapi-<backend>`. Customize these by setting:
-- `STAC_FASTAPI_TITLE`: Changes the API title in the documentation
-- `STAC_FASTAPI_DESCRIPTION`: Changes the API description in the documentation
-
-- **Database Indices**: By default, the API reads from and writes to:
-- `collections` index for collections
-- `items_<collection name>` indices for items
-- Customize with `STAC_COLLECTIONS_INDEX` and `STAC_ITEMS_INDEX_PREFIX` environment variables
-
-- **Root Path Configuration**: The application root path is the base URL by default.
-- For AWS Lambda with Gateway API: Set `STAC_FASTAPI_ROOT_PATH` to match the Gateway API stage name (e.g., `/v1`)
-
-
-## Collection Pagination
-
-- **Overview**: The collections route supports pagination through optional query parameters.
-- **Parameters**:
-- `limit`: Controls the number of collections returned per page
-- `token`: Used to retrieve subsequent pages of results
-- **Response Structure**: The `links` field in the response contains a `next` link with the token for the next page of results.
-- **Example Usage**:
-```shell
-curl -X "GET" "http://localhost:8080/collections?limit=1&token=example_token"
-```
-
-## Ingesting Sample Data CLI Tool
-
-- **Overview**: The `data_loader.py` script provides a convenient way to load STAC items into the database.
-
-- **Usage**:
-```shell
-python3 data_loader.py --base-url http://localhost:8080
-```
-
-- **Options**:
-```
---base-url TEXT Base URL of the STAC API [required]
---collection-id TEXT ID of the collection to which items are added
---use-bulk Use bulk insert method for items
---data-dir PATH Directory containing collection.json and feature
-collection file
---help Show this message and exit.
-```
-
-- **Example Workflows**:
-- **Loading Sample Data**:
-```shell
-python3 data_loader.py --base-url http://localhost:8080
-```
-- **Loading Data to a Specific Collection**:
-```shell
-python3 data_loader.py --base-url http://localhost:8080 --collection-id my-collection
-```
-- **Using Bulk Insert for Performance**:
-```shell
-python3 data_loader.py --base-url http://localhost:8080 --use-bulk
-```
-
-## Elasticsearch Mappings
-
-- **Overview**: Mappings apply to search index, not source data. They define how documents and their fields are stored and indexed.
-- **Implementation**:
-- Mappings are stored in index templates that are created on application startup
-- These templates are automatically applied when creating new Collection and Item indices
-- The `sfeos_helpers` package contains shared mapping definitions used by both Elasticsearch and OpenSearch backends
-- **Customization**: Custom mappings can be defined by extending the base mapping templates.
-
-## Managing Elasticsearch Indices
-
-### Snapshots
-
-- **Overview**: Snapshots provide a way to backup and restore your indices.
-
-- **Creating a Snapshot Repository**:
-```shell
-curl -X "PUT" "http://localhost:9200/_snapshot/my_fs_backup" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"type": "fs",
-"settings": {
-"location": "/usr/share/elasticsearch/snapshots/my_fs_backup"
-}
-}'
-```
-- This creates a snapshot repository that stores files in the elasticsearch/snapshots directory in this git repo clone
-- The elasticsearch.yml and compose files create a mapping from that directory to /usr/share/elasticsearch/snapshots within the Elasticsearch container and grant permissions for using it
-
-- **Creating a Snapshot**:
-```shell
-curl -X "PUT" "http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2?wait_for_completion=true" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"metadata": {
-"taken_because": "dump of all items",
-"taken_by": "pvarner"
-},
-"include_global_state": false,
-"ignore_unavailable": false,
-"indices": "items_my-collection"
-}'
-```
-- This creates a snapshot named my_snapshot_2 and waits for the action to be completed before returning
-- This can also be done asynchronously by omitting the wait_for_completion parameter, and queried for status later
-- The indices parameter determines which indices are snapshotted, and can include wildcards
-
-- **Viewing Snapshots**:
-```shell
-# View a specific snapshot
-curl http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2
-
-# View all snapshots
-curl http://localhost:9200/_snapshot/my_fs_backup/_all
-```
-- These commands allow you to check the status and details of your snapshots
-
-- **Restoring a Snapshot**:
-```shell
-curl -X "POST" "http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2/_restore?wait_for_completion=true" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"include_aliases": false,
-"include_global_state": false,
-"ignore_unavailable": true,
-"rename_replacement": "items_$1-copy",
-"indices": "items_*",
-"rename_pattern": "items_(.+)"
-}'
-```
-- This specific command will restore any indices that match items_* and rename them so that the new index name will be suffixed with -copy
-- The rename_pattern and rename_replacement parameters allow you to restore indices under new names
-
-- **Updating Collection References**:
-```shell
-curl -X "POST" "http://localhost:9200/items_my-collection-copy/_update_by_query" \
--H 'Content-Type: application/json; charset=utf-8' \
--d $'{
-"query": {
-"match_all": {}
-},
-"script": {
-"lang": "painless",
-"params": {
-"collection": "my-collection-copy"
-},
-"source": "ctx._source.collection = params.collection"
-}
-}'
-```
-- After restoring, the item documents have been restored in the new index (e.g., my-collection-copy), but the value of the collection field in those documents is still the original value of my-collection
-- This command updates these values to match the new collection name using Elasticsearch's Update By Query feature
-
-- **Creating a New Collection**:
-```shell
-curl -X "POST" "http://localhost:8080/collections" \
--H 'Content-Type: application/json' \
--d $'{
-"id": "my-collection-copy"
-}'
-```
-- The final step is to create a new collection through the API with the new name for each of the restored indices
-- This gives you a copy of the collection that has a resource URI (/collections/my-collection-copy) and can be correctly queried by collection name
-
-### Reindexing
-
-- **Overview**: Reindexing allows you to copy documents from one index to another, optionally transforming them in the process.
-
-- **Use Cases**:
-- Apply changes to documents
-- Correct dynamically generated mappings
-- Transform data (e.g., lowercase identifiers)
-- The index templates will make sure that manually created indices will also have the correct mappings and settings
-
-- **Example: Reindexing with Transformation**:
-```shell
-curl -X "POST" "http://localhost:9200/_reindex" \
--H 'Content-Type: application/json' \
--d $'{
-"source": {
-"index": "items_my-collection-lower_my-collection-hex-000001"
-},
-"dest": {
-"index": "items_my-collection-lower_my-collection-hex-000002"
-},
-"script": {
-"source": "ctx._source.id = ctx._source.id.toLowerCase()",
-"lang": "painless"
-}
-}'
-```
-- In this example, we make a copy of an existing Item index but change the Item identifier to be lowercase
-- The script parameter allows you to transform documents during the reindexing process
-
-- **Updating Aliases**:
-```shell
-curl -X "POST" "http://localhost:9200/_aliases" \
--H 'Content-Type: application/json' \
--d $'{
-"actions": [
-{
-"remove": {
-"index": "*",
-"alias": "items_my-collection"
-}
-},
-{
-"add": {
-"index": "items_my-collection-lower_my-collection-hex-000002",
-"alias": "items_my-collection"
-}
-}
-]
-}'
-```
-- If you are happy with the data in the newly created index, you can move the alias items_my-collection to the new index
-- This makes the modified Items with lowercase identifiers visible to users accessing my-collection in the STAC API
-- Using aliases allows you to switch between different index versions without changing the API endpoint
-
-## Auth
-
-- **Overview**: Authentication is an optional feature that can be enabled through Route Dependencies.
-- **Implementation Options**:
-- Basic authentication
-- OAuth2 with Keycloak
-- Custom route dependencies
-- **Configuration**: Authentication can be configured using the `STAC_FASTAPI_ROUTE_DEPENDENCIES` environment variable.
-- **Examples and Documentation**: Detailed examples and implementation guides can be found in the [examples/auth](examples/auth) directory.
-
-## Aggregation
-
-- **Supported Aggregations**:
-- Spatial aggregations of points and geometries
-- Frequency distribution aggregation of any property including dates
-- Temporal distribution of datetime values
-
-- **Endpoint Locations**:
-- Root Catalog level: `/aggregations`
-- Collection level: `/<collection_id>/aggregations`
-
-- **Implementation Details**: The `sfeos_helpers.aggregation` package provides specialized functionality for both Elasticsearch and OpenSearch backends.
-
-- **Documentation**: Detailed information about supported aggregations can be found in [the aggregation docs](./docs/src/aggregation.md).
-
-
-## Rate Limiting
-
-- **Overview**: Rate limiting is an optional security feature that controls API request frequency on a remote address basis.
-
-- **Configuration**: Enabled by setting the `STAC_FASTAPI_RATE_LIMIT` environment variable:
-```
-STAC_FASTAPI_RATE_LIMIT=500/minute
-```
-
-- **Functionality**:
-- Limits each client to a specified number of requests per time period (e.g., 500 requests per minute)
-- Helps prevent API abuse and maintains system stability
-- Ensures fair resource allocation among all clients
-
-- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
-
+- **Examples**: Implementation examples are available in the [examples/rate_limit](examples/rate_limit) directory.
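
Review note: one of the removed README lines is a truncated table row describing the datetime-based index name format, with the example `items_sentinel-2-l2a_2025-06-06-2025-09-22`. As a rough illustration of that convention, the sketch below builds and parses such names. It assumes the prefix is `items_` (the documented default for `STAC_ITEMS_INDEX_PREFIX`) and that dates are written as `YYYY-MM-DD`; both function names are hypothetical and are not taken from the project's code.

```python
from datetime import date


def datetime_index_name(
    collection_id: str, start: date, end: date, prefix: str = "items_"
) -> str:
    """Compose <prefix><collection-id>_<start>-<end> with ISO dates."""
    return f"{prefix}{collection_id}_{start.isoformat()}-{end.isoformat()}"


def parse_datetime_index_name(
    name: str, prefix: str = "items_"
) -> tuple[str, date, date]:
    """Split an index name back into collection id, start date and end date."""
    if not name.startswith(prefix):
        raise ValueError(f"index name does not start with {prefix!r}: {name}")
    body = name[len(prefix):]
    # The date span contains no underscore, so the last "_" separates it
    # from the collection id, which may itself contain underscores.
    collection_id, sep, span = body.rpartition("_")
    # A valid span is exactly "YYYY-MM-DD-YYYY-MM-DD" (21 characters).
    if not sep or len(span) != 21 or span[10] != "-":
        raise ValueError(f"unexpected date span in index name: {name}")
    start = date.fromisoformat(span[:10])
    end = date.fromisoformat(span[11:])
    return collection_id, start, end
```

Round-tripping the README's own example through both functions is a quick way to check that the assumed format matches the documented one.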
