From 12decd04ce8022843e0c2b7391e499893b059d74 Mon Sep 17 00:00:00 2001
From: runllm
Date: Fri, 18 Apr 2025 13:14:12 +0000
Subject: [PATCH 1/2] Update openapi-usage-guide.md

---
 docs/api/openapi/openapi-usage-guide.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/docs/api/openapi/openapi-usage-guide.md b/docs/api/openapi/openapi-usage-guide.md
index 59a331b724cfa..2cc9380071c25 100644
--- a/docs/api/openapi/openapi-usage-guide.md
+++ b/docs/api/openapi/openapi-usage-guide.md
@@ -26,6 +26,26 @@ in the OpenAPI UI are all autogenerated at build time from the PDL models to JSO
 While the full OpenAPI spec is always available at [**GMS_SERVER_HOST:GMS_PORT/openapi/swagger-ui/index.html**](http://localhost:8080/openapi/swagger-ui/index.html), here's a quick overview of the main OpenAPI endpoints and their purpose.
 
+### Python Program Structure for OpenAPI Integration
+
+This section provides a detailed guide on structuring a Python program for creating a `schemaField` for a BigQuery column using DataHub's OpenAPI. This guide is intended for developers looking to programmatically interact with DataHub using Python.
+
+#### Prerequisites
+- Install the `requests` library using `pip install requests`.
+- Set up any necessary authentication for interacting with DataHub's API.
+
+#### Schema Field Definition
+- Define a schema field in Python, specifying necessary attributes and types.
+
+#### API Request Construction
+- Construct the API request by setting up headers, defining the payload, and handling the response.
+
+#### Error Handling
+- Implement best practices for error handling when making API requests.
+
+#### Authentication
+- Follow detailed steps to authenticate requests, especially if DataHub requires specific authentication methods.
+
 ### Entities (/entities)

From 3a5b1c0fc629b45b96e7d4589a7dc1cd96732f2d Mon Sep 17 00:00:00 2001
From: runllm
Date: Fri, 18 Apr 2025 13:14:13 +0000
Subject: [PATCH 2/2] Update datahub-apis.md

---
 docs/api/datahub-apis.md | 71 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/docs/api/datahub-apis.md b/docs/api/datahub-apis.md
index c46aacde3a0cb..88ca4daa7418f 100644
--- a/docs/api/datahub-apis.md
+++ b/docs/api/datahub-apis.md
@@ -38,6 +38,77 @@ Learn more about the GraphQL API:
 
 - **[GraphQL API →](docs/api/graphql/getting-started.md)**
 
+## Python Program Structure for OpenAPI Integration
+
+This section provides a detailed guide on structuring a Python program for creating a `schemaField` for a BigQuery column using DataHub's OpenAPI. This guide is intended for developers looking to programmatically interact with DataHub using Python.
+
+### Prerequisites
+- Install the `requests` library: `pip install requests`
+- Set up any necessary authentication for interacting with DataHub's API.
+
+### Schema Field Definition
+Define a schema field in Python, including necessary attributes and types. Example:
+```python
+schema_field = {
+    "fieldPath": "column_name",
+    "type": {
+        "type": {
+            "__type": "StringType"
+        }
+    },
+    "nativeDataType": "STRING"
+}
+```
+
+### API Request Construction
+Construct the API request by setting up headers, defining the payload, and handling the response. Example:
+```python
+import requests
+import json
+
+# Define the DataHub server URL
+datahub_url = "http://localhost:8080"
+
+# Define the dataset URN
+dataset_urn = "urn:li:dataset:(urn:li:dataPlatform:bigquery,project.dataset.table,PROD)"
+
+# Define the SchemaMetadata aspect
+schema_metadata = {
+    "__type": "SchemaMetadata",
+    "schemaName": "SampleSchema",
+    "platform": "urn:li:dataPlatform:bigquery",
+    "fields": [schema_field]
+}
+
+# Define the request payload
+payload = {
+    "aspect": schema_metadata,
+    "entityType": "dataset",
+    "entityUrn": dataset_urn
+}
+
+# Set up headers, including authentication if needed
+headers = {
+    "Content-Type": "application/json",
+    # "Authorization": "Bearer YOUR_ACCESS_TOKEN"  # Uncomment if authentication is required
+}
+
+# Make the POST request to the DataHub OpenAPI endpoint
+response = requests.post(f"{datahub_url}/entities", headers=headers, data=json.dumps(payload))
+
+# Check the response
+if response.status_code == 200:
+    print("Schema field created successfully!")
+else:
+    print(f"Failed to create schema field: {response.status_code} - {response.text}")
+```
+
+### Error Handling
+Implement best practices for error handling when making API requests.
+
+### Authentication
+Detailed steps on how to authenticate requests, especially if DataHub requires specific authentication methods.
+
 ## DataHub API Comparison
 
 DataHub supports several APIs, each with its own unique usage and format.
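
The `schemaField` example added in the second patch hard-codes a single `STRING` column. As a minimal sketch of how the same structure might be built for other BigQuery columns, the block below uses a hypothetical helper, `make_schema_field`, and a hypothetical mapping, `BIGQUERY_TO_DATAHUB_TYPE`. Neither comes from the patch or from DataHub; the mapping is illustrative and incomplete, and the `__type` names should be checked against the DataHub model before use.

```python
# Hypothetical mapping from BigQuery native types to the "__type" names used
# in the schemaField JSON from the patch. Illustrative and incomplete.
BIGQUERY_TO_DATAHUB_TYPE = {
    "STRING": "StringType",
    "INT64": "NumberType",
    "FLOAT64": "NumberType",
    "BOOL": "BooleanType",
    "BYTES": "BytesType",
    "DATE": "DateType",
}


def make_schema_field(column_name, native_type):
    """Build a schemaField dict with the same shape as the patch's example."""
    normalized = native_type.upper()
    if normalized not in BIGQUERY_TO_DATAHUB_TYPE:
        # Fail loudly rather than guess a type for an unmapped column.
        raise ValueError(f"No mapping for BigQuery type: {native_type!r}")
    return {
        "fieldPath": column_name,
        "type": {"type": {"__type": BIGQUERY_TO_DATAHUB_TYPE[normalized]}},
        "nativeDataType": normalized,
    }


# Build fields for a few example columns (column names are made up).
fields = [
    make_schema_field("user_id", "INT64"),
    make_schema_field("email", "STRING"),
]
```

The resulting `fields` list could stand in for `[schema_field]` in the `SchemaMetadata` aspect shown in the patch.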