diff --git a/README.md b/README.md index d1353ad..61ca0bc 100644 --- a/README.md +++ b/README.md @@ -1,102 +1,367 @@ # The Supergraph Manifesto -A supergraph is an architecture pattern and a federated operating model to help teams create a self-serve platform for data access, API integration/composition or GraphQL APIs. +## Introduction -![Before / After Supergraph](https://github.com/hasura/supergraph-io/assets/131160/2421b94e-724f-4e94-afee-61b2c81f38b7) +A Supergraph is a method for designing data APIs. -When a supergraph is built with a GraphQL federation stack, the engine is often called a gateway or a router and the subgraph connectors are often GraphQL services. +A Supergraph provides guidelines and best practices for developing APIs and by using this approach, different teams +in an organization can work together to create a unified, modular system of data and logic. This results in a single, +powerful endpoint that API consumers can easily use. -## Supergraph platform strategy -A supergraph approach aims to build a flywheel of growth to keep improving self-service access to data and APIs. +## Table of Definitions +| Term | Definition | +|--------------------|--------------------------------------------------------------------------------------------------------------------------------| +| **Supergraph** | A Supergraph is an API creation methodology that offers reference specifications, design principles and an operating model. | +| **Supergraph API** | A modular, interconnected resource of data and logic as a single, powerful, self-serve API endpoint. | +| **Engine** | Often referred to as a gateway or router to manage API requests and responses. | +| **Data Domain** | A distinct area of functionality or data. | +| **Subgraph** | A modular component that acts as a self-contained entity within a larger supergraph and often represents a single data domain. | +| **API Consumer** | An entity that uses APIs to access data and functionality. | +| **API Producer** | An entity that creates and maintains APIs. | -Supergraph platform flywheel +## Before the Supergraph +In conventional API design, data consumers need to manually access, integrate, and compose data from multiple +endpoints to get the data they need. This is time-consuming, error-prone, and requires deep domain knowledge, often +resulting in brittle integrations. -### I. CONNECT domains -Domain owners (aka data owners or API producers) should be able to seamlessly connect their domains to the platform. A major challenge in building supergraph is the resistance to change by the domain owners. They often oppose having to build, operate and maintain another API layer, such as a GraphQL server that creates another wrapper on their domain. This reluctance and concern is understandable and completely valid and must be systematically addressed by the supergraph platform strategy and the supergraph reference architecture. +Infrastructure must also be built and maintained to support each of these data-access processes, and individual +routes or resolvers built and maintained for each data access requirement. -This has two main implications for the subgraph connector's lifecycle and runtime: -1. **Subgraph connector CI/CD**: As domain owners change their domains, the API contract published via the supergraph engine, must stay in sync with the least amount of overhead for the domain owner. The SDLC, change-management or CI/CD process of the domain owners must involve updating their API contract (eg: versioning), prevent breaking changes and keeping documentation up to date. -2. **Subgraph connector performance**: The subgraph connector must not _reduce_ performance as compared to what is provided by accessing the underlying domain directly. API performance characteristics as measured by latency, payload size & concurrency. +![supergraphioERD.png](assets/supergraphioERD.png) -Guaranteeing a smooth CI/CD process and high-performance connectivity gives domain owners confidence that they can connect their domains to the supergraph platform and iterate on changes to their domains fearlessly. +## Benefits +A Supergraph provides the following key benefits: -**This unlocks self-serve connectivity for domain owners.** +1. **Self-Serve API Consumer Composition**: Enables API integration, orchestration, and aggregation in a self-serve + manner. +2. **Federated Data Access Layer**: Provides a federated data layer that allows real-time access to data sources with + cross-domain composability (joins, filtering, etc.). +3. **Incremental Adoption**: Offers a stable API that supports zero-downtime and incremental adoption. -### II. CONSUME APIs +## Supergraph Lifecycle -API consumers should be able to discover and consume APIs in a way that doesn't require manual API integration, aggregation or composition effort as far as possible. -API consumers have several common needs when they're dealing with fixed API endpoints or specific data queries: -1. fetch different projections of data to prevent over-fetching -2. join data from multiple places to prevent under-fetching -3. filter, paginate, sort and aggregate data from multiple places +![ConnectConsumeDiscoverERD.png](assets/ConnectConsumeDiscoverERD.png) -To provide an API experience that makes the consumption experience truly self-serve, there are two key requirements: -1. **Composable API design**: The API presented by the supergraph engine must allow for on-demand composability. GraphQL is a great API to express composability semantics, but regardless of the API format used, a standardized, composable API design is a critical requirement. -2. **API portal**: High-quality search, discovery and documentation of both the API and the underlying API models is critical to enable self-serve consumption. The more information that can be made available to API consumers the better. Eg: Data lineage, Authorization policies etc as appropriate. +### Connect Data Domains +As an API producer, you should easily be able to connect your data domains to the platform. The Supergraph strategy +and architecture addresses common challenges and reluctance from domain owners by ensuring: -**This unlocks self-serve consumption for API consumers** - -### III. DISCOVER demand - -Understanding how API consumers use their domain and identify their unmet needs is crucial for API producers. This insight allows API producers to enhance their domain. It also helps discover new domain owners to connect their domain into the supergraph. - -This necessitates 2 key capabilities of the supergraph platform to create a consumer-first, agile culture: -1. API consumption, API schema & portal analytics: A supergraph is analogous to a marketplace and needs to provide the marketplace owners and producers with insights to help improve the marketplace for the consumers. -2. Ecosystem integrations: The supergraph platform should be able to integrate with existing communication and catalog tools, in particular to help understand _unmet_ demand of API consumers. - -**This closes the loop and allows the supergraph platform to create a virtuous cycle of success for producers and consumers**. - -## Supergraph reference architecture - -### CI/CD and build system (control plane) -The control plane of the supergraph is critical to help domain owners [connect their domains](#i.-connect-domains) to the supergraph. - -There are 3 components in the control plane of the supergraph -1. The domain itself -2. The subgraph -3. The supergraph - -Supergraph control plane components - -The control plane should define the following SDLC to help keep the supergraph in sync with the domain as the underlying domain changes. - -Supergraph CI/CD - -### Distributed data plane -The supergraph data plane is critical to enable high performance access to upstream domains so that API producers can maintain their domain without hidden future maintenance costs: - -Supergraph data plane - -### Supergraph API schema - -**Standardization** - -A supergraph API schema should create standardized conventions on the following: -- Queryable models vs Commands - - Models are collections of data that can be queried in standardized ways - - Commands are methods that map to particular pieces of business logic that might return references to other commands or models -- Standardized conventions on queryable model: While each model might only expose some ways of querying it, the syntax and conventions for standard query operations should be standardized - - Joins - - Filtering - - Pagination - - Sorting - - Aggregations - -**Composability** +1. **Subgraph CI/CD**: Keeps the API contract in sync with minimal overhead for the domain owner. +2. **Subgraph Performance**: Maintains or improves performance compared to direct access to the underlying data domain. + +### Consume APIs +API consumers should be able to discover and consume APIs without manual integration, aggregation, or composition +efforts as far as possible. Key requirements include: + +1. **Composable API Design**: Allows on-demand composability, often utilizing GraphQL for expressive API design. +2. **API Portal**: Provides high-quality search, discovery, and documentation to facilitate self-serve consumption. + +### Discover Demand +Understanding how API consumers use domains and identifying their unmet needs is crucial for API producers. The +Supergraph platform supports: + +1. **API Consumption Analytics**: Provides insights to improve the API marketplace for consumers. +2. **Ecosystem Integrations**: Integrates with existing tools to understand unmet demand and enhance the ecosystem. + +### A Supergraph creates a virtuous cycle of success for API producers and consumers. + +## Supergraph Architecture + +### Control Plane (CI/CD and Build System) + +The control plane ensures seamless connection of data domains to the Supergraph. + +There are three components in the control plane of the Supergraph: + +#### 1. The data domain itself +A database, API or lambda function that provides the data. + +#### 2. The subgraph +API models, documentation, relationships, authorization policies. + +#### 3. The Supergraph +Centralized auth, governance, API conventions. + +### Control Plane Lifecycle + +![control-plane-lifecycle.png](assets/control-plane-lifecycle.png) + +### Distributed Data Plane +The data plane ensures high-performance access to upstream domains, maintaining domain performance without hidden +future maintenance costs. + +![data-plane-flow.png](assets/data-plane-flow.png) + +## API Schema Design Guide + + + +### Standardization + +In relation to **data resources (models)** and **data logic (commands)**, a supergraph API schema should create standardized +conventions on the following: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Standardization AttributeCapability
S1 + Separating models and commands +
+ Example + +
    +
  • Models are collections of data that can be queried in standardized source-agnostic ways
  • +
  • Commands are methods that map to particular pieces of business logic that might return references to other commands or models
  • +
+ +```graphql + # A standardized way to fetch a list of authors + query GetAuthors { + authors { + id + name + } + } + + # A specific method to search for authors + query findAuthors { + search_authors(args: {search: "Einstein"}) { + id + name + } + } +``` +
+
S2 + Model filtering +
+
+ Example + Get a list of articles published this year + +```graphql + query articlesThisYear { + articles(where: {publishDate: {_gt: "2024-01-01"}}) { + id + name + } + } +``` +
+
S3 + Model sorting +
+ Example + Get a list of articles sorted in reverse by the date of publishing + + ```graphql + query sortedArticles { + article(order_by: {publishDate: desc}) { + id + title + author_id + } + } + ``` +
+
S4 + Model pagination +
+ Example + Paginate the above list with 20 objects per page and fetch the 3rd page + + ```graphql + query sortedArticlesThirdPage { + article(order_by: {publishDate: desc}, offset: 40, limit: 20) { + id + title + author_id + } + } + ``` +
+
S5 + Model aggregations over fields +
+ Example + Get a count of authors and their average age + + ```graphql + query authorStatistics { + author_aggregate { + aggregate { + count # basic aggregation support by any model + avg { # supported over any numeric fields of a type + age + } + + } + } + } + ``` +
+
+ + +> [!NOTE] +> ### Prior art +> - [Google Cloud API design guide](https://cloud.google.com/apis/design/resources) +> - Resource: A resource-oriented API is generally modeled as a resource hierarchy, where each node is either a simple resource or a collection resource +> - Method: Resources are manipulated via a small set of methods + + +### Composability The supergraph API is typically a GraphQL / JSON API. There are varying degrees of composability an API can offer, as listed out in the following table: -| Composability Attribute | Capability | Description | -| :-- | :-- | :-- | -| C1 | Joining data | Join related data together in a "foreign key" like join | -| C2 | Nested filtering | Filter a parent by a property of its child (aka a property of a related entity) | -| C3 | Nested sorting | Sort a parent by a property of its child (aka a property of a related entity) | -| C4 | Nested pagination | Fetch a paginated list of parents, along with a paginated & sorted list of children for each parent | -| C5 | Nested aggregation | Aggregate a child in the context of its parent | + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Composability AttributeCapability Description
C1 Joining dataJoin related data together in a "foreign key" like join +
+ Example +Get a list of authors and their articles + +```graphql +query authorWithArticles { + author { + id + name + articles { + id + title + } + } +} +``` +
+
C2 Nested filteringFilter a parent by a property of its child (i.e. a property of a related entity) +
+ Example +Get a list of authors whose have published an article this year + +```graphql +query recentlyActiveAuthors { + author(where: {articles: {publishDate: {_gt: "2024-01-01"}}}) { + id + name + } +} +``` +
+
C3 Nested sorting Sort a parent by a property of its child (i.e. a property of a related entity) +
+ Example +Get a list of articles sorted by the names of their author + +```graphql +query sortedArticles { + article(order_by: {author: {name: asc}}) { + id + title + } +} +``` +
+
C4 Nested pagination Fetch a paginated list of parents, along with a paginated & sorted list of children for each parent +
+ Example +Get the 2nd page of a list of authors and the first page of their articles, sorted by the article's title field + +```graphql +query paginatedAuthorsWithSortedPaginatedArticles { + author(offset: 10, limit: 20) { + id + name + articles(offset: 0, limit: 25, order_by: {title: asc}) { + title + publishDate + } + } +} +``` +
+
C5 Nested aggregation Aggregate a child/parent in the context of its parent/child +
+ Example +Get a list of authors and the number of articles written by each author + +```graphql +query prolificAuthors { + author (limit: 10) { + id + name + articles_aggregate { + count + } + } +} +``` +
+
These composability attributes are what increase the level of self-serve composition and reduce the need for manual API aggregation and composition. ## More reading - [Use cases](/use-cases) +- [Reference API schema](/reference-api-schema) - [FAQ](/faq) diff --git a/assets/ConnectConsumeDiscoverERD.png b/assets/ConnectConsumeDiscoverERD.png new file mode 100644 index 0000000..f6b56d2 Binary files /dev/null and b/assets/ConnectConsumeDiscoverERD.png differ diff --git a/assets/control-plane-lifecycle.png b/assets/control-plane-lifecycle.png new file mode 100644 index 0000000..1c02520 Binary files /dev/null and b/assets/control-plane-lifecycle.png differ diff --git a/assets/data-plane-flow.png b/assets/data-plane-flow.png new file mode 100644 index 0000000..17c8ae9 Binary files /dev/null and b/assets/data-plane-flow.png differ diff --git a/assets/supergraphioERD.png b/assets/supergraphioERD.png new file mode 100644 index 0000000..16c6c22 Binary files /dev/null and b/assets/supergraphioERD.png differ