101 changes: 101 additions & 0 deletions docs/best-practices/knowledge-hub/architecture.md
@@ -0,0 +1,101 @@
---
id: architecture
title: Temporal Architecture
sidebar_label: Architecture
description: Enterprise Temporal architecture covering Namespace conventions, Worker deployment patterns, network connectivity, and disaster recovery procedures.
toc_max_heading_level: 3
keywords:
- temporal architecture
- temporal namespace
- temporal connectivity
- temporal worker deployment
tags:
- Best Practices
- Knowledge Hub
---

:::info
This page is part of the [Temporal Knowledge Hub](./index.md).
:::

:::note
Customize this section to describe the architectural decisions and guardrails that shape how your developers build with Temporal.
:::

This document defines our enterprise Temporal architecture, covering Namespace conventions, Worker deployment patterns, network connectivity, and disaster recovery procedures.

## Temporal Cloud

At ABC Financial, we use Temporal Cloud, a fully managed Temporal service. It lets us run our Temporal Applications without managing the underlying infrastructure.

Our Workers and Temporal Applications connect to the Temporal Cloud service, which takes care of the persistence layer, scalability, and availability for us.

## Namespace

A Temporal Cloud [Namespace](https://docs.temporal.io/namespaces) is a unit of isolation within the Temporal platform. It ensures that Workflow executions, Task Queues, and resources are logically separated.

:::note
Define a Namespace naming convention based on the Temporal [Namespace Best Practices](../managing-namespace.mdx).
:::

At ABC Financial, we adhere to the following standards for our Temporal Cloud Namespaces:

1. The naming convention is `<business-unit>-<domain>-<environment>`.
    1. Use at most 10 characters for the business unit (e.g. `consumer`, `commercial`, `investment`).
    2. Use at most 10 characters for the domain (e.g. `payment`, `mortgage`).
    3. Use one of the supported environments: `dev`, `stg`, `prd`.
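
As a minimal sketch, the convention above could be checked before a provisioning ticket is filed. The function name `validate_namespace_name` is hypothetical, and the restriction to lowercase letters and digits is an assumption; the standard above only fixes the length limits and the environment suffixes.

```python
import re

# Assumption: business unit and domain use lowercase letters and digits only,
# because hyphens separate the three parts of the name.
_NAMESPACE_PATTERN = re.compile(
    r"(?P<business_unit>[a-z0-9]{1,10})"
    r"-(?P<domain>[a-z0-9]{1,10})"
    r"-(?P<environment>dev|stg|prd)"
)


def validate_namespace_name(name: str) -> bool:
    """Return True if `name` follows <business-unit>-<domain>-<environment>."""
    return _NAMESPACE_PATTERN.fullmatch(name) is not None
```

For example, `validate_namespace_name("consumer-payment-dev")` passes, while a name ending in an unsupported environment such as `-qa` is rejected.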

:::note
Link to your internal Namespace provisioning process so developers can self-serve.
:::

File an internal service ticket to request a new Temporal Cloud Namespace.

:::note
List the default features and guardrails applied to new Namespaces by environment.
:::

Based on the environment (i.e. `dev`, `stg`, `prd`), the following features are configured by our automation:

| Feature | Development | Staging | Production |
| :---- | ----- | ----- | ----- |
| [Deletion Protection](https://docs.temporal.io/cloud/namespaces#delete-protection) | ✅ | ✅ | ✅ |
| [Private Connectivity](https://docs.temporal.io/cloud/connectivity) | ✅ | ✅ | ✅ |
| [Custom Encryption](https://docs.temporal.io/default-custom-data-converters) | ✅ | ✅ | ✅ |
| [Codec Server](https://docs.temporal.io/codec-server) | ✅ | ✅ | ✅ |
| [API Key](https://docs.temporal.io/cloud/api-keys) | ✅ | ✅ | ✅ |
| [API Key Rotation](https://docs.temporal.io/cloud/api-keys#rotate-an-api-key) | ✅ | ✅ | ✅ |
| [Observability](https://docs.temporal.io/evaluate/development-production-features/observability) | ✅ | ✅ | ✅ |
| [Audit Logs](https://docs.temporal.io/cloud/audit-logs) | ✅ | ✅ | ✅ |
| [Workflow History Export](https://docs.temporal.io/cloud/export) | ❌ | ❌ | ✅ |
| [Multi-Region Replication](https://docs.temporal.io/cloud/high-availability#multi-region-replication) | ❌ | ❌ | ✅ |
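
The table above can also be expressed as data, which may be convenient for automation or for checking a Namespace against its expected configuration. This is an illustrative sketch only: `features_for` and the constant names are hypothetical and do not describe the actual provisioning automation.

```python
# Features enabled in every environment, per the table above.
BASELINE_FEATURES = (
    "Deletion Protection",
    "Private Connectivity",
    "Custom Encryption",
    "Codec Server",
    "API Key",
    "API Key Rotation",
    "Observability",
    "Audit Logs",
)

# Features enabled only in production, per the table above.
PRODUCTION_ONLY_FEATURES = (
    "Workflow History Export",
    "Multi-Region Replication",
)


def features_for(environment: str) -> list[str]:
    """Return the features expected on a Namespace in the given environment."""
    if environment not in ("dev", "stg", "prd"):
        raise ValueError(f"unknown environment: {environment!r}")
    features = list(BASELINE_FEATURES)
    if environment == "prd":
        features.extend(PRODUCTION_ONLY_FEATURES)
    return features
```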

## Connectivity

:::note
Describe your network connectivity requirements so developers understand how Workers connect to Temporal Cloud.
:::

At ABC Financial, private connectivity is required for all Temporal Cloud Namespaces for compliance reasons. [Private connectivity](https://docs.temporal.io/cloud/connectivity) keeps traffic between our Workers and Temporal Cloud off the public internet.

For reference, see the official Temporal documentation on AWS and GCP private connectivity:

* [AWS PrivateLink Connectivity | Temporal Platform Documentation](https://docs.temporal.io/cloud/connectivity/aws-connectivity)
* [Google Private Service Connect Connectivity | Temporal Platform Documentation](https://docs.temporal.io/cloud/connectivity/gcp-connectivity)

## Worker

:::note
Document your Worker deployment standards so developers know where and how to deploy.
:::

At ABC Financial, Temporal Workers are deployed as containerized applications on Kubernetes clusters across AWS EKS and GCP GKE.

All Worker deployments are managed through [Helm](https://helm.sh/) charts, ensuring:

* Standardized deployment configurations across clouds
* Version-controlled infrastructure as code
* Simplified rollbacks and updates
* Environment-specific value overrides

[KEDA](https://keda.sh/docs/2.18/scalers/) is configured to auto-scale Workers based on Temporal Task Queue backlog.
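
To build intuition for backlog-based scaling, the sketch below shows a simplified model: the desired replica count is the backlog divided by a per-Worker target, rounded up and clamped to a range. This is not KEDA's implementation or configuration; the function name, parameters, and default limits are all hypothetical.

```python
import math


def desired_worker_replicas(
    backlog: int,
    target_backlog_per_worker: int,
    min_replicas: int = 1,
    max_replicas: int = 20,
) -> int:
    """Estimate how many Worker replicas a Task Queue backlog calls for.

    Simplified model: ceil(backlog / target), clamped to [min, max].
    """
    if backlog < 0:
        raise ValueError("backlog must be non-negative")
    if target_backlog_per_worker <= 0:
        raise ValueError("target_backlog_per_worker must be positive")
    wanted = math.ceil(backlog / target_backlog_per_worker)
    return max(min_replicas, min(max_replicas, wanted))
```

With a target of 100 tasks per Worker, a backlog of 250 tasks yields 3 replicas, and very large backlogs are capped at `max_replicas`.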
70 changes: 70 additions & 0 deletions docs/best-practices/knowledge-hub/cost.md
@@ -0,0 +1,70 @@
---
id: cost
title: Temporal Cloud Cost
sidebar_label: Cost
description: Understanding Temporal Cloud's consumption-based pricing model and tips for building cost-effective Workflows.
toc_max_heading_level: 3
keywords:
- temporal cloud cost
- temporal pricing
- temporal actions
- temporal storage
tags:
- Best Practices
- Knowledge Hub
---

:::info
This page is part of the [Temporal Knowledge Hub](./index.md).
:::

:::note
Add cost-saving tips to help developers optimize Temporal Cloud spending.
:::

As we scale our usage of Temporal Cloud, understanding the cost model is critical for designing cost-efficient Workflows. Temporal Cloud pricing is consumption-based and is driven by two components: Actions and Storage.

Our Enterprise contract covers base fees and support, but the usage of your specific Namespaces drives the variable costs.

## Action

Actions are the primary unit of consumption-based pricing for Temporal Cloud. They track billable operations within the Temporal Cloud Service.

### What counts as an Action?

* **Workflow Start**: Starting a Workflow execution.
* **Activity Start and Retry**: Starting and retrying an Activity.
* **Signals**: Sending a signal to a Workflow.
* **Timers**: A Timer firing.
* **Child Workflows**: Starting a Child Workflow.
* **Search Attribute upserts**: Each invocation of the `UpsertSearchAttributes` command.

For a complete list of billable Actions, see [Temporal Cloud Actions](https://docs.temporal.io/cloud/actions).

### Cost-saving tip #1: Configure exponential backoff for Activity Retry

Ensure your Activity Retry Policy uses a `BackoffCoefficient` > 1.0 (e.g. 2.0) and a reasonable `MaximumInterval`.

**Why**: Each retry attempt counts as a billable Action. Aggressive, constant-interval retries during a downstream outage sharply increase Action usage and cost without making any progress in the Workflow.
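
To make the difference concrete, the sketch below counts how many retries would be scheduled during an outage under a given policy. It is a simplified model that ignores Activity execution time; the parameters mirror the Retry Policy fields `InitialInterval`, `BackoffCoefficient`, and `MaximumInterval`, the function name is hypothetical, and the values are chosen for illustration.

```python
def retry_attempts_within(
    outage_seconds: float,
    initial_interval: float = 1.0,
    backoff_coefficient: float = 2.0,
    maximum_interval: float = 100.0,
) -> int:
    """Count the retries scheduled before an outage of the given length ends."""
    if initial_interval <= 0 or backoff_coefficient < 1.0:
        raise ValueError("interval must be positive and coefficient >= 1.0")
    elapsed = 0.0
    interval = initial_interval
    retries = 0
    while elapsed + interval <= outage_seconds:
        elapsed += interval
        retries += 1
        # Grow the wait between attempts, capped at the maximum interval.
        interval = min(interval * backoff_coefficient, maximum_interval)
    return retries
```

In this model, a one-hour outage with a constant 1-second interval (`backoff_coefficient=1.0`) produces 3,600 retries, while a coefficient of 2.0 capped at 100 seconds produces 41.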

## Storage

Storage is charged based on Gigabyte-Hours (GB-h). There are two tiers:

1. **Active Storage (higher cost)**:
    * This is the storage used by `Open` Workflows.
    * It is 40x more expensive than Retained Storage.
2. **Retained Storage (lower cost)**:
    * This is the Event History of `Closed` Workflows.
    * We pay for this to keep the history available for debugging (based on the Namespace Retention policy).

### Cost-saving tip #2: Use Continue-As-New for long-running Workflows

Trigger `ContinueAsNew` periodically (e.g. every ~4,000 events or daily) for long-running or indefinite Workflows.

**Why**: This closes the current run, moving its Event History from Active Storage (expensive) to Retained Storage (cheap). Because Active Storage costs 40x more than Retained Storage, this reduces the storage cost of that history data by roughly 97%.
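
The figure follows from the 40x ratio between the two tiers: 1 - 1/40 = 0.975. The sketch below works through the arithmetic in relative units (one unit is the cost of one Retained GB-h), so it does not depend on actual prices. The function names are hypothetical.

```python
# Ratio stated in the Storage section: Active costs 40x Retained per GB-h.
ACTIVE_TO_RETAINED_COST_RATIO = 40.0


def relative_storage_cost(
    gigabytes: float,
    active_hours: float,
    retained_hours: float,
    ratio: float = ACTIVE_TO_RETAINED_COST_RATIO,
) -> float:
    """Storage cost expressed in units of one Retained GB-h."""
    return gigabytes * (active_hours * ratio + retained_hours)


def savings_fraction(ratio: float = ACTIVE_TO_RETAINED_COST_RATIO) -> float:
    """Fraction saved per GB-h when history moves from Active to Retained."""
    return 1.0 - 1.0 / ratio
```

For example, keeping 1 GB of history in Active Storage for 24 hours costs 960 units, while the same history in Retained Storage costs 24 units.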

## What's next

* [Temporal Cloud pricing](https://docs.temporal.io/cloud/pricing)
* [Temporal Cloud Actions](https://docs.temporal.io/cloud/actions)
138 changes: 138 additions & 0 deletions docs/best-practices/knowledge-hub/decision-framework.md
@@ -0,0 +1,138 @@
---
id: decision-framework
title: Temporal Decision Framework
sidebar_label: Decision Framework
description: A guide to help you determine whether Temporal is the right solution for your use case.
toc_max_heading_level: 3
keywords:
- temporal decision framework
- when to use temporal
- temporal use cases
- temporal alternatives
tags:
- Best Practices
- Knowledge Hub
---

:::info
This page is part of the [Temporal Knowledge Hub](./index.md).
:::

This guide helps you quickly determine whether Temporal is the right solution for your use case.

## Temporal decision framework

:::note
Tailor these questions to match your organization's technical landscape.
:::

To decide whether Temporal is a suitable solution for your use case, ask yourself three questions:

1. **Does your digital process have multiple steps that can fail independently?**
2. **Do you need the process to survive failures?**
3. **Does your process span multiple services, APIs, or long time periods (i.e. &gt;10 seconds)?**

If you answered "**yes**" to 2 or more questions, Temporal is likely a good fit. Continue reading.

If you answered "**no**" to all three questions, consider alternatives first. Skip to [Bad use cases for Temporal](#bad-use-cases-for-temporal) to explore alternative solutions.
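
The scoring rule above can be sketched as a small function. The name `temporal_fit` and the return strings are hypothetical. The guide does not say what to do with exactly one "yes", so the sketch treats that case as borderline, which is an assumption.

```python
def temporal_fit(
    steps_fail_independently: bool,
    must_survive_failures: bool,
    spans_services_or_long_periods: bool,
) -> str:
    """Apply the three-question decision framework."""
    yes_count = sum(
        [steps_fail_independently, must_survive_failures, spans_services_or_long_periods]
    )
    if yes_count >= 2:
        return "likely a good fit"
    if yes_count == 0:
        return "consider alternatives first"
    # Assumption: exactly one "yes" is not covered by the guide.
    return "borderline, discuss with the platform team"
```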

## Temporal benefits

:::note
Highlight benefits that address your developers' pain points.
:::

1. **Durable Execution** - your code will always complete.
    * Automatic retry, recovery from infrastructure failures, durable state persistence, and exactly-once execution semantics, all without custom code.
2. **Developer velocity** - ship faster with less code to maintain.
    * Write business logic in familiar languages, collaborate with developers across language barriers, eliminate boilerplate infrastructure code, and leverage built-in testing for rapid iteration.
3. **Audit trail** - complete visibility into your digital process.
    * Immutable execution history, self-documenting Workflow execution, and operational transparency.
4. **Priority and Fairness** - enterprise-grade multi-tenancy.
    * Priority-based execution and fair distribution of Workflow Executions across your customer base or tenants.
5. **Workflow fabric** - break down development silos.
    * Cross-team Workflow orchestration with reusable operations, cross-Namespace coordination, and a service registry for discoverability.

## Good use cases for Temporal

:::note
Replace with use cases from your domain. See [Customer Stories](https://temporal.io/in-use) for inspiration.
:::

### Business transactions

1. **Payment processing**
    * **Why Temporal is perfect**: Multi-party coordination with compensation logic, audit requirements, idempotency guarantees, timeout handling for authorizations that expire, and scalability to support billions of transactions per day.
2. **Order management**
    * **Why Temporal is perfect**: Long-running state machines spanning hours to days with complex state transitions, human intervention, parallel operations, different order priorities, variable timing per order, and support for millions of orders per hour.
3. **Mortgage underwriting**
    * **Why Temporal is perfect**: Weeks-long processes with complex decision trees, multiple external integrations, human approvals, strict compliance requirements, and durable state persistence.

### Customer experience

1. **Marketing campaign**
    * **Why Temporal is perfect**: Multi-channel orchestration with time-based sequencing and long campaign durations with dynamic personalization.
2. **Customer onboarding**
    * **Why Temporal is perfect**: Long-running, multi-step processes that sometimes need a human in the loop, as onboarding often does.

### Data engineering

1. **Document processing**
    * **Why Temporal is perfect**: Multi-stage pipelines with variable processing times, external service dependencies, rate limit requirements, and coordinated large-scale processing.
2. **Data pipeline**
    * **Why Temporal is perfect**: Data orchestration with complex dependencies, incremental processing, backfill coordination, cross-system dependencies, SLA monitoring, and idempotent execution.
3. **Video processing**
    * **Why Temporal is perfect**: Long-running compute, resource-intensive GPU activities, complex pipelines with parallel variant generation, failure isolation, and cost-optimized scheduling.

### AI/ML

1. **ML inference**
    * **Why Temporal is perfect**: Multi-model orchestration with fallback logic, batch and real-time handling, feature engineering, and comprehensive audit trail.
2. **RAG**
    * **Why Temporal is perfect**: Multi-step retrieval with hybrid search, context assembly from multiple sources, LLM orchestration with retries and fallbacks, and evaluation pipeline tracking.
3. **AI agents**
    * **Why Temporal is perfect**: Long-running autonomous execution with tool orchestration, planning and replanning, human-in-the-loop controls, durable memory management, and safety guardrails.

### Operational

1. **Infrastructure management**
    * **Why Temporal is perfect**: Multi-step provisioning with automatic rollback on failure, idempotent cloud operations, change management, and complete auditability.
2. **CI/CD**
    * **Why Temporal is perfect**: Complex pipeline stages with environment promotion gates, parallel test execution, conditional deployment strategies, automatic rollback monitoring, and approval gates.

## Bad use cases for Temporal

:::note
Add anti-patterns specific to your organization's domain and technology stack.
:::

1. **Simple request-response APIs**
    * No failure recovery needed
    * Better alternative: REST / gRPC server
2. **Real-time stream processing**
    * High throughput (&gt;1M events/sec)
    * Ultra-low latency requirements (&lt;100ms)
    * No durable state needed
    * Better alternative: Flink, Amazon Kinesis, Google Cloud Dataflow
3. **Database triggers & stored procedures**
    * Logic tightly coupled to database
    * Needs transactional guarantees within single DB
    * No external service calls
    * Better alternative: database native features
4. **Pure compute workloads**
    * CPU/GPU intensive calculations
    * No I/O or service calls
    * No state management needed
    * Better alternative: AWS Lambda, Spark, Ray

## Next steps

:::note
Add relevant links (e.g. your support channel) for your developers to explore next.
:::

To learn more:

* [Run your first Temporal Workflow in under 30 minutes](./getting-started.md)
* Schedule a discovery session with the Temporal platform team to validate your use case
* [See how other teams are using Temporal today](./temporal-overview.md#temporal-use-cases-at-abc-financial)
26 changes: 26 additions & 0 deletions docs/best-practices/knowledge-hub/faqs.md
@@ -0,0 +1,26 @@
---
id: faqs
title: Frequently Asked Questions
sidebar_label: FAQs
description: Common questions and answers about using Temporal at your organization.
toc_max_heading_level: 3
keywords:
- temporal faqs
- temporal questions
- temporal help
tags:
- Best Practices
- Knowledge Hub
---

:::info
This page is part of the [Temporal Knowledge Hub](./index.md).
:::

:::note
Add and remove frequently asked questions from your engineering teams.
:::

## When should I use Temporal?

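Temporal is a good fit for multi-step processes that must survive failures or that span multiple services, APIs, or long time periods. Use the [Temporal Decision Framework](./decision-framework.md) to decide whether it fits your use case.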