
Tutorials

Step-by-step guides for common orchestrator workflows.


Tutorial 1: Setting up orchestrator with a MySQL topology

This tutorial walks you through setting up orchestrator to manage an existing MySQL master-replica topology.

What you will need

  • A running MySQL master with one or more replicas (MySQL 5.7 or newer)
  • Go 1.25+ installed
  • Network access from the orchestrator host to all MySQL instances on port 3306

Step 1: Build orchestrator

git clone https://github.com/proxysql/orchestrator.git
cd orchestrator
go build -o bin/orchestrator ./go/cmd/orchestrator

Step 2: Create a MySQL user for orchestrator

On your MySQL master (this will replicate to all replicas automatically):

CREATE USER 'orc_topology'@'orchestrator-host' IDENTIFIED BY 'a_secure_password';
GRANT SUPER, PROCESS, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'orc_topology'@'orchestrator-host';

Replace orchestrator-host with the hostname or IP of the machine running orchestrator. Use % to allow connections from any host.
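Before pointing orchestrator at the topology, it is worth confirming the user and grants work by connecting from the orchestrator host. The hostname and password below are the placeholders from the previous step; SHOW SLAVE HOSTS exercises the REPLICATION SLAVE privilege:

```shell
# From the orchestrator host: confirm the topology user can connect
# and that its grants are in place.
mysql -h your-master-host -P 3306 -u orc_topology -p'a_secure_password' \
  -e "SHOW GRANTS; SHOW SLAVE HOSTS;"
```

If this fails with an access-denied error, fix the grant before continuing; orchestrator discovery will fail in the same way.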

Step 3: Create a MySQL backend database

For production use, orchestrator should store its data in MySQL rather than SQLite. On a MySQL instance (can be the same master, or a separate server):

CREATE DATABASE orchestrator;
CREATE USER 'orc_server'@'localhost' IDENTIFIED BY 'another_secure_password';
GRANT ALL ON orchestrator.* TO 'orc_server'@'localhost';

Step 4: Write the configuration file

Create orchestrator.conf.json:

{
  "Debug": false,
  "ListenAddress": ":3000",
  "MySQLTopologyUser": "orc_topology",
  "MySQLTopologyPassword": "a_secure_password",
  "MySQLOrchestratorHost": "127.0.0.1",
  "MySQLOrchestratorPort": 3306,
  "MySQLOrchestratorDatabase": "orchestrator",
  "MySQLOrchestratorUser": "orc_server",
  "MySQLOrchestratorPassword": "another_secure_password",
  "DefaultInstancePort": 3306,
  "DiscoverByShowSlaveHosts": true,
  "InstancePollSeconds": 5,
  "ReasonableReplicationLagSeconds": 10,
  "RecoverMasterClusterFilters": ["*"],
  "RecoverIntermediateMasterClusterFilters": ["*"],
  "ApplyMySQLPromotionAfterMasterFailover": true,
  "FailureDetectionPeriodBlockMinutes": 60,
  "RecoveryPeriodBlockSeconds": 3600
}
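Before starting orchestrator, validate that the file parses as JSON (a stray trailing comma is the most common mistake):

```shell
# Fail fast on malformed JSON before handing the file to orchestrator.
python3 -m json.tool orchestrator.conf.json > /dev/null \
  && echo "config OK" \
  || echo "config INVALID"
```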

Step 5: Start orchestrator

bin/orchestrator -config orchestrator.conf.json http

Step 6: Discover the topology

curl http://localhost:3000/api/discover/your-master-host/3306

Wait a few seconds for orchestrator to crawl the replicas, then verify:

curl -s http://localhost:3000/api/topology/your-master-host/3306

You should see your full replication tree printed as indented text.

Step 7: Verify in the web UI

Open http://localhost:3000 in your browser. Click on Clusters in the navigation to see your topology visualized as a tree.

Step 8: Test a topology operation

Before moving anything, list the replicas of the master so you know what orchestrator sees:

curl -s http://localhost:3000/api/instance-replicas/your-master-host/3306
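Upstream orchestrator exposes a relocate endpoint (/api/relocate/:host/:port/:belowHost/:belowPort) for repointing a replica under a new parent. Assuming this fork keeps that route, moving one of the replicas listed above under another host looks like:

```shell
# Hypothetical example, assuming the upstream /api/relocate route:
# repoint "your-replica-host" to replicate from "another-host".
curl -s http://localhost:3000/api/relocate/your-replica-host/3306/another-host/3306
```

Re-run the topology call from Step 6 afterwards to confirm the replica moved.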

You now have a fully operational orchestrator instance managing your MySQL topology.


Tutorial 2: Configuring ProxySQL failover hooks

This tutorial sets up orchestrator to automatically update ProxySQL hostgroups during master failover, so your application traffic is rerouted without any custom scripts.

Prerequisites

  • A working orchestrator setup (see Tutorial 1)
  • ProxySQL installed and running with the Admin interface accessible
  • Your MySQL servers already configured as backends in ProxySQL

Step 1: Verify ProxySQL Admin access

mysql -h 127.0.0.1 -P 6032 -u admin -padmin -e "SELECT * FROM runtime_mysql_servers;"

You should see your MySQL servers listed with their hostgroups.

Step 2: Note your hostgroup IDs

Identify which hostgroup ID is used for writers and which for readers:

mysql -h 127.0.0.1 -P 6032 -u admin -padmin \
  -e "SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers;"

For example, if writers are in hostgroup 10 and readers in hostgroup 20, you will use those values below.

Step 3: Add ProxySQL settings to orchestrator config

Add these fields to your orchestrator.conf.json:

{
  "ProxySQLAdminAddress": "127.0.0.1",
  "ProxySQLAdminPort": 6032,
  "ProxySQLAdminUser": "admin",
  "ProxySQLAdminPassword": "admin",
  "ProxySQLWriterHostgroup": 10,
  "ProxySQLReaderHostgroup": 20,
  "ProxySQLPreFailoverAction": "offline_soft"
}
  • ProxySQLWriterHostgroup: The hostgroup ID where the current master lives. Must be greater than 0 to enable the hooks.
  • ProxySQLReaderHostgroup: The hostgroup ID for read replicas. Optional but recommended.
  • ProxySQLPreFailoverAction: What to do with the old master before failover: offline_soft (drain connections), weight_zero, or none.

Step 4: Restart orchestrator

# Stop the running instance (Ctrl+C), then:
bin/orchestrator -config orchestrator.conf.json http

Step 5: Verify ProxySQL connectivity

curl -s http://localhost:3000/api/proxysql/servers | python3 -m json.tool

You should see your ProxySQL server list returned as JSON.

Step 6: Understand the failover flow

When orchestrator detects a dead master and performs recovery:

  1. Pre-failover: The old master is set to OFFLINE_SOFT in ProxySQL (no new connections)
  2. Topology recovery: Orchestrator promotes a replica to be the new master
  3. Post-failover: The new master is added to the writer hostgroup; the old master is removed
  4. ProxySQL applies changes immediately via LOAD MYSQL SERVERS TO RUNTIME

ProxySQL hooks are non-blocking: if ProxySQL is unreachable, the MySQL failover still proceeds.
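The hook behavior above is roughly equivalent to issuing the following statements on the ProxySQL Admin interface by hand. This is a hedged sketch using the example hostgroup 10 and placeholder hostnames; orchestrator's exact statements may differ:

```shell
mysql -h 127.0.0.1 -P 6032 -u admin -padmin <<'SQL'
-- Pre-failover: drain the old master (no new connections).
UPDATE mysql_servers SET status='OFFLINE_SOFT'
  WHERE hostgroup_id=10 AND hostname='old-master';
-- Post-failover: remove the dead master and put the promoted
-- replica in the writer hostgroup.
DELETE FROM mysql_servers WHERE hostgroup_id=10 AND hostname='old-master';
INSERT INTO mysql_servers (hostgroup_id, hostname, port)
  VALUES (10, 'new-master', 3306);
-- Apply immediately and persist to disk.
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
SQL
```

orchestrator performs the equivalent automatically over the Admin interface configured in Step 3.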

Step 7: Test with a graceful takeover

To verify everything works without an actual failure, perform a graceful master takeover:

# Identify the current master
curl -s http://localhost:3000/api/clusters

# Perform a graceful takeover (promotes a replica, demotes the master)
curl -s http://localhost:3000/api/graceful-master-takeover/your-cluster-alias/your-new-master-host/3306

Check ProxySQL to confirm the hostgroups updated:

mysql -h 127.0.0.1 -P 6032 -u admin -padmin \
  -e "SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers;"

For more details, see the full ProxySQL hooks documentation.


Tutorial 3: Monitoring orchestrator with Prometheus

This tutorial sets up Prometheus to scrape orchestrator metrics and shows useful queries for alerting.

Prerequisites

  • A working orchestrator setup (see Tutorial 1)
  • A Prometheus server you can configure

Step 1: Enable Prometheus metrics in orchestrator

Prometheus metrics are enabled by default. To make this explicit, add the following to your orchestrator.conf.json (or confirm the setting is not disabled):

{
  "PrometheusEnabled": true
}

Restart orchestrator if you changed the config.

Step 2: Verify the metrics endpoint

curl -s http://localhost:3000/metrics | head -20

You should see Prometheus-formatted metrics output.

Step 3: Configure Prometheus to scrape orchestrator

Add a scrape job to your prometheus.yml:

scrape_configs:
  - job_name: orchestrator
    static_configs:
      - targets: ['orchestrator-host:3000']
    metrics_path: /metrics
    scrape_interval: 15s

Replace orchestrator-host with the actual hostname or IP. Reload Prometheus:

kill -HUP $(pgrep prometheus)
# or restart the Prometheus service

Step 4: Verify in Prometheus

Open the Prometheus UI (typically http://prometheus-host:9090) and query:

orchestrator_instances_total

You should see the number of MySQL instances orchestrator is managing.

Step 5: Useful queries

Total known instances and clusters:

orchestrator_instances_total
orchestrator_clusters_total

Discovery error rate (over last 5 minutes):

rate(orchestrator_discovery_errors_total[5m])

Recovery operations by type:

sum by (type) (orchestrator_recoveries_total)

Recovery duration (p95 over last hour):

histogram_quantile(0.95, rate(orchestrator_recovery_duration_seconds_bucket[1h]))

Step 6: Set up alerting rules

Create an alerting rule file (e.g., orchestrator-alerts.yml):

groups:
  - name: orchestrator
    rules:
      - alert: OrchestratorHighDiscoveryErrors
        expr: rate(orchestrator_discovery_errors_total[5m]) > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Orchestrator has a high discovery error rate"
          description: "More than 0.1 discovery errors/second for the last 10 minutes."

      - alert: OrchestratorRecoveryOccurred
        expr: increase(orchestrator_recoveries_total[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "Orchestrator performed a recovery"
          description: "A failover or recovery event occurred in the last 5 minutes."

      - alert: OrchestratorDown
        expr: up{job="orchestrator"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Orchestrator is unreachable"

Reference this file in your prometheus.yml:

rule_files:
  - orchestrator-alerts.yml
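Prometheus ships a promtool binary that can validate both files before you reload the server:

```shell
# Validate the alerting rules and the main config before reloading.
promtool check rules orchestrator-alerts.yml
promtool check config prometheus.yml
```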

Step 7: Kubernetes health endpoints

If running orchestrator in Kubernetes, use the built-in health check endpoints for liveness and readiness probes:

livenessProbe:
  httpGet:
    path: /api/status
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /api/status
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5

For the full list of metrics, see the Observability documentation.


Tutorial 4: Using the API v2

This tutorial introduces the v2 REST API, which provides structured JSON responses and proper HTTP status codes.

Prerequisites

  • A running orchestrator instance with at least one discovered topology

Step 1: Understand the response format

All v2 endpoints return a consistent JSON envelope:

{
  "status": "ok",
  "data": { ... }
}

On errors:

{
  "status": "error",
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description"
  }
}

HTTP status codes (200, 400, 404, 500, 503) are used correctly, unlike the v1 API which always returns 200.
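Because the envelope shape is fixed, scripts can branch on the status field instead of parsing free-form text. A small sketch using jq; the response is hard-coded here for illustration, but in practice it would come from a curl call:

```shell
# Sample v2 error envelope; in practice: response=$(curl -s "$url")
response='{"status":"error","error":{"code":"NOT_FOUND","message":"instance not found"}}'

# Branch on the envelope's status field.
if [ "$(printf '%s' "$response" | jq -r '.status')" = "ok" ]; then
  printf '%s\n' "$response" | jq '.data'
else
  echo "request failed: $(printf '%s' "$response" | jq -r '.error.code')"
fi
# prints: request failed: NOT_FOUND
```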

Step 2: List all clusters

curl -s http://localhost:3000/api/v2/clusters | python3 -m json.tool

Example response:

{
  "status": "ok",
  "data": [
    {
      "clusterName": "master.example.com:3306",
      "clusterAlias": "production",
      "instanceCount": 5
    }
  ]
}

Step 3: Get cluster details

curl -s http://localhost:3000/api/v2/clusters/master.example.com:3306 | python3 -m json.tool

Step 4: List instances in a cluster

curl -s http://localhost:3000/api/v2/clusters/master.example.com:3306/instances | python3 -m json.tool

Step 5: Get a specific instance

curl -s http://localhost:3000/api/v2/instances/replica1.example.com/3306 | python3 -m json.tool

Step 6: View the topology

curl -s http://localhost:3000/api/v2/clusters/master.example.com:3306/topology | python3 -m json.tool

Step 7: Check orchestrator health

curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/api/v2/status

A 200 response means the node is healthy. A 500 response means it is not.

Step 8: View recent recoveries

# All recent recoveries
curl -s http://localhost:3000/api/v2/recoveries | python3 -m json.tool

# Filter by cluster
curl -s "http://localhost:3000/api/v2/recoveries?cluster=master.example.com:3306" | python3 -m json.tool

# Active recoveries only
curl -s http://localhost:3000/api/v2/recoveries/active | python3 -m json.tool

Step 9: Query ProxySQL servers via API v2

If ProxySQL hooks are configured:

# All servers
curl -s http://localhost:3000/api/v2/proxysql/servers | python3 -m json.tool

If ProxySQL is not configured, you will receive a 503 status:

{
  "status": "error",
  "error": {
    "code": "PROXYSQL_NOT_CONFIGURED",
    "message": "ProxySQL is not configured"
  }
}

Step 10: Scripting with the v2 API

The structured responses make scripting straightforward. Example: get all instance hostnames in a cluster using jq:

curl -s http://localhost:3000/api/v2/clusters/master.example.com:3306/instances \
  | jq -r '.data[].Key.Hostname'

Check if any recoveries happened in the last hour:

STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000/api/v2/recoveries/active)
if [ "$STATUS" = "200" ]; then
  ACTIVE=$(curl -s http://localhost:3000/api/v2/recoveries/active | jq '.data | length')
  echo "Active recoveries: $ACTIVE"
fi

For the full endpoint reference, see the API v2 documentation. An OpenAPI 3.0 specification is also available for client generation.


Tutorial 5: Setting up orchestrator with PostgreSQL streaming replication

This tutorial walks you through configuring orchestrator to manage a PostgreSQL streaming replication topology. Orchestrator discovers PostgreSQL primaries and standbys, monitors replication health, and can perform automated failover when a primary fails.

What you will need

  • PostgreSQL 12+ primary with one or more streaming replication standbys already configured
  • Go 1.25+ installed (for building from source)
  • Network access from the orchestrator host to all PostgreSQL instances on port 5432

Step 1: Build orchestrator

git clone https://github.com/proxysql/orchestrator.git
cd orchestrator
go build -o bin/orchestrator ./go/cmd/orchestrator

Step 2: Create an orchestrator user on PostgreSQL

On your PostgreSQL primary (with streaming replication, a user created on the primary is automatically replicated to every standby):

CREATE USER orchestrator WITH PASSWORD 'orch_pass';
GRANT pg_monitor TO orchestrator;

The pg_monitor role grants read access to pg_stat_replication, pg_stat_wal_receiver, and other monitoring views that orchestrator needs for discovery.

Note: On PostgreSQL 9.6 or older (not recommended), the pg_monitor role does not exist (it was introduced in PostgreSQL 10); grant SELECT on the individual monitoring views instead.
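To confirm the grant took effect, connect as the new user and read one of those monitoring views (your-primary-host is a placeholder):

```shell
# pg_stat_replication is readable thanks to pg_monitor; expect one
# row per connected standby.
psql "host=your-primary-host user=orchestrator password=orch_pass dbname=postgres" \
  -c "SELECT application_name, state, replay_lsn FROM pg_stat_replication;"
```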

Step 3: Ensure pg_hba.conf allows connections

On each PostgreSQL instance, ensure pg_hba.conf allows connections from the orchestrator host:

# TYPE  DATABASE    USER           ADDRESS               METHOD
host    all         orchestrator   orchestrator-host/32  md5

Replace orchestrator-host/32 with the orchestrator host's IP address in CIDR notation; a bare hostname (without the /32 suffix) also works. If your cluster uses SCRAM authentication (the default from PostgreSQL 14), use scram-sha-256 in place of md5.

Reload PostgreSQL after editing:

psql -c "SELECT pg_reload_conf();"
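From the orchestrator host, you can then confirm the rule works end to end (pg-primary is the same placeholder used in the discovery step below):

```shell
# Should print a single row; a failure here points to a pg_hba.conf
# or network problem, not an orchestrator one.
psql "host=pg-primary port=5432 user=orchestrator password=orch_pass sslmode=require dbname=postgres" \
  -c "SELECT 1;"
```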

Step 4: Configure orchestrator for PostgreSQL

Create orchestrator.conf.json:

{
  "Debug": true,
  "ListenAddress": ":3000",
  "ProviderType": "postgresql",
  "PostgreSQLTopologyUser": "orchestrator",
  "PostgreSQLTopologyPassword": "orch_pass",
  "PostgreSQLSSLMode": "require",
  "BackendDB": "sqlite",
  "SQLite3DataFile": "/tmp/orchestrator.sqlite3",
  "DefaultInstancePort": 5432,
  "InstancePollSeconds": 5,
  "RecoverMasterClusterFilters": ["*"],
  "RecoverIntermediateMasterClusterFilters": ["*"],
  "FailureDetectionPeriodBlockMinutes": 60,
  "RecoveryPeriodBlockSeconds": 3600
}

Key fields explained:

  • ProviderType: Set to "postgresql" to enable PostgreSQL mode. The default is "mysql".
  • PostgreSQLTopologyUser / PostgreSQLTopologyPassword: Credentials orchestrator uses to connect to your PostgreSQL instances.
  • PostgreSQLSSLMode: SSL mode for PostgreSQL connections: disable, require, verify-ca, or verify-full.
  • DefaultInstancePort: Set to 5432 for PostgreSQL (the default is 3306, for MySQL).

Step 5: Start orchestrator

bin/orchestrator -config orchestrator.conf.json http

You should see output indicating the service has started and is listening on port 3000.

Step 6: Discover your PostgreSQL topology

Tell orchestrator about your PostgreSQL primary. Replace pg-primary with the actual hostname or IP:

curl http://localhost:3000/api/discover/pg-primary/5432

Expected output:

{
  "Key": {"Hostname": "pg-primary", "Port": 5432},
  "Uptime": 1,
  "FlavorName": "PostgreSQL",
  "Version": "16.2",
  "ReadOnly": false
}

Orchestrator connects to the primary, queries pg_stat_replication to discover connected standbys, and recursively probes each standby.

Step 7: Verify discovery in the web UI

Open your browser to http://localhost:3000. You should see your PostgreSQL replication topology visualized as a tree:

  • The primary node at the top (read-only: false)
  • Standby nodes underneath (read-only: true)
  • Replication lag displayed for each standby

Step 8: Verify via the API

List discovered clusters:

curl -s http://localhost:3000/api/clusters

View the topology:

curl -s http://localhost:3000/api/topology/pg-primary/5432

Check replication analysis (should show NoProblem if everything is healthy):

curl -s http://localhost:3000/api/replication-analysis

Example healthy output:

[]

An empty array means no problems detected.

Step 9: Test graceful failover

To verify failover works, you can simulate a primary failure by stopping PostgreSQL on the primary:

# On the primary host:
pg_ctl stop -D /var/lib/postgresql/16/main -m fast

Within a few seconds, orchestrator will detect the DeadPrimary condition and, if automated recovery is enabled, will:

  1. Select the best standby for promotion (lowest lag, most up-to-date WAL position)
  2. Call pg_promote() on the selected standby
  3. Reconfigure remaining standbys to replicate from the new primary via ALTER SYSTEM SET primary_conninfo and pg_reload_conf()
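For reference, here is a hedged sketch of the manual equivalent of steps 2 and 3, in case you ever need to run the promotion by hand. The hostnames and the replicator credentials are placeholders, and orchestrator's exact statements may differ:

```shell
# Step 2 equivalent: promote the chosen standby to primary.
psql -h chosen-standby -U postgres -c "SELECT pg_promote();"

# Step 3 equivalent: repoint each remaining standby at the new primary.
psql -h other-standby -U postgres <<'SQL'
ALTER SYSTEM SET primary_conninfo =
  'host=chosen-standby port=5432 user=replicator password=repl_pass';
SELECT pg_reload_conf();
SQL
```

Note that primary_conninfo is reloadable from PostgreSQL 13 onward; on PostgreSQL 12 the repointed standby must be restarted instead.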

Monitor the recovery in the web UI or via the API:

curl -s http://localhost:3000/api/replication-analysis

After recovery completes:

curl -s http://localhost:3000/api/topology/new-primary-host/5432

You should see the new primary at the top with the remaining standbys underneath.

Step 10: Verify the recovered topology

Check that all standbys are replicating from the new primary:

# On the new primary:
psql -c "SELECT client_hostname, state, sent_lsn, replay_lsn FROM pg_stat_replication;"

Check orchestrator's view:

curl -s http://localhost:3000/api/v2/clusters | python3 -m json.tool

Differences from MySQL mode

When running in PostgreSQL mode, be aware of these differences:

  • No intermediate masters. PostgreSQL streaming replication does not support cascading replication in the same way as MySQL. Orchestrator treats all standbys as direct replicas of the primary.
  • No GTID/Pseudo-GTID. PostgreSQL uses WAL (Write-Ahead Log) positions (LSN) instead of GTIDs. Orchestrator maps LSN to its internal binlog coordinate system.
  • Promotion uses pg_promote(). Instead of STOP SLAVE / RESET SLAVE / CHANGE MASTER, orchestrator calls pg_promote() on the standby to make it a primary.
  • Standby reconfiguration uses ALTER SYSTEM. To repoint a standby to a new primary, orchestrator updates primary_conninfo via ALTER SYSTEM and reloads the configuration.
  • ProxySQL integration is not supported in PostgreSQL mode. Use PgBouncer or another PostgreSQL-aware connection pooler.

Next steps