diff --git a/docs/health.md b/docs/health.md
new file mode 100644
index 000000000..1ea37c4de
--- /dev/null
+++ b/docs/health.md
@@ -0,0 +1,30 @@
+
+# Instance tracking for health checks
+
+To facilitate health checks to named workers through Hyperbahn, workers may
+advertise both the service name (e.g., `tcollector`) and an instance name (e.g.,
+`tcollector01-1-dc1`).
+The advertisement is broadcast to every affine Hyperbahn relay for that
+service.
+
+With partial affinity, each relay is responsible for a proportional subset of
+the advertising service workers.
+Relays determine the relay responsible for a worker by sorting the known relay
+and worker lists and projecting an index from one to the other.
+
+Every affine relay tracks an additional mapping from worker host:port to the
+last known worker identifier.
+
+Requests may have an additional `in` (instance name) transport header.
+Egress relays respect the `in` header by forwarding to a relay responsible for
+maintaining an open connection.
+When a relay receives a request:
+
+1. If the relay is not an exit node for `sn` (service name), forwards to one
+that is.
+2. Looks up the host:port for the instance name (instance address).
+3. Using the known relays and known workers, discerns the set of host:port for
+relays responsible for connections to the instance address (instance relays).
+4. If the relay is not among instance relays, forwards to one that is.
+5. Forwards the request to the instance address.
+
diff --git a/docs/protocol.md b/docs/protocol.md
index c0f2dfe4c..66a109478 100644
--- a/docs/protocol.md
+++ b/docs/protocol.md
@@ -624,7 +624,8 @@ valid or not in a call req or res. Following sections will elaborate on details
 | `re` | Y | N | Retry Flags
 | `se` | Y | N | Speculative Execution
 | `fd` | Y | Y | Failure Domain
-| `sk` | Y | N | Shard key
+| `sk` | Y | N | Shard Key
+| `in` | Y | N | Instance Name
 
 ### Transport Header `as` -- Arg Scheme
 
@@ -718,6 +719,16 @@ For example you may want to keep some in memory state, i.e. cache, aggregation.
 You can use read the `sk` and forward the call request to a specific process
 that has ownership for the shard key.
 
+### Transport Header `in` -- Instance Name
+
+An instance name is a unique identifier for an instance that may be more
+consistent than a host:port address, particularly if the instance delegates to
+the operating system to choose a port.
+Instances may advertise their instance name in addition to their service name.
+TChannel relays may respect the instance name header by forwarding to the exact
+host:port of the instance instead of forwarding to any instance with the
+request service name.
+
 ### A note on `host:port` header values
 
 While these `host:port` fields are indeed strings, the intention is to provide