This repository was archived by the owner on Dec 6, 2024. It is now read-only.

Commit 323c9f8

Rewrite linkerd-tcp (#63)
linkerd-tcp 0.1.0 constitutes a major rewrite. Previously, linkerd-tcp did not properly utilize tokio's task model, which led to a number of performance and correctness problems. Furthermore, linkerd-tcp's configuration interface was substantially different from linkerd's, which caused some confusion. Now, linkerd-tcp has been redesigned:

- to better leverage tokio's reactor;
- to support connection and stream timeouts;
- to provide much richer metrics insight;
- to be structured like a linkerd-style router;
- to include general correctness improvements.

Fixes #26 #40 #49 #50

Depends on linkerd/tacho#20
1 parent 8ff694e commit 323c9f8


43 files changed (+3957 / -2710 lines)

.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -1 +1,2 @@
 target
+tmp.discovery
```

Cargo.lock

Lines changed: 220 additions & 197 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 6 additions & 4 deletions
```diff
@@ -1,7 +1,7 @@
 [package]
 name = "linkerd-tcp"
 description = "A native TCP proxy for the linkerd service mesh"
-version = "0.0.2"
+version = "0.1.0"
 authors = [
   "Oliver Gould <ver@buoyant.io>",
   "Steve Jenson <stevej@buoyant.io>",
@@ -19,16 +19,18 @@ bytes = "0.4"
 clap = "2.24"
 futures = "0.1"
 # We use not-yet-released tokio integration on master:
-hyper = { git = "https://github.com/hyperium/hyper", rev = "ca22eae" }
+hyper = { git = "https://github.com/hyperium/hyper", rev = "09fe9e6" }
 log = "0.3"
-rand = "0.3"
+ordermap = "0.2.10"
 pretty_env_logger = "0.1"
+rand = "0.3"
 rustls = "0.8"
 serde = "1.0"
 serde_derive = "1.0"
 serde_json = "1.0"
 serde_yaml = "0.7"
-tacho = "0.3"
+# tacho = { path = "../tacho" }
+tacho = "0.4"
 tokio-core = "0.1"
 tokio-io = "0.1"
 tokio-service = "0.1"
```

README.md

Lines changed: 72 additions & 27 deletions
````diff
@@ -26,7 +26,7 @@ Status: _beta_
 ## Quickstart ##
 
 1. Install [Rust and Cargo][install-rust].
-2. Configure and run [namerd][namerd].
+2. Run [namerd][namerd]. `./namerd.sh` fetches, configures, and runs namerd using a local-fs-backed discovery (in ./tmp.discovery).
 3. From this repository, run: `cargo run -- example.yml`
 
 We :heart: pull requests! See [CONTRIBUTING.md](CONTRIBUTING.md) for info on
@@ -52,34 +52,79 @@ ARGS:
 ### Example configuration ###
 
 ```yaml
-proxies:
+
+# Administrative control endpoints are exposed on a dedicated HTTP server. Endpoints
+# include:
+#   - /metrics -- produces a snapshot of metrics formatted for prometheus.
+#   - /shutdown -- POSTing to this endpoint initiates graceful shutdown.
+#   - /abort -- POSTing to this terminates the process immediately.
+admin:
+  port: 9989
+
+  # By default, the admin server listens only on localhost. We can force it to bind
+  # on all interfaces by overriding the IP.
+  ip: 0.0.0.0
+
+  # Metrics are snapshot at a fixed interval of 10s.
+  metricsIntervalSecs: 10
+
+# A process exposes one or more 'routers'. Routers connect server traffic to
+# load balancers.
+routers:
+
+  # Each router has a 'label' for reporting purposes.
   - label: default
+
     servers:
-      # Listen on two ports, one using a self-signed TLS certificate.
-      - kind: io.l5d.tcp
-        addr: 0.0.0.0:7474
-      - kind: io.l5d.tls
-        addr: 0.0.0.0:7575
-        defaultIdentity:
-          privateKey: private.pem
-          certs:
-            - cert.pem
-            - ../eg-ca/ca/intermediate/certs/ca-chain.cert.pem
-
-    # Lookup /svc/google in namerd.
-    namerd:
-      url: http://127.0.0.1:4180
-      path: /svc/google
-
-    # Require that the downstream connection be TLS'd, with a `subjectAltName` including
-    # the DNS name _www.google.com_ using either our local CA or the host's default
-    # openssl certificate.
+
+      # Each router has one or more 'servers' listening for incoming connections.
+      # By default, routers listen on localhost. You need to specify a port.
+      - port: 7474
+        dstName: /svc/default
+        # You can limit the amount of time that a server will wait to obtain a
+        # connection from the router.
+        connectTimeoutMs: 500
+
+      # By default each server listens on 'localhost' to avoid exposing an open
+      # relay by default. Servers may be configured to listen on a specific local
+      # address or all local addresses (0.0.0.0).
+      - port: 7575
+        ip: 0.0.0.0
+        # Note that each server may route to a different destination through a
+        # single router:
+        dstName: /svc/google
+        # Servers may be configured to perform a TLS handshake.
+        tls:
+          defaultIdentity:
+            privateKey: private.pem
+            certs:
+              - cert.pem
+              - ../eg-ca/ca/intermediate/certs/ca-chain.cert.pem
+
+    # Each router is configured to resolve names.
+    # Currently, only namerd's HTTP interface is supported:
+    interpreter:
+      kind: io.l5d.namerd.http
+      baseUrl: http://localhost:4180
+      namespace: default
+      periodSecs: 20
+
+    # Clients may also be configured to perform a TLS handshake.
     client:
-      tls:
-        dnsName: "www.google.com"
-        trustCerts:
-          - ../eg-ca/ca/intermediate/certs/ca-chain.cert.pem
-          - /usr/local/etc/openssl/cert.pem
+      kind: io.l5d.static
+      # We can also apply linkerd-style per-client configuration:
+      configs:
+        - prefix: /svc/google
+          connectTimeoutMs: 400
+          # Require that the downstream connection be TLS'd, with a
+          # `subjectAltName` including the DNS name _www.google.com_
+          # using either our local CA or the host's default openssl
+          # certificate.
+          tls:
+            dnsName: "www.google.com"
+            trustCerts:
+              - ../eg-ca/ca/intermediate/certs/ca-chain.cert.pem
+              - /usr/local/etc/openssl/cert.pem
 ```
 
 ### Logging ###
@@ -89,7 +134,7 @@ debugging, set `RUST_LOG=trace`.
 
 ## Docker ##
 
-To build the linkerd/linkerd-tcp docker image, run:
+To build the linkerd/linkerd-tcp docker image, run:
 
 ```bash
 ./dockerize latest
````

example.yml

Lines changed: 11 additions & 23 deletions
```diff
@@ -1,30 +1,18 @@
 admin:
-  addr: 0.0.0.0:9989
+  port: 9989
   metricsIntervalSecs: 10
 
-proxies:
+routers:
 
   - label: default
     servers:
-      - kind: io.l5d.tcp
-        addr: 0.0.0.0:7474
-      # - kind: io.l5d.tls
-      #   addr: 0.0.0.0:7575
-      #   identities:
-      #     localhost:
-      #       privateKey: ../eg-ca/localhost.tls/private.pem
-      #       certs:
-      #         - ../eg-ca/localhost.tls/cert.pem
-      #         - ../eg-ca/localhost.tls/ca-chain.cert.pem
+      - port: 7474
+        dstName: /svc/default
+        connectTimeoutMs: 500
+        connectionLifetimeSecs: 60
 
-    namerd:
-      url: http://127.0.0.1:4180
-      path: /svc/default
-      intervalSecs: 5
-
-    # client:
-    #   tls:
-    #     dnsName: "www.google.com"
-    #     trustCerts:
-    #       - ../eg-ca/www.google.com.tls/ca-chain.cert.pem
-    #       #- /usr/local/etc/openssl/cert.pem
+    interpreter:
+      kind: io.l5d.namerd.http
+      baseUrl: http://localhost:4180
+      namespace: default
+      periodSecs: 20
```

namerd.sh

Lines changed: 51 additions & 0 deletions
```sh
#!/bin/sh

set -e

version="1.0.2"
bin="target/namerd-${version}-exec"
sha="338428a49cbe5f395c01a62e06b23fa492a7a9f89a510ae227b46c915b07569e"
url="https://github.com/linkerd/linkerd/releases/download/${version}/namerd-${version}-exec"

validbin() {
    checksum=$(openssl dgst -sha256 $bin | awk '{ print $2 }')
    [ "$checksum" = $sha ]
}

if [ -f "$bin" ] && ! validbin ; then
    echo "bad $bin" >&2
    mv "$bin" "${bin}.bad"
fi

if [ ! -f "$bin" ]; then
    echo "downloading $bin" >&2
    curl -L --silent --fail -o "$bin" "$url"
    chmod 755 "$bin"
fi

if ! validbin ; then
    echo "bad $bin. delete $bin and run $0 again." >&2
    exit 1
fi

mkdir -p ./tmp.discovery
if [ ! -f ./tmp.discovery/default ]; then
    echo "127.1 9991" > ./tmp.discovery/default
fi

"$bin" -- - <<EOF
admin:
  port: 9991

namers:
  - kind: io.l5d.fs
    rootDir: ./tmp.discovery

storage:
  kind: io.l5d.inMemory
  namespaces:
    default: /svc => /#/io.l5d.fs;

interfaces:
  - kind: io.l5d.httpController
EOF
```

router.md

Lines changed: 107 additions & 0 deletions
````markdown
# Rust Stream Balancer Design

## Prototype

The initial implementation is basically a prototype. It proves the concept, but it has
severe deficiencies that cause performance (and probably correctness) problems.
Specifically, it implements its own polling... poorly.

At startup, the configuration is parsed. For each **proxy**, the namerd and serving
configurations are split and connected by an async channel so that namerd updates are
processed outside of the serving thread. All of the namerd watchers are collected to be
run together with the admin server. Once all of the proxy configurations are processed,
the application is run.

The admin thread is started, initiating all namerd polling and starting the admin server.

Simultaneously, all of the proxies are run in the main thread. For each of these, a
**connector** is created to determine how all downstream connections are established for
the proxy. A **balancer** is created with the connector and a stream of namerd updates. An
**acceptor** is created for each listening interface, which manifests as a stream of
incoming connections. The balancer is made shareable across servers by creating an
async channel, and each server's connections are streamed into a sink clone. The balancer
is driven to process all of these connections.

The balancer implements a `Sink` that manages _all_ I/O and connection management. Each
time `Balancer::start_send` or `Balancer::poll_complete` is called, the following work is
done:

- _all_ connection streams are checked for I/O and data is transferred;
- closed connections are reaped;
- service discovery is checked for updates;
- new connections are established;
- stats are recorded.

## Lessons/Problems

### Inflexible

This model doesn't really reflect that of linkerd. We have no mechanism to _route_
connections. All connections are simply forwarded. We cannot, for instance, route based on
client credentials or SNI destination.

### Inefficient

Currently, each balancer is effectively a scheduler, and a pretty poor one at that. I/O
processing should be far more granular, and we shouldn't update load balancer endpoints in
the I/O path (unless absolutely necessary).

### Timeouts

We need several types of timeouts that are not currently implemented:

- Connection timeout: time from incoming connection to outbound established.
- Stream lifetime: maximum time a stream may stay open.
- Idle timeout: maximum time a connection may stay open without transmitting data.

## Proposal

linkerd-tcp should become a _stream router_. In the same way that linkerd routes requests,
linkerd-tcp should route connections. The following is a rough, evolving sketch of how
linkerd-tcp should be refactored to accommodate this:

The linkerd-tcp configuration should support one or more **routers**. Each router is
configured with one or more **servers**. A server, which may or may not terminate TLS,
produces a stream of incoming connections, each comprising an envelope--a source identity
(an address, but maybe more) and a destination name--and a bidirectional data stream. The
server may choose the destination by static configuration or as some function of the
connection (e.g. client credentials, SNI, etc.). Each connection envelope may be annotated
with a standard set of metadata including, for example, an optional connect deadline,
stream deadline, etc.

The streams of all incoming connections for a router are merged into a single stream of
enveloped connections. This stream is forwarded to a **binder**. A binder is responsible
for maintaining a cache of balancers by destination name. When a balancer does not exist
in the cache, a new namerd lookup is initiated and its result stream (and value) is cached
so that future connections may resolve quickly. The binder obtains a **balancer** for each
destination name that maintains a list of endpoints and their load (in terms of
connections, throughput, etc.).

If the inbound connection has not expired (i.e. due to a timeout), it is dispatched to the
balancer for processing. The balancer maintains a reactor handle and initiates I/O and
balancer state management on the reactor.

```
 ------         ------
| srv0 |  ...  | srvN |
 ------         ------
    |
    | (Envelope, IoStream)
    V
 -------------------      -------------
|       binder      |----| interpreter |
 -------------------      -------------
    |
    V
 ----------
| balancer |
 ----------
    |
    V
 ----------
| endpoint |
 ----------
    |
    V
 --------
| duplex |
 --------
```
````
