Pulumi saves a snapshot of the current state of your cloud infrastructure at every deployment, and also at every step of the deployment. This means that Pulumi always has a current view of the state, even if there are crashes during an operation. However, this comes with a performance penalty, especially for large stacks. Today we're introducing an improvement that can speed up deployments by up to 10x. Read on for benchmarks and some technical details of the implementation.

<!--more-->

## Benchmarks

Before getting into the more technical details, here are a number of benchmarks demonstrating what this new experience looks like. To run the benchmarks we picked a couple of Pulumi projects: one that can be set up in a massively parallel fashion, which is the worst-case scenario for the old snapshot system, and another that looks a little more like a real-world example. Note that all of these benchmarks were conducted in Europe connecting to Pulumi Cloud, which runs in `us-west-2`, so exact numbers may vary based on your location and internet connection. They should, however, give a good indication of the performance improvements.

We're benchmarking two somewhat large stacks, both of which are or were used at Pulumi. The first program sets up a website using AWS bucket objects. We're using the [example-ts-static-website](https://github.com/pulumi/examples/tree/master/aws-ts-static-website) example here, but expanded it a little to set up a version of our docs site. This means we're setting up more than 3000 bucket objects, with 3222 resources in total.

The benchmarks were timed with the shell's `time` built-in, taking the best of three runs. Network traffic was captured with `tcpdump`, limiting the capture to the IP addresses of Pulumi Cloud, and `tshark` was then used to process the packet captures and count the bytes sent.
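
As a rough sketch, the measurement pipeline looked something like the following. The capture filter and file names here are placeholders; the Pulumi Cloud IP addresses have to be resolved for your own environment:

```bash
# Capture traffic to the backend while timing the deployment.
# 203.0.113.10 stands in for a resolved Pulumi Cloud address.
sudo tcpdump -i any host 203.0.113.10 -w capture.pcap &
time pulumi up --yes       # repeated three times, best run kept
sudo kill %1               # stop the capture

# Summarize the capture and count the bytes on the wire.
tshark -r capture.pcap -q -z io,stat,0
```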

All the benchmarks are run with journaling off (the default experience), with journaling on (the new experience), and finally with `PULUMI_SKIP_CHECKPOINTS=true` set. The last one skips uploading intermediate checkpoints to the backend entirely, which in turn means potentially losing track of changes that are in flight if Pulumi exits unexpectedly for any reason.
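
For reference, that last configuration is just an environment variable on an otherwise normal deployment (how the journaling behavior itself is toggled is not shown here):

```bash
# Skip the intermediate checkpoint uploads entirely; this trades away
# data integrity guarantees for speed.
PULUMI_SKIP_CHECKPOINTS=true pulumi up --yes
```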

|                    | Time   | Bytes sent |
|--------------------|--------|------------|

Pulumi keeps track of all resources in a stack in a snapshot. This snapshot is stored in the backend.

To make sure there are never any resources that are not tracked, even if a deployment is aborted unexpectedly (for example due to network issues, power outages, or bugs), Pulumi creates a new snapshot at the beginning and at the end of each operation.

At the beginning of the operation, Pulumi adds a new "pending operation" to the snapshot. Pending operations declare the intent to mutate a resource. If a pending operation is left in the snapshot (in other words, the operation started, but Pulumi couldn't record the end of it), in the next operation Pulumi asks the user to check the actual state of the resource, and then either removes it from the snapshot or imports it, depending on the user's input.

This is because the resource may have been set up correctly, or its creation may have failed. If Pulumi was aborted midway through the operation, it's impossible to know which it is.

Once an operation finishes, the pending operation is removed, as we now know the final state of the resource, and that final state is recorded in the snapshot.
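
To make the protocol concrete, here is a minimal sketch of the pending-operation lifecycle. The types and function names are illustrative stand-ins, not Pulumi's actual engine API:

```go
package enginesketch

// Resource is a stand-in for a resource's recorded state.
type Resource struct {
	URN   string
	State map[string]any
}

// PendingOperation declares the intent to mutate a resource.
type PendingOperation struct {
	URN  string // the resource this operation targets
	Kind string // "create", "update", "delete", ...
}

// Snapshot is a stand-in for the state Pulumi stores in the backend.
type Snapshot struct {
	Resources  map[string]Resource
	PendingOps []PendingOperation
}

// persist stands in for "serialize the snapshot, upload it, and block
// until the backend confirms it is stored".
func persist(s *Snapshot) {}

func runStep(snap *Snapshot, op PendingOperation, do func() (Resource, error)) error {
	// 1. Declare intent. If we crash after this write, the next operation
	//    sees the dangling pending op and asks the user to resolve it.
	snap.PendingOps = append(snap.PendingOps, op)
	persist(snap)

	// 2. Perform the actual cloud mutation.
	res, err := do()
	if err != nil {
		return err
	}

	// 3. Record the outcome: drop the pending op and store the final state.
	kept := snap.PendingOps[:0]
	for _, p := range snap.PendingOps {
		if p.URN != op.URN {
			kept = append(kept, p)
		}
	}
	snap.PendingOps = kept
	snap.Resources[res.URN] = res
	persist(snap)
	return nil
}
```

The two `persist` calls are the crux: each one blocks until the backend confirms the snapshot, which is exactly what the next section is about.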

After this introduction, we can dive into what's slow, how we fixed it, and some technical details of the implementation.

## Why is it slow?

To make sure the state is always as up to date as possible, even if there are network hiccups, power outages, and the like, a step won't start until the snapshot that includes its pending operation is confirmed to be stored in the backend. Similarly, an operation won't be considered finished until the snapshot with the updated resource list is confirmed to be stored in the backend.

To send the current state to the backend, we simply serialize it as a JSON file and upload it. However, as mentioned above, steps can be executed in parallel. If we uploaded the snapshot at the beginning and end of every step without any coordination, there would be a risk of overwriting a newer snapshot with an older one, leading to incorrect data.

Our workaround for that is to serialize the snapshot uploads, sending one snapshot at a time. This gives us the data integrity properties we want; however, it can slow down step execution, especially on internet connections with lower bandwidth and/or higher latency.
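
Conceptually, the workaround is a single lock around the upload path. Reusing the stand-ins from the sketch above (again illustrative, not the engine's actual code):

```go
import "sync"

// One mutex guards every snapshot upload: an older snapshot can never
// overwrite a newer one, but each step must wait for the previous
// upload to be confirmed before its own can start.
var uploadMu sync.Mutex

func saveSnapshot(snap *Snapshot) {
	uploadMu.Lock()
	defer uploadMu.Unlock()
	persist(snap) // serialize to JSON, upload, block until confirmed
}
```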

This impacts performance especially for large stacks, as we upload the whole snapshot every time, which can take a while once the snapshot gets big. For the Pulumi Cloud backend we improved on this a little [at the end of 2022](https://github.com/pulumi/pulumi/pull/10788). We implemented a diff-based protocol, which is especially helpful for large snapshots, as we only need to send the diff between the old and the new snapshot, and Pulumi Cloud can then reconstruct the full snapshot from that. This reduces the amount of data that needs to be transferred, thus improving performance.
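
The wire format can be imagined roughly like this; the shape below is purely hypothetical and not Pulumi Cloud's actual API:

```go
// CheckpointDelta is a hypothetical diff-based checkpoint request: instead
// of the full JSON snapshot, only a patch against the last version that the
// service confirmed is sent, and the service applies it to reconstruct the
// full snapshot.
type CheckpointDelta struct {
	BaseVersion int    // snapshot version the patch applies to
	Patch       []byte // diff of the serialized snapshot
}
```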

However, the snapshotting is still a major bottleneck for large Pulumi deployments. Having to serially upload the snapshot twice for each step still has a big impact on performance, especially if many resources are modified in parallel.

## Fast, but lacking data integrity?

As long as Pulumi can complete its operation, there's no need for the intermediate checkpoints. It is possible to set the `PULUMI_SKIP_CHECKPOINTS` environment variable to a truthy value and skip uploading the intermediate checkpoints to the backend altogether. This, of course, avoids the serialization point of sending snapshots to the backend, and thus makes the operation much more performant.

However, it also has the big disadvantage of compromising some of the data integrity guarantees Pulumi gives you. If anything goes wrong during the update, Pulumi has no record of what happened up to that point, potentially leaving orphaned resources in the provider, or keeping resources in the state that no longer exist.

Neither of these solutions is very satisfying, as the tradeoff is either performance or data integrity. We would like to have our cake and eat it too, and that's exactly what we're doing.

Making that happen is possible because of three facts:

- Every step the engine executes affects only one resource.
- We have a service that can reconstruct a snapshot from what is given to it.

(The third point already hints at it: this feature is made possible by Pulumi Cloud, and is not available on the DIY backend.)

What if, instead of sending the whole snapshot or a diff of it, we could send the individual changes to the base snapshot to the service, which could then apply them and reconstruct a full snapshot? This is exactly what we are doing, in the form of what we call journal entries. Each journal entry has the following form:

```go
type JournalEntry struct {
	// ... (the struct's fields are elided in this excerpt)
}
```

These journal entries encode all the information needed to reconstruct the snapshot. Each journal entry can be sent in parallel from the engine, and the snapshot will still be fully valid. All journal entries have a sequence ID attached to them, and they need to be replayed in that order on the service side to make sure we get a valid snapshot. It is, however, okay to replay even while some journal entries have not yet been received by the service and their sequence IDs are thus missing. This is safe because the engine only sends entries in parallel whose parents/dependencies have been fully created and confirmed by the service.
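
A minimal sketch of what replay on the service side could look like, reusing the stand-in types from the earlier sketches and assuming only that each received entry is keyed by its sequence ID (illustrative, not Pulumi Cloud's implementation):

```go
import "sort"

// reconstruct replays the journal entries received so far in ascending
// sequence order. Gaps are tolerated: because the engine only sends an
// entry once everything it depends on has been confirmed, skipping a
// not-yet-received ID can never reorder dependent changes.
func reconstruct(received map[int]JournalEntry) *Snapshot {
	ids := make([]int, 0, len(received))
	for id := range received {
		ids = append(ids, id)
	}
	sort.Ints(ids)

	snap := &Snapshot{Resources: map[string]Resource{}}
	for _, id := range ids {
		apply(snap, received[id]) // fold one entry into the snapshot
	}
	return snap
}

// apply folds a single journal entry into the snapshot (elided here).
func apply(snap *Snapshot, e JournalEntry) {}
```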

This way we make sure that the resource list is always in the partial order that is required for the engine to function correctly, and for the snapshot to be considered valid.
0 commit comments