@dpmerrell

From the docs (https://cloud.google.com/storage/docs/gsutil/addlhelp/GlobalCommandLineOptions):

Using the -m option can consume a significant amount of network bandwidth and cause problems or make your
performance worse if you use a slower network. For example, if you start a large rsync operation over a network
link that's also used by a number of other important jobs, there could be degraded performance in those jobs.
Similarly, the -m option can make your performance worse, especially for cases that perform all operations
locally, because it can "thrash" your local disk.

To prevent such issues, reduce the values for parallel_thread_count and parallel_process_count, or stop using the
-m option entirely. One tool that you can use to limit how much I/O capacity gsutil consumes and prevent it from
monopolizing your local disk is ionice (built in to many Linux systems).

I would guess that network speeds between GCP storage and compute are typically fast, so the bandwidth caveat is probably less of a concern here. A more complete and robust solution might also configure parallel_thread_count or parallel_process_count, or invoke ionice.
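
For illustration, here is a rough sketch of what that more configurable invocation could look like. It is not part of this PR. The helper name `build_gsutil_command`, the bucket and directory names, and the specific counts are all hypothetical. It assumes gsutil's top-level `-o "Section:name=value"` override for boto config values and the `ionice -c 2 -n 7` form (best-effort class, lowest priority).

```python
import shlex


def build_gsutil_command(gsutil_args, thread_count=None, process_count=None,
                         use_ionice=False):
    """Build an argv list for a throttled gsutil call (hypothetical helper)."""
    cmd = []
    if use_ionice:
        # Best-effort I/O class at the lowest priority, so gsutil yields disk
        # access to other processes. Assumes ionice is installed (Linux).
        cmd += ["ionice", "-c", "2", "-n", "7"]
    cmd.append("gsutil")
    # -o overrides a boto config value for this invocation only.
    if thread_count is not None:
        cmd += ["-o", f"GSUtil:parallel_thread_count={thread_count}"]
    if process_count is not None:
        cmd += ["-o", f"GSUtil:parallel_process_count={process_count}"]
    cmd += list(gsutil_args)
    return cmd


if __name__ == "__main__":
    # Placeholder paths; the counts are arbitrary examples, not recommendations.
    cmd = build_gsutil_command(
        ["-m", "rsync", "-r", "local_dir", "gs://example-bucket/dir"],
        thread_count=4,
        process_count=2,
        use_ionice=True,
    )
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Leaving both counts as `None` falls back to whatever the boto config already specifies.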

Thought I would open this PR anyway, just to get this started.
