mcurl: support passing curl options with various new features and fixes #10
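`mcurl` really is a great tool! 👍
In one of my project workflows, I need to download from a website that uses Digest Auth, on a VM running an old Linux. However, most multi-threaded download tools, such as `aria2`, do not support it, and `wget2` cannot be installed on this VM, which leaves only `wget` and `curl` with support: https://curl.se/docs/comparison-table.html
So I went looking for a way to make wget / curl download with multiple threads, and eventually found `mcurl`!
However, the current version of `mcurl` has no option for passing in custom curl options such as `--digest -u "${USER}:${PASSWD}"`. At first I edited them directly into the script, but that usage is not general. So I tried making further changes to the script, and fixed some bugs while testing.
The changes kept growing, so I might as well open a PR 😃
There are quite a few changes; I have tried to split the commits as finely and clearly as I can.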
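New Features
- All arguments after the `url` are treated as curl options
  - `./mcurl.sh -s4 https://some.url -L --digest -u "${USER}:${PASSWD}"`
  - `-L --digest -u "${USER}:${PASSWD}"` are all passed to the underlying curl calls
- Clean up when the script is interrupted with `ctrl+c`
  - `kill -- -$$` also kills the background curl processes
  - Prompts via `rm -i`; the operation is cancelled unless the reply is `y`
  - A `-f|--force` option makes the script use `rm -f` underneath, without prompting
- `downloaded file size * 100 / total size` is used to show a percentage
- `gitbash` support
  - No `pgrep` there, so on an AI's suggestion I switched to `jobs`, which should be portable
  - Tested with `pkill "^curl"`
Optimizations
- A plain `mv` is enough for the first slice, which saves one `cat`
Fixes
- Computing size with `du 1024 blocksize` is inaccurate
  - Switched to `wc -c` 🤔
  - Checked with `strace`: `wc -c {files}` uses `stat()` underneath and does not actually read the file
  - In the source it is `fstat()` followed by `lseek()`: https://github.com/coreutils/coreutils/blob/3a5c9c5537227eafc38c5657024584cdad63112a/src/wc.c#L340C1-L369C34
- The `running jobs == 0` check could fail: the count really is `0` at that moment, but some jobs have not been spawned yet
  - Concerns the `running jobs == 0` check
Tested Environments
- GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
- GNU bash, version 5.2.37(1)-release (x86_64-pc-msys)
- GNU bash, version 5.2.21(1)-release (x86_64-pc-cygwin)
- GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
- GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)
This site also has test files of various sizes, in case more testing is needed: https://www.thinkbroadband.com/download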
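The size and percentage items above can be sketched roughly as follows. This is an illustration only, not code from `mcurl.sh`: the function names `total_bytes` and `percent`, the `part.*` file names, and the total of 40 bytes are all made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch only: names below are illustrative, not taken from mcurl.sh.

# Total size in bytes of the given files, via wc -c.
total_bytes() {
  # With several files wc prints a "total" line last; with one file it
  # prints a single line. Either way the first field of the last line
  # is the byte count we want.
  wc -c "$@" | awk 'END { print $1 }'
}

# Integer percentage: downloaded * 100 / total.
percent() {
  echo $(( $1 * 100 / $2 ))
}

tmp=$(mktemp -d)
printf '%s' "aaaa"   > "$tmp/part.0"   # 4 bytes
printf '%s' "bbbbbb" > "$tmp/part.1"   # 6 bytes

done_bytes=$(total_bytes "$tmp"/part.*)
# Pretend the complete file is 40 bytes (hypothetical total).
echo "downloaded: $done_bytes bytes, $(percent "$done_bytes" 40)%"

rm -rf "$tmp"
```

Multiplying before dividing keeps the result meaningful in bash's integer arithmetic; dividing first would truncate to `0` for any partial download.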
English Version (translated by deepseek-v3)
`mcurl` is truly a great tool! 👍
During one of my project workflows, I needed to download from a website using Digest Auth on an old Linux VM. However, most multi-threaded download tools like `aria2` don't support this, and `wget2` couldn't be installed on this VM, leaving only `wget` and `curl` as options: https://curl.se/docs/comparison-table.html
So I started looking for ways to make wget/curl work with multi-threaded downloads, and eventually found `mcurl`!
However, the current version of `mcurl` doesn't support passing custom curl options like `--digest -u "${USER}:${PASSWD}"`, etc. I initially modified the script directly for my use case, but this approach wasn't generalizable. Then I attempted to make more modifications to the script, fixing some bugs along the way during testing.
The changes kept growing, so I thought it would be better to open a PR 😃
There are quite a few modifications, and I've tried to keep the commits as detailed and clear as possible.
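One way the pass-through described above might work is to keep everything after the URL in a bash array and hand it to curl untouched. The sketch below is hypothetical: `split_args`, `url`, and `curl_opts` are illustrative names, and detecting the URL by its scheme prefix is an assumption, not necessarily how `mcurl.sh` parses its arguments.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: everything after the URL is kept as an array and
# could later be handed to curl unchanged. Names are illustrative.

split_args() {
  url=""
  curl_opts=()
  while [ "$#" -gt 0 ]; do
    case "$1" in
      http://*|https://*|ftp://*)
        url="$1"
        shift
        curl_opts=("$@")   # the rest belongs to curl
        return 0
        ;;
      *)
        shift              # the script's own options (ignored in this sketch)
        ;;
    esac
  done
  return 1                 # no URL found
}

split_args -s4 https://some.url -L --digest -u "user:secret"
printf 'url=%s\n' "$url"
printf 'curl option: %s\n' "${curl_opts[@]}"
```

Using an array rather than a flat string keeps values such as `"${USER}:${PASSWD}"` intact as single words when they are later expanded as `"${curl_opts[@]}"` on each curl command line.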
New Features
- Treat all arguments after the `url` as curl options
  - `./mcurl.sh -s4 https://some.url -L --digest -u "${USER}:${PASSWD}"`
  - `-L --digest -u "${USER}:${PASSWD}"` will be passed to the underlying curl calls
- Clean up on `ctrl+c` interruption
  - `kill -- -$$` to also kill background curl processes
  - `rm -i` for confirmation, canceling the operation if the response isn't `y`
  - `-f|--force` option to use `rm -f` without prompting
- `downloaded file size * 100 / total size` to show percentage
- `gitbash` support
  - No `pgrep` built-in, replaced with `jobs` (more portable)
  - Tested with `pkill "^curl"`
Optimizations
- `mv` for the first slice to save one `cat` operation
Fixes
- `du 1024 blocksize` gave inaccurate size calculations
  - `wc -c` seems more portable
  - Confirmed with `strace` that `wc -c {files}` uses `stat()` without actual file reading
  - In the source it is `fstat()` then `lseek()`: https://github.com/coreutils/coreutils/blob/3a5c9c5537227eafc38c5657024584cdad63112a/src/wc.c#L340C1-L369C34
- The `running jobs == 0` check could fail (technically correct, but jobs not spawned yet)
  - Concerns `running jobs == 0` in the main loop
Tested Environments
- GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
- GNU bash, version 5.2.37(1)-release (x86_64-pc-msys)
- GNU bash, version 5.2.21(1)-release (x86_64-pc-cygwin)
- GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
- GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)
This site has test files of various sizes if more testing is needed: https://www.thinkbroadband.com/download
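To illustrate the `jobs`-based counting and the interrupt cleanup listed above, here is a small sketch under stated assumptions. `running_jobs` and `cleanup` are hypothetical names, and `sleep` stands in for the curl workers. Note that the PR mentions `kill -- -$$` for the interrupt path; this sketch deliberately swaps in a simpler technique, killing only the PIDs reported by `jobs -p`, so it can run safely outside an interactive shell.

```shell
#!/usr/bin/env bash
# Sketch only: counts this shell's own background jobs with the `jobs`
# builtin instead of pgrep. `sleep` stands in for the curl workers.

running_jobs() {
  # jobs -r lists running jobs only; -p prints one PID per line.
  jobs -r -p | wc -l | tr -d ' '
}

cleanup() {
  # Kill whatever background jobs are still alive, then reap them.
  # (Simplified stand-in for the `kill -- -$$` approach named in the PR.)
  local pids
  pids=$(jobs -p)
  [ -n "$pids" ] && kill $pids 2>/dev/null
  wait 2>/dev/null
}

# On ctrl+c: clean up, then exit with the conventional SIGINT status.
trap 'cleanup; exit 130' INT

sleep 30 &
sleep 30 &
started=$(running_jobs)
echo "running jobs: $started"

cleanup
echo "running jobs after cleanup: $(running_jobs)"
```

Because `jobs` only sees children of the current shell, the count cannot be thrown off by unrelated curl processes on the same machine, which a name-based `pgrep` match would include.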