Linkcheck runtime improvement #567

@s-makin

Description

While experimenting with AI, I found a genuine improvement to the linkcheck runtime that needs only a simple configuration change. In my docs (which are large, with well over a thousand links to check), linkcheck takes about 10 minutes with the default settings currently in the SP:

# give linkcheck multiple tries on failure
# linkcheck_timeout = 30
linkcheck_retries = 3

If the settings are changed instead to:

# Give linkcheck multiple tries on failure
linkcheck_timeout = 15
linkcheck_retries = 2

# Number of parallel workers for linkcheck (default is 5)
# Higher values work well for network I/O-bound tasks
linkcheck_workers = 20

With these settings, the runtime in my docs drops to about 1.5 minutes (roughly an 85% reduction).
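
For anyone who wants to try a higher worker count without hard-coding it, here is a minimal sketch of reading the value from the environment in `conf.py`. The variable name `LINKCHECK_WORKERS`, the helper `workers_from_env`, and the upper bound of 32 are my own assumptions for illustration, not something the SP defines; only `linkcheck_workers` itself is a real Sphinx setting.

```python
import os


def workers_from_env(default=5, upper=32):
    """Return the linkcheck worker count from the environment.

    LINKCHECK_WORKERS is a hypothetical variable name chosen for this sketch.
    Missing or non-numeric values fall back to `default` (Sphinx's own default
    is 5); numeric values are clamped to the range 1..upper.
    """
    raw = os.environ.get("LINKCHECK_WORKERS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return max(1, min(value, upper))


# In conf.py: the Sphinx default applies unless the variable is set.
linkcheck_workers = workers_from_env()
```

A project could then opt in for a single run by setting `LINKCHECK_WORKERS=20` before running linkcheck, leaving the default untouched for everyone else.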

I'm leaving this here as an issue rather than a PR because I'm not sure how the workers setting would affect other projects. The parallelism works well in the Server docs, but I could see a case for not including it in the SP: if a project contains a high volume of links to a single site, the extra parallel requests could trigger rate limiting and cause linkcheck failures. This part needs more general testing. Regardless, I recommend reducing both the timeout and the retries for the general performance improvement.
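
To help with that testing, here is a rough sketch of how a project could check whether its links are concentrated on a few hosts before raising the worker count. The function names and the threshold of 100 are assumptions for illustration only, and the list of URLs would need to come from the project itself (for example, from the results of an earlier linkcheck run).

```python
from collections import Counter
from urllib.parse import urlsplit


def links_per_host(urls):
    """Count links per hostname, skipping entries without one (mailto:, anchors)."""
    hosts = Counter()
    for url in urls:
        host = urlsplit(url).hostname
        if host:
            hosts[host] += 1
    return hosts


def concentrated_hosts(urls, threshold=100):
    """Return (host, count) pairs with at least `threshold` links, busiest first.

    The threshold is an arbitrary placeholder; a sensible value depends on
    how strictly each site rate-limits.
    """
    counts = links_per_host(urls)
    return [(host, n) for host, n in counts.most_common() if n >= threshold]
```

If one host dominates the list, a lower `linkcheck_workers` value is probably the safer choice for that project. Sphinx also has a `linkcheck_rate_limit_timeout` setting that controls how long linkcheck keeps backing off after a rate-limited response, which may be worth including in the testing.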
