Check robots.txt for sites that don't want to be indexed/crawled #19

@andrewshell

Description

Check robots.txt, and if the site disallows the fedwikifeeds user agent, don't check its sitemap or include it on the site.

Re-check robots.txt once a week and update the status if anything has changed.

Also consider the Crawl-Delay directive. https://websiteseochecker.com/blog/robots-txt-crawl-delay-why-we-use-crawl-delay-getting-started/
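A minimal sketch of the checks above using Python's stdlib `urllib.robotparser`, which handles both `Disallow` rules and `Crawl-delay`. The robots.txt body, the `fedwikifeeds` user-agent string, and the example sitemap URL are assumptions for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site might serve; in production this would be
# fetched from https://<site>/robots.txt (and re-fetched weekly).
robots_txt = """\
User-agent: fedwikifeeds
Disallow: /
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# If the crawler's user agent is disallowed, skip the sitemap entirely.
allowed = parser.can_fetch("fedwikifeeds", "https://example.com/sitemap.xml")

# Honor Crawl-delay (seconds) when it is present; returns None otherwise.
delay = parser.crawl_delay("fedwikifeeds")

print(allowed, delay)
```

With the robots.txt above, `can_fetch` returns `False`, so the site would be excluded until the weekly re-check finds the rule lifted.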
