Investigate Responsible Web Scraping enhancements

Consider adding functionality related to responsible web scraping/crawling.

- [ ] Respect `Crawl-delay` and other non-standard directives in robots.txt
- [ ] Honor crawler directives from response headers (`X-Robots-Tag`) and meta tags (`<meta name="robots">`); see [Google's guide to blocking indexing](https://developers.google.com/search/docs/advanced/crawling/block-indexing)
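
To make the two items above concrete, here is a rough sketch using only the Python standard library. It does not assume anything about this project's API: the function names `crawl_delay_for` and `robots_directives`, the `examplebot` user agent, and the sample robots.txt are all hypothetical. It also simplifies `X-Robots-Tag` handling by ignoring the optional per-bot prefix (for example `googlebot: noindex`).

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def crawl_delay_for(robots_txt: str, user_agent: str, default: float = 0.0) -> float:
    """Return the Crawl-delay (seconds) that applies to user_agent, or default."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # None when no delay is declared
    return float(delay) if delay is not None else default


def _split_directives(value: str) -> set:
    """Split a comma-separated directive list into lowercase tokens."""
    return {token.strip().lower() for token in value.split(",") if token.strip()}


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            self.directives |= _split_directives(attributes.get("content") or "")


def robots_directives(headers: dict, html: str) -> set:
    """Merge directives from the X-Robots-Tag header and robots meta tags."""
    directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives |= _split_directives(value)
    meta_parser = _RobotsMetaParser()
    meta_parser.feed(html)
    return directives | meta_parser.directives
```

A crawler could sleep for `crawl_delay_for(...)` seconds between requests to the same host, and skip storing or following links on pages where `robots_directives(...)` contains `noindex` or `nofollow`.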

Reference: https://www.zyte.com/blog/how-to-crawl-the-web-politely-with-scrapy/