This module allows SpiderFoot to search local databases. This can be useful when you want to include databases of leaks and breaches in your scans.
You can find the original SpiderFoot project here: https://github.com/smicallef/spiderfoot. They are amazing <3!
It is not included in SpiderFoot's standard installation, because it is not how SpiderFoot was designed to work. But since I developed the module and find it useful, I wanted to make it available to others.
To use it, just copy sfp_local_data.py into the modules directory of your SpiderFoot installation.
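For the avoidance of doubt, the installation really is just one file copy. A minimal sketch in Python, where the destination path is an assumption and must be adjusted to your own SpiderFoot checkout:

```python
# Hypothetical install step: copy the module into SpiderFoot's modules directory.
# "/opt/spiderfoot/modules" is an assumed location, not a fixed path.
import shutil
from pathlib import Path

module_file = Path("sfp_local_data.py")
spiderfoot_modules = Path("/opt/spiderfoot/modules")  # adjust to your install

if module_file.exists() and spiderfoot_modules.is_dir():
    shutil.copy(module_file, spiderfoot_modules / module_file.name)
```

A plain `cp sfp_local_data.py /path/to/spiderfoot/modules/` achieves the same thing.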
If you want to read more about it, you can check out my blog: https://security-by-accident.com/beyond-the-web-spiderfoot/
Datasets have to be stored on the local machine. To make them available to SpiderFoot, add their full paths to the module's settings; multiple files can be specified as a comma-separated list. Below the path configuration you can select which kinds of data the module should listen for: whenever that kind of data is found during your scan, the module will automatically start searching for it in your local files. Do not forget to click “Save Changes”.
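The comma-separated list is forgiving about whitespace. A small sketch of how such an option could be parsed (the option name `local_files` is an assumption for illustration, not necessarily the module's real setting name):

```python
# Hypothetical module options dictionary; "local_files" is an assumed key.
opts = {"local_files": "/data/breach1.txt, /data/breach2.txt"}

# Split on commas and strip surrounding whitespace, skipping empty entries.
paths = [p.strip() for p in opts["local_files"].split(",") if p.strip()]
print(paths)  # → ['/data/breach1.txt', '/data/breach2.txt']
```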
Afterwards, you can enable the module in the scan settings under the “By Module” tab.
If the scan finds anything in your local datasets, it reports the filename, line number, and content back as “Raw Data from RIRs/APIs”.
This implementation was able to search a 30 GB file in about 3 minutes, so the search process will slow down noticeably when a lot of data is present. Possible speed improvements could come from integrating ripgrep, but that would require additional dependencies. If you have any improvements, I am looking forward to your pull requests on GitHub.
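The core idea of the search can be sketched as a streaming, line-by-line scan: memory use stays flat even for a 30 GB file, because only one line is held at a time. The function name and return format below are illustrative, not the module's actual API:

```python
# Minimal sketch of a grep-like scan over a local dataset file.
# Yields (filename, line number, matching line), the same three pieces
# of information the module reports back to SpiderFoot.
def search_file(path, needle):
    """Yield (path, lineno, line) for every line containing needle."""
    with open(path, "r", errors="ignore") as fh:
        for lineno, line in enumerate(fh, start=1):
            if needle in line:
                yield (path, lineno, line.rstrip("\n"))
```

Tools like ripgrep beat this approach mainly through parallelism and optimized substring matching, which is why shelling out to it is a plausible optimization at the cost of an extra dependency.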

