This is a demo of GitHub Code Search that uses Anaconda for the virtual environment and PyGithub to abstract the API calls.
Activate the Anaconda environment using:
conda activate ./env
Once the environment is activated, run the demo with:
python3 search.py
The demo performs two queries that look for HuggingFace signatures using the GitHub code search API. MODULE_SEARCH_QUERY searches for projects that include the huggingface_hub API as a module, while API_SEARCH_QUERY looks for calls to the HuggingFace website. These queries were constructed with the help of the REST API search endpoint and the GitHub Code Search syntax.
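As a rough illustration of how two such queries could be issued through PyGithub, here is a minimal sketch. It is not the contents of search.py: the query strings, the `run_query` helper, and the `GH_SEARCH_TOKEN` environment variable are all assumptions made for the example, and it assumes a recent PyGithub release that provides `github.Auth`.

```python
import os

# Hypothetical query strings: the real values used by the demo are not shown here.
MODULE_SEARCH_QUERY = "huggingface_hub language:python"
API_SEARCH_QUERY = "huggingface.co language:python"


def run_query(client, query, sample_size=5):
    """Run one code search and return (reported_total, sample_paths).

    `client` is expected to behave like a PyGithub `Github` object, i.e. it
    exposes `search_code(query)` returning an iterable with `totalCount`.
    """
    results = client.search_code(query)
    sample = []
    for content_file in results:
        if len(sample) >= sample_size:
            break
        sample.append(f"{content_file.repository.full_name}/{content_file.path}")
    return results.totalCount, sample


def main():
    # GH_SEARCH_TOKEN is a hypothetical variable name; code search needs authentication.
    token = os.environ.get("GH_SEARCH_TOKEN")
    if not token:
        print("Set GH_SEARCH_TOKEN to a personal access token to run the search.")
        return

    # Imported here so the helper above can be exercised without PyGithub installed.
    from github import Auth, Github

    client = Github(auth=Auth.Token(token))
    for name, query in [
        ("MODULE_SEARCH_QUERY", MODULE_SEARCH_QUERY),
        ("API_SEARCH_QUERY", API_SEARCH_QUERY),
    ]:
        total, sample = run_query(client, query)
        print(f"{name}: totalCount={total}")
        for path in sample:
            print(f"  {path}")


if __name__ == "__main__":
    main()
```

Keeping the search logic in a function that takes the client as an argument makes it easy to try different query strings, or to substitute a stub client when working offline.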
It is worth noting that there is a discrepancy in the results returned by each query.
Each query returns 50k+ files when performed through the GitHub website, but only around 800 when run through the demo.
We are unsure why, but a good place to start might be what results.totalCount actually indicates.
It may be that this returns the number of pages, and that the actual number of files is consistent regardless of how the query is made.
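One way to start investigating is to compare the reported totalCount with the number of items that can actually be iterated. The sketch below is an assumption-laden starting point, not a diagnosis: `compare_counts` and `FakeResults` are hypothetical names, and the stand-in data is made up purely so the example runs offline. A possibly relevant factor is that GitHub's REST search documentation states the API provides up to 1,000 results per search, so a query against the API may never yield as many files as the website reports.

```python
from itertools import islice


def compare_counts(results, max_items=2000):
    """Compare the reported totalCount with the number of items actually yielded.

    `results` is any iterable exposing a `totalCount` attribute, such as the
    paginated list PyGithub returns from `search_code`. Iteration stops after
    `max_items` to keep the number of API requests bounded.
    """
    retrieved = sum(1 for _ in islice(results, max_items))
    return {
        "reported": results.totalCount,
        "retrieved": retrieved,
        "matches": results.totalCount == retrieved,
    }


class FakeResults:
    """Offline stand-in for a paginated search result (illustrative data only)."""

    def __init__(self, items, total_count):
        self._items = list(items)
        self.totalCount = total_count

    def __iter__(self):
        return iter(self._items)


if __name__ == "__main__":
    # Made-up numbers: 3 items retrievable, 10 reported.
    print(compare_counts(FakeResults(range(3), total_count=10)))
```

Against the live API, iterating through every page issues one request per page, so the search rate limit may need to be taken into account when passing real results to `compare_counts`.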
Here is the module query when made through the website.
Here is the API query when made through the website.