Open-source GitHub scrapers in two forms:
- copy-pasteable Olostep parser scripts for the Parsers dashboard
- a local Chrome extension that parses the active GitHub tab in the popup
This first version supports:
- repository root pages: https://github.com/{owner}/{repo}
- personal user profile root pages: https://github.com/{username}
Unsupported in v1:
- organization pages
- repository subpages like issues, pulls, actions
- profile tab routes like ?tab=repositories
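The supported and unsupported routes above can be approximated with a URL shape check. The sketch below is illustrative only: `classifyGitHubUrl` is a hypothetical helper, not a function from this repository. It also cannot tell an organization page from a personal profile, or a reserved path such as `/orgs/{name}` from a repository, because those share the same URL shape; that distinction needs the page content.

```javascript
// Hypothetical helper: classifies a URL by shape only.
// Returns "repository", "user-profile", or "unsupported".
function classifyGitHubUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return "unsupported"; // not a parseable absolute URL
  }
  if (url.hostname !== "github.com") return "unsupported";

  // Profile tab routes such as ?tab=repositories are out of scope in v1.
  if (url.searchParams.has("tab")) return "unsupported";

  // Drop empty segments so a trailing slash does not change the result.
  const segments = url.pathname.split("/").filter(Boolean);
  if (segments.length === 1) return "user-profile"; // /{username}
  if (segments.length === 2) return "repository"; // /{owner}/{repo}
  return "unsupported"; // root page or subpages like /issues, /pulls, /actions
}
```

A check like this could run before parsing so that unsupported pages fail fast with a clear message.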
- parsers/github-repository.parser.js: Olostep parser for repository root pages
- parsers/github-user-profile.parser.js: Olostep parser for user profile root pages
- extension/: Manifest V3 Chrome extension
- docs/github-parser-notes.md: parser and blog-post notes
Each parser is written to match the observed Olostep dashboard contract:
    async function parse(htmlString, pageUrl) {
      // pageUrl is optional
    }

The implementation relies on htmlString and DOMParser, so the same file stays portable between the dashboard and API execution.
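As a rough illustration of that contract, a parser body might look like the sketch below. It is not the code shipped in `parsers/`. It assumes the page exposes Open Graph `<meta>` tags (`og:url`, `og:description`) and fills only a few of the output fields. It needs a runtime that provides `DOMParser`, such as a browser or the extension popup.

```javascript
// Minimal sketch of the parse(htmlString, pageUrl) contract.
// Assumption: the page carries og:url and og:description meta tags.
async function parse(htmlString, pageUrl) {
  const doc = new DOMParser().parseFromString(htmlString, "text/html");

  // Read the content attribute of a meta tag, or null when it is absent.
  const meta = (selector) =>
    doc.querySelector(selector)?.getAttribute("content") ?? null;

  return {
    success: true,
    type: "repository",
    timestamp: new Date().toISOString(),
    // pageUrl is optional, so fall back to the URL declared in the page.
    url: pageUrl ?? meta('meta[property="og:url"]'),
    description: meta('meta[property="og:description"]'),
  };
}
```

Reading from metadata first keeps the sketch less sensitive to layout changes than class-based selectors would be.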
- Open the Parsers dashboard.
- Create a new parser.
- Paste either parsers/github-repository.parser.js or parsers/github-user-profile.parser.js.
- Add a parser name and a GitHub run target URL.
- Save and continue.
Olostep invokes saved parsers through the scrape API using the parser id.
    const endpoint = "https://api.olostep.com/v1/scrapes";
    const payload = {
      formats: ["json", "html"],
      parser: { id: "@your-parser-id" },
      url_to_scrape: "https://github.com/octocat"
    };

Observed response shape:
- parsed JSON appears in result.json_content
- raw HTML appears in result.html_content
- hosted artifacts appear in result.json_hosted_url and result.html_hosted_url
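The payload and response shape above could be wired together roughly as follows. This is a sketch under stated assumptions: `buildScrapeRequest` and `readScrapeResult` are hypothetical names, the `POST` method and `Authorization: Bearer` header are assumptions about the API rather than something this README confirms, and `json_content` is treated as possibly arriving as a JSON string.

```javascript
const OLOSTEP_SCRAPES_ENDPOINT = "https://api.olostep.com/v1/scrapes";

// Hypothetical helper: builds the arguments for fetch(endpoint, options).
function buildScrapeRequest(parserId, urlToScrape, apiKey) {
  return {
    endpoint: OLOSTEP_SCRAPES_ENDPOINT,
    options: {
      method: "POST", // assumption
      headers: {
        Authorization: `Bearer ${apiKey}`, // assumption: bearer-token auth
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        formats: ["json", "html"],
        parser: { id: parserId },
        url_to_scrape: urlToScrape,
      }),
    },
  };
}

// Hypothetical helper: picks the observed fields out of a response body.
function readScrapeResult(responseBody) {
  const result = responseBody?.result ?? {};
  let parsed = result.json_content ?? null;
  if (typeof parsed === "string") {
    try {
      parsed = JSON.parse(parsed);
    } catch {
      // Not valid JSON: leave the raw string in place.
    }
  }
  return {
    parsed,
    html: result.html_content ?? null,
    jsonUrl: result.json_hosted_url ?? null,
    htmlUrl: result.html_hosted_url ?? null,
  };
}
```

With these in place, a call would look like `fetch(endpoint, options).then((r) => r.json()).then(readScrapeResult)`, reading the API key from an environment variable rather than hard-coding it.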
The extension runs locally and does not call the Olostep API. It mirrors the same parsing logic in the popup.
- Open chrome://extensions
- Enable Developer Mode
- Click Load unpacked
- Select the extension folder
- Open a supported GitHub repository root or personal profile root.
- Open the extension popup.
- Click Parse current page.
- Review or copy the generated JSON.
Repository parser fields:
- success
- type
- timestamp
- url
- owner
- name
- fullName
- description
- primaryLanguage
- stars
- forks
- watchers
- license
- topics
- readmeSummary
User profile parser fields:
- success
- type
- timestamp
- url
- username
- displayName
- bio
- followers
- following
- company
- location
- website
- joinDate
- socialLinks
- pinnedRepositories
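Because GitHub's markup changes over time, it can help to check parser output against the field lists above. The sketch below shows one way to do that; `missingFields` and the two constants are hypothetical names, not part of the repository, and the check only covers key presence, not value types.

```javascript
// Field names copied from the lists above.
const REPOSITORY_FIELDS = [
  "success", "type", "timestamp", "url", "owner", "name", "fullName",
  "description", "primaryLanguage", "stars", "forks", "watchers",
  "license", "topics", "readmeSummary",
];

const USER_PROFILE_FIELDS = [
  "success", "type", "timestamp", "url", "username", "displayName", "bio",
  "followers", "following", "company", "location", "website", "joinDate",
  "socialLinks", "pinnedRepositories",
];

// Hypothetical helper: returns the expected keys that are absent from output.
// A key set to null still counts as present.
function missingFields(output, expectedFields) {
  return expectedFields.filter(
    (field) => !Object.prototype.hasOwnProperty.call(output, field)
  );
}
```

For example, `missingFields(parsed, REPOSITORY_FIELDS)` returns an empty array when every repository field is present, which makes it easy to flag a selector that has silently stopped matching.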
- Rotate any Olostep API key that has been pasted into logs or chat.
- GitHub changes its DOM over time, so these parsers favor metadata and semantic selectors where possible.


