Personal job aggregator. Polls hh.ru / hh.kz, career.habr.com and public Telegram channels for QA Automation, Manual QA and ML/Data Analyst openings. Deduplicates, writes results to a Google Sheet and sends Telegram notifications with an inline "hide" button.
diepoor/
├── diepoor/ source package
│ ├── sources/ hh, habr_career, getmatch, telegram_channels
│ └── sinks/ telegram_bot, google_sheets
├── scripts/ seed_dedup, prepare_telethon_session, cleanup_sheets
├── cloudflare-worker/ inline-button handler (JS)
├── .github/workflows/ run.yml (poll), cleanup.yml (nightly)
├── config.yaml profiles, queries, filters
└── channels.txt Telegram channels list
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env # fill in tokens
./run.sh --dry-run # validate sources without writing
./run.sh # real run
- Fork / clone.
- Add repository secrets (Settings → Secrets and variables → Actions):
| Secret | Source |
|---|---|
| TELEGRAM_BOT_TOKEN | @BotFather → /newbot |
| TELEGRAM_CHAT_ID | Your numeric chat id (see Telegram setup below) |
| GOOGLE_SHEET_URL | Full URL of the target spreadsheet |
| GOOGLE_CREDENTIALS_JSON | Entire JSON of a Google service account |
| TELEGRAM_API_ID / TELEGRAM_API_HASH / TELEGRAM_PHONE | Optional, for Telegram channels source |
| TELETHON_SESSION_B64 | Optional, see scripts/prepare_telethon_session.py |
- Actions → diepoor → Run workflow. Polling runs every 15 minutes; cleanup runs nightly at 22:00 UTC.
- Create a project at console.cloud.google.com.
- Enable Google Sheets API and Google Drive API.
- IAM → Service Accounts → Create service account → Keys → Add key → JSON.
- Save the file as credentials/gcp-service-account.json.
- Share the spreadsheet with the client_email from the JSON as Editor.
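To find the address to share the spreadsheet with, you can read it straight from the downloaded key. This is a minimal sketch, not part of the repository; the helper name `service_account_email` is hypothetical, and it only assumes the key file is a JSON object with a `client_email` field.

```python
import json
from pathlib import Path


def service_account_email(key_path: str) -> str:
    """Return the client_email field from a service account JSON key file.

    Hypothetical helper for illustration; not a function from diepoor.
    """
    data = json.loads(Path(key_path).read_text(encoding="utf-8"))
    return data["client_email"]
```

Calling `service_account_email("credentials/gcp-service-account.json")` returns the address to paste into the spreadsheet's Share dialog.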
- @BotFather → /newbot → store the token.
- Open the bot chat and press Start (otherwise the bot cannot send you messages).
- Open https://api.telegram.org/bot<TOKEN>/getUpdates and find your chat.id in the response.
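If the getUpdates response is long, picking out the chat id by eye gets tedious. The sketch below parses a response body you have already saved or copied. The function name `chat_ids_from_updates` is hypothetical, and it looks only at plain `message` updates, which is what pressing Start in a private chat produces.

```python
import json


def chat_ids_from_updates(payload: str) -> list[int]:
    """Collect distinct chat ids from a getUpdates JSON response body.

    Hypothetical helper for illustration; not a function from diepoor.
    """
    response = json.loads(payload)
    if not response.get("ok"):
        raise ValueError("Telegram returned ok=false")
    seen: list[int] = []
    for update in response.get("result", []):
        # Only plain messages are inspected; other update types are skipped.
        chat_id = update.get("message", {}).get("chat", {}).get("id")
        if chat_id is not None and chat_id not in seen:
            seen.append(chat_id)
    return seen
```

An empty list usually means the bot has not received any message yet, so press Start in the bot chat and request getUpdates again.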
Each notification has a 🗑 button. Handling it requires a small webhook — the Cloudflare Worker
in cloudflare-worker/. Deploy:
cd cloudflare-worker
npx wrangler login
npx wrangler secret put TELEGRAM_BOT_TOKEN
npx wrangler secret put ALLOWED_CHAT_ID
npx wrangler deploy
Register the webhook with Telegram:
curl "https://api.telegram.org/bot<TOKEN>/setWebhook?url=<WORKER_URL>"
Without the webhook the button is visible but inert.
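The curl command passes the Worker URL unescaped, which works for a plain address but can break if the URL carries its own query string. As a hedged sketch, the registration URL can be built with the `url` parameter percent-encoded. The function name `set_webhook_url` is hypothetical, and the token and Worker URL are placeholders you supply.

```python
from urllib.parse import urlencode


def set_webhook_url(token: str, worker_url: str) -> str:
    """Build a Telegram setWebhook request URL with the target URL percent-encoded.

    Hypothetical helper for illustration; not a function from diepoor.
    """
    query = urlencode({"url": worker_url})
    return f"https://api.telegram.org/bot{token}/setWebhook?{query}"
```

Open the resulting URL in a browser or pass it to curl; this sketch only builds the string and sends nothing itself.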
Profiles and filters live in config.yaml:
profiles:
  - name: qa_auto_junior
    queries: ["QA automation junior", "автотестировщик junior"]
    experience: [noExperience, between1And3]
    keywords_must_any: [automation, selenium, playwright, pytest, sdet]
    keywords_exclude: [senior, lead, manager]

| Source | Notes |
|---|---|
| hh | HTML scraping of hh.ru/search/vacancy via data-qa selectors. Multiple hosts via area codes (113 RU, 40 KZ). |
| habr_career | HTML scraping of career.habr.com/vacancies. Extracts grade and work mode from the meta block. |
| getmatch | Disabled: listings are rendered only for authenticated users. |
| telegram_channels | Telethon reader, channel list in channels.txt. |
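For intuition about the profile filters in config.yaml, here is one plausible reading of `keywords_must_any` and `keywords_exclude`: a vacancy passes if it contains none of the excluded words and at least one of the required ones. This is an assumption about the semantics, not the project's actual code; the function name `matches_profile` and the case-insensitive substring matching are illustrative choices.

```python
def matches_profile(title: str, description: str, profile: dict) -> bool:
    """Return True if a vacancy passes a profile's keyword filters.

    Assumed semantics (not confirmed by the project): case-insensitive
    substring match, exclusions checked before the must-any list.
    """
    text = f"{title} {description}".lower()
    must_any = [k.lower() for k in profile.get("keywords_must_any", [])]
    exclude = [k.lower() for k in profile.get("keywords_exclude", [])]
    if any(k in text for k in exclude):
        return False
    # An empty must-any list means no positive requirement.
    return not must_any or any(k in text for k in must_any)
```

Substring matching is deliberately naive: `lead` would also match "leading", so a real filter may prefer word-boundary matching.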
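One worksheet, Вакансии (Vacancies), with columns:
Дата найдена (date found), Источник (source), Профиль (profile), Компания (company), Должность (position), Локация (location), Формат (work format), Зарплата (salary),
Опыт (experience), Ссылка (link), Дедлайн (deadline), Статус (status), Заметки (notes).
Two columns, Статус and Заметки, are yours to fill in (applied, interview, rejected, etc.).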
Nightly cleanup:
- rows older than 7 days move to the Архив (Archive) sheet
- rows in Архив older than 30 days with empty Статус are dropped
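The two cleanup rules above can be sketched as a single decision function. This is an illustration rather than the code in scripts/cleanup_sheets.py: the name `cleanup_action` is hypothetical, and it assumes row age is measured from the Дата найдена column and that "older than" means strictly more than the stated number of days.

```python
from datetime import date, timedelta

ARCHIVE_AFTER = timedelta(days=7)   # main sheet -> Архив
PRUNE_AFTER = timedelta(days=30)    # Архив -> dropped, if Статус is empty


def cleanup_action(found_on: date, status: str, in_archive: bool, today: date) -> str:
    """Return "keep", "archive" or "drop" for one row (assumed semantics)."""
    age = today - found_on
    if in_archive:
        # Rows with any status filled in are never dropped.
        if age > PRUNE_AFTER and not status.strip():
            return "drop"
        return "keep"
    return "archive" if age > ARCHIVE_AFTER else "keep"
```

Under these assumptions, a vacancy you marked as applied stays in Архив indefinitely, while untouched rows disappear after 30 days.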
python main.py [--dry-run] [--verbose]
python scripts/seed_dedup.py # seed dedup DB from existing Sheet
python scripts/cleanup_sheets.py # archive + prune
python scripts/prepare_telethon_session.py # print base64 Telethon session
If you prefer running on your own machine instead of GitHub Actions:
cp launchd/com.dbkarashev.diepoor.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.dbkarashev.diepoor.plist
This project is licensed under the MIT License.