Automated website visual change monitoring using GitHub Actions, Playwright, and Telegram notifications.
- Simulates real user visits through Playwright's headless Chromium browser, waiting for a full page load until `networkidle`
- Extracts key text content (warnings and update dates) from the HTML for notifications
- Captures screenshots of a specific page region between two text markers and compares MD5 hashes with the previous state
- Sends notifications to a Telegram work channel with text, screenshot, and a "Subscribe" button when changes are detected
- Sends detailed technical logs to a separate log channel after each run (successful or failed)
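As an illustration of the notification step, here is a minimal sketch of how a photo message with a "Subscribe" button could be assembled for the Telegram Bot API `sendPhoto` method. The helper name `build_send_photo_request` and its parameters are hypothetical and are not taken from `monitor.py`; only the request is built here, nothing is sent.

```python
import json


def build_send_photo_request(bot_token, chat_id, caption, subscribe_url):
    """Return (url, form_fields) for a Telegram sendPhoto call.

    Hypothetical helper: the real monitor.py may build its request differently.
    """
    url = f"https://api.telegram.org/bot{bot_token}/sendPhoto"
    # An inline keyboard is a list of button rows; here, one row with one URL button.
    keyboard = {"inline_keyboard": [[{"text": "Subscribe", "url": subscribe_url}]]}
    data = {
        "chat_id": chat_id,
        "caption": caption,
        # reply_markup is passed as a JSON-serialized string in form data.
        "reply_markup": json.dumps(keyboard),
    }
    return url, data
```

The returned pair could then be posted with any HTTP client, with the screenshot bytes attached as the `photo` file field of a multipart request.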
Current average GitHub Actions runtime for this workflow is about 46 seconds, which is acceptable for a 5-minute cron schedule and leaves enough room for retries or temporary slowdowns.
The target website does not provide stable text or images that can be reliably parsed and compared between runs. The area of interest consists of many dynamic blocks with colors and variables loaded from external sources, making DOM/text-based comparison fragile even when actual content hasn't changed.
The chosen approach:
- Renders the full page in a Playwright browser
- Identifies the boundaries of the visual region using static text anchors (e.g., "Дата оновлення інформації" and the last occurrence of "робіт")
- Captures a screenshot of this zone and calculates the MD5 hash of the image bytes for comparison with `last_hash.json`
This provides stable "visual regression" monitoring: the tool reacts only to actual visual changes, not internal technical markup modifications.
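The hash comparison can be sketched roughly as follows. This is an illustration under assumptions, not the code from `monitor.py`: the helper names and the `screenshot_hash` key are hypothetical, and treating a missing state file as "changed" is a choice made for this example.

```python
import hashlib
import json
from pathlib import Path


def md5_of(image_bytes):
    """Hex MD5 digest of the raw screenshot bytes."""
    return hashlib.md5(image_bytes).hexdigest()


def has_changed(image_bytes, state_path="last_hash.json"):
    """Compare the screenshot hash with the stored one; persist it when it differs.

    The "screenshot_hash" key is an assumed name for this sketch.
    """
    path = Path(state_path)
    state = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    new_hash = md5_of(image_bytes)
    changed = state.get("screenshot_hash") != new_hash
    if changed:
        # Keep any other stored fields and update only the screenshot hash.
        state["screenshot_hash"] = new_hash
        path.write_text(json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8")
    return changed
```

Because the hash is taken over the image bytes, any pixel-level difference in the captured region counts as a change, which is the intended behavior for this kind of monitoring.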
The script runs two independent Playwright sessions with different responsibilities:
- Content session (`get_schedule_content`):
  - Loads the page, retrieves the HTML, and extracts the required text (warnings, dates, metadata) using BeautifulSoup
  - Keeps the parsing logic clean, with no dependency on element coordinates or scroll operations
- Screenshot session (`take_screenshot_between_elements`):
  - Opens the page separately, finds the anchor elements via Playwright locators, and constructs a screenshot rectangle between them
  - Runs in an isolated browser lifecycle so that earlier actions cannot affect layout and positioning during screenshot capture
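A rough sketch of how the screenshot rectangle might be derived from the two anchors is shown below. The helper names, the full-width rectangle, and the default width of 1280 are assumptions made for illustration. The sketch also assumes the page has not been scrolled, so that the coordinates returned by `bounding_box()` line up with the `clip` rectangle.

```python
def clip_between(top_box, bottom_box, page_width):
    """Full-width rectangle from the top of one box to the bottom of another.

    Boxes use the {"x", "y", "width", "height"} shape returned by
    Playwright's Locator.bounding_box().
    """
    y_top = top_box["y"]
    y_bottom = bottom_box["y"] + bottom_box["height"]
    if y_bottom <= y_top:
        raise ValueError("bottom anchor must lie below the top anchor")
    return {"x": 0, "y": y_top, "width": page_width, "height": y_bottom - y_top}


def screenshot_region(page, top_text, bottom_text, page_width=1280):
    """Screenshot the zone between the first top anchor and the last bottom anchor."""
    top = page.get_by_text(top_text).first.bounding_box()
    bottom = page.get_by_text(bottom_text).last.bounding_box()
    if top is None or bottom is None:
        # bounding_box() returns None when the element is not visible.
        raise LookupError("anchor text is not visible on the page")
    clip = clip_between(top, bottom, page_width)
    return page.screenshot(clip=clip, full_page=True)
```

With the anchors from this project, a call could look like `screenshot_region(page, "Дата оновлення інформації", "робіт")`, where `page` is an already loaded Playwright page.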
This separation:
- Reduces code coupling: each function solves its own task (content vs. visual) and can evolve independently
- Makes behavior more deterministic in CI environments, where side effects from one scenario can cause flaky failures in visual tests
- Provides clear architecture instead of a "monolithic" scenario with hard-to-debug mixed logic
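The isolation idea can be sketched as follows: each function opens its own browser and closes it before returning. The function names `get_schedule_content` and `take_screenshot_between_elements` come from this project, but the bodies, the `fresh_page` helper, and the `open_page` parameter are assumptions made for illustration. The BeautifulSoup parsing and the region clipping are left out to keep the focus on the lifecycle.

```python
from contextlib import contextmanager


@contextmanager
def fresh_page(url):
    """Open `url` in a brand-new headless Chromium and always close it afterwards."""
    # Imported lazily so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="networkidle")
            yield page
        finally:
            browser.close()


def get_schedule_content(url, open_page=fresh_page):
    """Session 1: return the rendered HTML (text extraction would follow here)."""
    with open_page(url) as page:
        return page.content()


def take_screenshot_between_elements(url, open_page=fresh_page):
    """Session 2: return screenshot bytes (the real function clips to the anchor region)."""
    with open_page(url) as page:
        return page.screenshot(full_page=True)
```

Passing the page opener as a parameter is a design choice of this sketch: it lets each function be exercised with a fake page in tests while keeping one browser lifecycle per call.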
- `monitor.py` – main monitoring logic:
  - content retrieval (`get_schedule_content`)
  - screenshot capture and MD5 hash calculation (`take_screenshot_between_elements`)
  - state persistence in `last_hash.json`
  - Telegram channel and log channel messaging
  - global exception handler for convenient debugging
- `last_hash.json` – stores previous state: message hash, text, date, screenshot hash, and timestamp of last successful run
- `.github/workflows/monitor.yml` – GitHub Actions workflow:
  - `ubuntu-latest` environment with Python 3.11, dependency installation, and Playwright setup
  - scheduled runs (cron with a minimum 5-minute interval, per GitHub Actions limitations) and manual trigger via `workflow_dispatch`
  - commits the updated `last_hash.json` back to the repository, skipping re-trigger with `[skip ci]` in the commit message
Create the following secrets in your repository (Settings → Secrets and variables → Actions):
- `URL` – target page for monitoring
- `TELEGRAM_BOT_TOKEN` – Telegram bot token
- `TELEGRAM_CHANNEL_ID` – main work channel for notifications
- `TELEGRAM_LOG_CHANNEL_ID` – log channel for technical logs
- `SUBSCRIBE` – URL or deep link for the subscribe button
- `PAT_TOKEN` – personal GitHub token with push permissions to this repository
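On the script side, these values would typically arrive as environment variables. The sketch below shows one way to read and validate them at startup. The `load_config` helper is hypothetical, and leaving `PAT_TOKEN` out assumes it is used only by the workflow's push step rather than by `monitor.py`.

```python
import os

# Variables the script is assumed to need; PAT_TOKEN is assumed to be workflow-only.
REQUIRED = (
    "URL",
    "TELEGRAM_BOT_TOKEN",
    "TELEGRAM_CHANNEL_ID",
    "TELEGRAM_LOG_CHANNEL_ID",
    "SUBSCRIBE",
)


def load_config(env=None):
    """Return the required settings, failing early with a list of missing names."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}
```

Failing at startup with the full list of missing names tends to be easier to diagnose in CI logs than an error raised later, in the middle of a Telegram call.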
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
playwright install chromium
python monitor.py
For local runs, environment variables (`URL`, Telegram tokens, etc.) can be provided via a `.env` file or set in the terminal with `export` (or `set` on Windows).
This project is licensed under the MIT License – see the LICENSE file for details.