WGMM is a Bilibili video monitor that adjusts its polling interval with a
Weighted Gaussian Mixture Model. It keeps the public entrypoint simple
(`python monitor.py`) while the implementation is split into a small modular
monolith under `wgmm_monitor/`.
The goal is practical monitoring: detect new uploads and new multi-part video sections promptly while reducing unnecessary network requests.
- Python 3.14+
- The project `.venv` virtual environment
- `yt-dlp` available on `PATH`
- Bilibili cookies exported to `data/cookies.txt`
- GitHub Gist and Bark credentials in `data/.env`
Check the external `yt-dlp` executable:

```shell
which yt-dlp
yt-dlp --version
```

Copy the environment template and edit it:

```shell
cp data/.env.example data/.env
nano data/.env
```

Required keys:
```
GITHUB_TOKEN=your_github_token
BARK_DEVICE_KEY=your_bark_key
GIST_ID=your_gist_id
BILIBILI_UID=your_bilibili_uid
BARK_APP_TITLE=your_app_title
```

Create `data/cookies.txt` in Netscape cookie format. The application validates
that this file exists and is not empty before starting.
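The fail-fast startup checks described above can be sketched as follows. This is illustrative only; the names `REQUIRED_KEYS` and `validate_startup` are assumptions, and the real logic lives in `wgmm_monitor/config.py` and the app startup path.

```python
import os
import sys
from pathlib import Path

# Hypothetical sketch of the startup validation; not the project's actual code.
REQUIRED_KEYS = [
    "GITHUB_TOKEN", "BARK_DEVICE_KEY", "GIST_ID",
    "BILIBILI_UID", "BARK_APP_TITLE",
]

def validate_startup(data_dir: Path = Path("data")) -> None:
    # Exit immediately if any required environment variable is unset.
    missing = [k for k in REQUIRED_KEYS if not os.environ.get(k)]
    if missing:
        sys.exit(f"missing required environment variables: {', '.join(missing)}")
    # Exit if the cookies file is absent or empty.
    cookies = data_dir / "cookies.txt"
    if not cookies.is_file() or cookies.stat().st_size == 0:
        sys.exit("data/cookies.txt is missing or empty")
```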
```shell
source .venv/bin/activate
python monitor.py
python monitor.py --dev
python monitor.py --wgmm-core-only
```

Mode behavior:

- `python monitor.py`: production loop; waits until `next_check_time`, runs one monitor cycle, then repeats.
- `python monitor.py --dev`: runs the full detection chain once, does not write WGMM config, and does not send new-video notifications.
- `python monitor.py --wgmm-core-only`: runs one WGMM frequency decision and skips Bilibili video detection.
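The three-way mode selection can be sketched with `argparse`. This is a minimal illustration of the dispatch, not the actual contents of `wgmm_monitor/cli.py`; the function name `parse_mode` is an assumption.

```python
import argparse

# Hypothetical sketch of the mode dispatch in wgmm_monitor/cli.py.
def parse_mode(argv: list[str]) -> str:
    parser = argparse.ArgumentParser(prog="monitor.py")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--dev", action="store_true",
                       help="run one detection cycle without side effects")
    group.add_argument("--wgmm-core-only", action="store_true",
                       help="run one WGMM decision, skip video detection")
    args = parser.parse_args(argv)
    if args.dev:
        return "dev"
    if args.wgmm_core_only:
        return "wgmm-core-only"
    return "production"
```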
```shell
sudo systemctl status video-monitor
sudo systemctl start video-monitor
sudo systemctl stop video-monitor
sudo systemctl restart video-monitor
sudo journalctl -u video-monitor -f
```

```
wgmm/
├── monitor.py                 # Thin entrypoint: wgmm_monitor.cli.main()
├── wgmm_monitor/
│   ├── cli.py                 # Argument parsing and mode selection
│   ├── app.py                 # Runtime object assembly
│   ├── config.py              # data/.env loading
│   ├── models.py              # RuntimePaths, AppConfig, WgmmConfig, results
│   ├── runtime_logger.py      # Console, urls.log, critical_errors.log
│   ├── clients/
│   │   ├── bark.py            # Bark HTTP client
│   │   ├── gist.py            # GitHub Gist API client
│   │   └── ytdlp.py           # yt-dlp subprocess wrapper
│   ├── services/
│   │   ├── monitor.py         # Main three-layer monitoring flow
│   │   ├── bilibili.py        # Bilibili and yt-dlp operations
│   │   ├── frequency.py       # WGMM service orchestration
│   │   ├── history.py         # Upload timestamp generation/maintenance
│   │   └── notification.py    # Notification messages
│   ├── stores/
│   │   ├── config_store.py    # data/wgmm_config.json
│   │   ├── history_store.py   # data/mtime.txt and miss_history.txt
│   │   └── url_store.py       # data/local_known.txt
│   ├── wgmm/
│   │   ├── constants.py       # Algorithm defaults
│   │   ├── features.py        # Time feature extraction
│   │   ├── learning.py        # Lambda, weights, sigma, period discovery
│   │   ├── scheduler.py       # Next-frequency decision
│   │   └── scoring.py         # Point and batch score calculation
│   └── utils/
├── tests/                     # unittest coverage for stores/services/WGMM
├── docs/
├── requirements.txt
├── pyproject.toml
└── video-monitor.service
```
Runtime files:

```
data/.env               # Manual, ignored
data/cookies.txt        # Manual, ignored
data/local_known.txt    # Generated local URL state
data/wgmm_config.json   # Generated WGMM state
data/mtime.txt          # Generated positive upload history
data/miss_history.txt   # Generated negative check history
urls.log                # Main runtime log
critical_errors.log     # Critical runtime log
```
The app uses two URL sets:
- `memory_urls`: URLs read from the GitHub Gist `urls.txt`.
- `known_urls`: the complete local known state, loaded from `data/local_known.txt` and merged with `memory_urls`.
Only URLs missing from both layers are treated as truly new content.
```
GitHub Gist urls.txt
  -> memory_urls
+ data/local_known.txt
  -> known_urls
  -> compare against current Bilibili scan
  -> truly new URLs
  -> Bark notification + Gist new.txt update
```
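The two-layer merge and comparison can be sketched in a few lines. This is an illustration of the set logic only; the function name `find_new_urls` is an assumption, not the project's actual API.

```python
def find_new_urls(scanned, gist_urls, local_known):
    """Return scanned URLs absent from both layers (truly new content)."""
    memory_urls = set(gist_urls)                  # layer 1: Gist urls.txt
    known_urls = memory_urls | set(local_known)   # layer 2: merged local state
    return [u for u in scanned if u not in known_urls]
```

Because a URL must be missing from both the Gist state and the local state to count as new, a stale copy on either side cannot trigger a duplicate notification.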
The monitor flow is implemented in `wgmm_monitor/services/monitor.py`:
- Sync known URLs from Gist.
- Run multi-part precheck.
- Run latest-video ID precheck.
- If either precheck finds a change, fetch and expand the full video list.
- Save real upload timestamps for new URLs.
- Notify through Bark and write `new.txt` to Gist.
- Ask WGMM for the next check time.
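The steps above can be sketched as one cycle function with the collaborators injected. The callables here are stand-ins for the real Gist/Bilibili/Bark/WGMM components, and `run_cycle` itself is a hypothetical name, not the actual code in `services/monitor.py`.

```python
# Hypothetical sketch of one monitor cycle; collaborators are injected callables.
def run_cycle(sync_known, precheck_parts, precheck_latest, fetch_full_list,
              save_timestamps, notify, next_check_time):
    known = sync_known()                              # 1. sync known URLs
    changed = precheck_parts() or precheck_latest()   # 2-3. cheap prechecks
    new_urls = []
    if changed:                                       # 4. expand only on a change
        new_urls = [u for u in fetch_full_list() if u not in known]
        if new_urls:
            save_timestamps(new_urls)                 # 5. record upload timestamps
            notify(new_urls)                          # 6. Bark + Gist new.txt
    return next_check_time(found_new=bool(new_urls))  # 7. WGMM decision
```

Keeping the prechecks ahead of the full list expansion is what saves network requests: most cycles end after two cheap calls.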
The pure algorithm layer is under `wgmm_monitor/wgmm/`.

- `features.py` encodes day, week, month-week, year-month, and optional `custom_N` periods with sin/cos features.
- `learning.py` filters outliers, learns adaptive lambda/sigma/weights, and discovers non-calendar periods with autocorrelation when enough data exists.
- `scoring.py` computes positive and negative weighted Gaussian scores.
- `scheduler.py` scans the next 15 days, maps relative score to an interval, applies peak advance based on observed `yt-dlp` duration, and updates `WgmmConfig`.
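The core idea of sin/cos period encoding plus weighted Gaussian scoring can be illustrated with just the day period. This is a simplified sketch under assumed names (`day_features`, `weighted_gaussian_score`), not the exact formulas in `features.py` or `scoring.py`; see `docs/wgmm-algorithm.md` for the real definitions.

```python
import math

def day_features(ts_hours: float) -> tuple[float, float]:
    # Map time-of-day onto the unit circle so 23:00 and 01:00 are close.
    # The real features.py encodes several periods this way, not just the day.
    angle = 2 * math.pi * (ts_hours % 24) / 24
    return math.sin(angle), math.cos(angle)

def weighted_gaussian_score(candidate_hours, history_hours, weights, sigma=0.5):
    # Sum of weighted Gaussian kernels in feature space (illustrative only).
    cs, cc = day_features(candidate_hours)
    score = 0.0
    for h, w in zip(history_hours, weights):
        hs, hc = day_features(h)
        d2 = (cs - hs) ** 2 + (cc - hc) ** 2
        score += w * math.exp(-d2 / (2 * sigma ** 2))
    return score
```

A candidate time near the uploader's historical upload hours scores high, so the scheduler can poll more often around those hours and back off elsewhere.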
See docs/wgmm-algorithm.md and docs/wgmm-config-params.md.
```shell
source .venv/bin/activate
ruff check monitor.py wgmm_monitor tests
ruff format monitor.py wgmm_monitor tests
python -m unittest discover -s tests
python monitor.py --wgmm-core-only
python monitor.py --dev
```

For code changes, run Ruff and the full unittest suite. Use
--wgmm-core-only to isolate the scheduler from Bilibili/Gist/Bark calls, and
use --dev to exercise the full detection chain without writing WGMM config or
sending new-video notifications.
Useful checks:
```shell
source .venv/bin/activate
which yt-dlp
yt-dlp --version
ls -l data/cookies.txt
tail -100 urls.log
cat critical_errors.log
sudo journalctl -u video-monitor -n 100
```

Common cases:
- Missing environment variables: startup prints `缺少必要的环境变量` ("missing required environment variables") and exits.
- Missing or empty cookies: startup logs a critical error and exits.
- `yt-dlp` missing from `PATH`: `YtDlpClient` logs an error and returns a failed result; check `which yt-dlp`.
- Gist fetch failure: the cycle logs a critical error. If there is no baseline URL data, the cycle is skipped.
- Bilibili rate limiting or split expansion failure: the cycle logs a warning, skips detection for that run, and records a non-new-content WGMM decision.
- Notification failure: the monitor logs the failure; URL state and frequency decisions still proceed.
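The "failed result instead of a crash" pattern for a missing `yt-dlp` binary can be sketched like this. The names `YtDlpResult` and `run_ytdlp` are assumptions for illustration; the real wrapper is `wgmm_monitor/clients/ytdlp.py`.

```python
import subprocess
from dataclasses import dataclass

@dataclass
class YtDlpResult:
    ok: bool
    stdout: str = ""
    error: str = ""

# Hypothetical sketch of the failure path in clients/ytdlp.py: a missing
# executable or a timeout becomes a failed result, not an unhandled exception.
def run_ytdlp(args, executable="yt-dlp", timeout=120):
    try:
        proc = subprocess.run([executable, *args], capture_output=True,
                              text=True, timeout=timeout)
    except FileNotFoundError:
        return YtDlpResult(ok=False, error=f"{executable} not found on PATH")
    except subprocess.TimeoutExpired:
        return YtDlpResult(ok=False, error="yt-dlp timed out")
    if proc.returncode != 0:
        return YtDlpResult(ok=False, error=proc.stderr.strip())
    return YtDlpResult(ok=True, stdout=proc.stdout)
```

Returning a result object lets the monitor log the error, skip detection for that run, and still hand WGMM a frequency decision.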
- README_CN.md: Chinese user guide
- CONTRIBUTING.md: contribution workflow
- docs/development-guide.md: development and troubleshooting guide
- docs/code_logic_flow.md: current architecture flow
- docs/code-reference.md: module reference
- docs/wgmm-algorithm.md: algorithm details
- docs/wgmm-config-params.md: config fields
- docs/wgmm-universality-analysis.md: algorithm applicability analysis
- docs/adr/002-do-not-adopt-x-algorithm-techniques.md
- docs/adr/003-avoid-large-refactoring.md
- docs/adr/004-fix-cascade-false-detection.md
- `data/.env` and `data/cookies.txt` are ignored by Git.
- Do not commit Gist tokens, Bark keys, cookies, or logs with sensitive content.
- The systemd unit should run with the provided project paths and sandbox settings.
MIT License