"Welcome, traveler. I am Scrybe. You have just gifted me a fingerprint.
My task is to remember it, enrich it, and test its truth."
Scrybe is a high-fidelity, Rust-powered browser observation system designed to detect and understand automation with forensic granularity. It is equal parts data collector, behavior profiler, and session fingerprint historian, engineered to act as a sophisticated anti-bot detection engine and a training ground for resilient bot defenses.
More than a passive observer, Scrybe is a vigilant system that watches browsers with contextual memory and scientific rigor. Its mission is not just to block bots but to understand them, adapt to them, and learn from every interaction.
Species: Autonomous Rust Intelligence
Personality: Scholarly, curious, and unflinchingly meticulous
Scrybe documents all who visit its domain: not to judge, but to remember. Every movement, header, and anomaly becomes a piece of a broader behavioral mosaic.
- Humans find Scrybe charming
- Bots find it uncanny
Canvas, WebGL, and audio fingerprinting:
- Multi-layer canvas tests (anti-spoofing)
- Font enumeration patterns
- DOM feature detection
- WebDriver presence analysis
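Once the SDK reports signals like these, the server side can reduce them to coarse automation hints. The sketch below is illustrative only: `BrowserSignals`, `AutomationHint`, and `automation_hints` are hypothetical names, and the payload shape is an assumption rather than the schema defined in the RFCs. It relies on two widely documented behaviors: `navigator.webdriver` is `true` in WebDriver-controlled browsers, and headless Chrome has historically advertised `HeadlessChrome` in its User-Agent.

```rust
/// Client-side signals as a hypothetical SDK payload might report them.
#[derive(Debug, Default)]
pub struct BrowserSignals {
    /// Value of `navigator.webdriver` observed in the page.
    pub webdriver: bool,
    /// Raw User-Agent string.
    pub user_agent: String,
    /// Number of fonts the enumeration probe detected.
    pub font_count: usize,
    /// Hash of the canvas rendering, if the probe produced one.
    pub canvas_hash: Option<u64>,
}

/// Coarse hints that a session may be automated (assumed categories).
#[derive(Debug, PartialEq, Eq)]
pub enum AutomationHint {
    WebDriverFlag,
    HeadlessUserAgent,
    NoFonts,
    CanvasUnavailable,
}

/// Collects every hint that applies; an empty vector means nothing stood out.
pub fn automation_hints(signals: &BrowserSignals) -> Vec<AutomationHint> {
    let mut hints = Vec::new();
    if signals.webdriver {
        hints.push(AutomationHint::WebDriverFlag);
    }
    if signals.user_agent.contains("HeadlessChrome") {
        hints.push(AutomationHint::HeadlessUserAgent);
    }
    if signals.font_count == 0 {
        hints.push(AutomationHint::NoFonts);
    }
    if signals.canvas_hash.is_none() {
        hints.push(AutomationHint::CanvasUnavailable);
    }
    hints
}
```

A hint is not a verdict: each one is cheap to spoof or to trigger innocently, so in practice they would feed a scoring stage rather than block a session on their own.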
ML-driven behavioral baselines:
- Percentile-based thresholds (adaptive)
- Deviation vector flagging
- Fingerprint similarity clustering (MinHash)
- Real-time anomaly scoring
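As a rough illustration of the fingerprint similarity clustering item above, the following sketch estimates the Jaccard similarity of two feature sets with MinHash. The feature strings, `NUM_HASHES`, and both function names are assumptions made for the example, not details from the Scrybe RFCs. It uses the standard library's `DefaultHasher` with a per-slot seed as a stand-in hash family; its output is not guaranteed to be stable across Rust releases, so persisted signatures would need a fixed, documented hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Number of hash functions in a signature (assumed value).
pub const NUM_HASHES: usize = 64;

/// Hashes `item` under the hash function identified by `seed`.
fn seeded_hash(seed: u64, item: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    item.hash(&mut hasher);
    hasher.finish()
}

/// Builds a MinHash signature: for each seeded hash function, keep the
/// minimum hash value seen over all features in the set.
pub fn minhash_signature(features: &[&str]) -> [u64; NUM_HASHES] {
    let mut signature = [u64::MAX; NUM_HASHES];
    for (seed, slot) in signature.iter_mut().enumerate() {
        for feature in features.iter().copied() {
            *slot = (*slot).min(seeded_hash(seed as u64, feature));
        }
    }
    signature
}

/// Fraction of matching slots, which approximates the Jaccard similarity
/// of the two underlying feature sets.
pub fn estimated_similarity(a: &[u64; NUM_HASHES], b: &[u64; NUM_HASHES]) -> f64 {
    let matches = a.iter().zip(b.iter()).filter(|(x, y)| x == y).count();
    matches as f64 / NUM_HASHES as f64
}
```

Because each slot keeps only a minimum, the signature does not depend on the order in which features are supplied, and two sessions can be compared in fixed time regardless of how many features each one carries.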
GDPR-compliant from the ground up:
- Zero PII collection
- Salted hash fingerprints
- Explicit consent for EU visitors
- Data Processing Agreement templates
- 90-day automatic retention
┌────────────┐     ┌───────────────┐     ┌──────────────────┐     ┌───────────────┐
│  Browser   │ ──> │   Ingestion   │ ──> │ Enrichment & ML  │ ──> │  ClickHouse   │
│  (JS SDK)  │     │  Gateway/API  │     │  Fingerprinting  │     │    Storage    │
└────────────┘     └───────────────┘     └──────────────────┘     └───────────────┘
                           │                                              │
                           ▼                                              ▼
                   ┌────────────────┐                            ┌────────────────┐
                   │ Session Cache  │                            │   Analyst UI   │
                   │    (Redis)     │                            │   Dashboard    │
                   └────────────────┘                            └────────────────┘
- Core Engine: Rust (TigerStyle compliant)
- JavaScript SDK: TypeScript with bounded collections
- Storage: ClickHouse (columnar analytics)
- Session Cache: Redis (sub-millisecond lookups)
- ML Pipeline: Percentile-based anomaly detection
- Security: HMAC-SHA256 auth, TLS 1.3, nonce validation
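The percentile-based anomaly detection named in the stack can be sketched with a nearest-rank percentile over a baseline of observed values. This is a minimal example under stated assumptions: the function names, the choice of nearest-rank interpolation, and the idea of scoring a single numeric feature (for instance, an inter-event timing) are illustrative and not taken from the RFCs.

```rust
/// Nearest-rank percentile of `samples` for `p` in 1..=100.
/// Returns `None` for an empty slice or an out-of-range `p`.
pub fn percentile(samples: &[f64], p: usize) -> Option<f64> {
    if samples.is_empty() || p == 0 || p > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Integer ceiling of p * n / 100 gives the 1-based nearest rank.
    let rank = (p * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}

/// Flags `observed` when it exceeds the `p`-th percentile of the baseline.
/// Returns `None` when no threshold can be computed.
pub fn is_anomalous(baseline: &[f64], observed: f64, p: usize) -> Option<bool> {
    percentile(baseline, p).map(|threshold| observed > threshold)
}
```

Recomputing the threshold over a sliding window of recent sessions is one way to make it adaptive: the cut-off then follows the traffic instead of being a hard-coded constant.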
| Metric | Target | Status |
|---|---|---|
| Ingestion throughput | 100k sessions/sec | 🎯 Designed |
| Query latency (p99) | < 100ms | 🎯 Designed |
| Fingerprint generation | < 5ms | 🎯 Designed |
| Redis lookup | < 1ms | 🎯 Designed |
| Storage compression | 10-20:1 ratio | 🎯 Designed |
- ✅ HMAC-SHA256 API authentication
- ✅ Anti-replay protection (nonce validation)
- ✅ Bounded collections (DoS prevention)
- ✅ Rate limiting per IP and session
- ✅ Security headers (HSTS, CSP, X-Frame-Options)
- ✅ Graceful degradation (circuit breakers)
- ✅ IP hashing (SHA-256 salted)
- ✅ No PII collection
- ✅ GDPR Article 6(1)(a) compliance
- ✅ Explicit consent for EU visitors
- ✅ Data Processing Agreement templates
- ✅ Right to erasure (delete by fingerprint)
- ✅ 90-day TTL with automatic cleanup
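The anti-replay and bounded-collection items above can be combined in one small structure: a nonce cache that rejects repeats within a time window and never grows past a fixed capacity. The sketch below is an in-memory illustration only; `NonceCache`, its methods, and the evict-oldest policy are assumptions, and the design described here keeps session state in Redis rather than in process memory.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Remembers recently seen nonces for `ttl`, holding at most `capacity` entries.
pub struct NonceCache {
    ttl: Duration,
    capacity: usize,
    seen: HashMap<String, Instant>,
    /// Nonces in insertion order, oldest first.
    order: VecDeque<String>,
}

impl NonceCache {
    pub fn new(ttl: Duration, capacity: usize) -> Self {
        Self {
            ttl,
            capacity: capacity.max(1), // a zero-sized cache could never detect a replay
            seen: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Number of nonces currently remembered.
    pub fn len(&self) -> usize {
        self.seen.len()
    }

    /// Returns `true` and records the nonce if it is fresh,
    /// or `false` if it was already seen within the TTL (a replay).
    pub fn check_and_insert(&mut self, nonce: &str, now: Instant) -> bool {
        // Drop expired entries, oldest first.
        loop {
            let expired = match self.order.front() {
                Some(oldest) => self
                    .seen
                    .get(oldest)
                    .map_or(true, |&at| now.saturating_duration_since(at) >= self.ttl),
                None => break,
            };
            if !expired {
                break;
            }
            if let Some(key) = self.order.pop_front() {
                self.seen.remove(&key);
            }
        }
        if self.seen.contains_key(nonce) {
            return false;
        }
        // Stay bounded: evict the oldest entries when the cache is full.
        while self.seen.len() >= self.capacity {
            match self.order.pop_front() {
                Some(key) => {
                    self.seen.remove(&key);
                }
                None => break,
            }
        }
        self.seen.insert(nonce.to_owned(), now);
        self.order.push_back(nonce.to_owned());
        true
    }
}
```

Evicting the oldest entry keeps memory bounded under load, at the cost that an evicted nonce could be replayed before its TTL ends. Rejecting new requests when the cache is full is the stricter alternative, but it lets an attacker deny service by filling the cache.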
This repository contains comprehensive RFC documentation (v0.2.0):
- RFC-0001 - Core Architecture
- RFC-0002 - JavaScript SDK (Browser Agent)
- RFC-0003 - Rust Ingestion Gateway
- RFC-0004 - Fingerprinting & Enrichment
- RFC-0005 - ClickHouse Storage Schema
- RFC-0006 - Redis Session Management
- RFC-0007 - Security & Privacy
Additional Resources:
- Vision Document - Complete product vision
- RFC Index - Master index
- Changelog v0.2.0 - Recent updates
- Review Summary - Design review findings
Scrybe follows TigerStyle principles:
- Safety First - No panics, all errors via `Result`
- Simplicity - Clear over clever, explicit over implicit
- Correctness - Type-driven design, >90% test coverage
- Performance - Fast by default, profile before optimizing
- Minimal Dependencies - Each dependency justified
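To make the "no panics, all errors via `Result`" principle concrete, here is a small sketch of a parser that reports every failure as a typed error carrying context, with no `unwrap()` or `panic!()`. The `key=value` input format, `ParseError`, and `parse_metrics` are hypothetical and exist only for this example.

```rust
use std::fmt;

/// Every way parsing can fail, with enough context to locate the problem.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Empty,
    MissingSeparator { line: usize },
    InvalidValue { line: usize, value: String },
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::Empty => write!(f, "input is empty"),
            ParseError::MissingSeparator { line } => {
                write!(f, "line {}: expected `key=value`", line)
            }
            ParseError::InvalidValue { line, value } => {
                write!(f, "line {}: `{}` is not a valid u32", line, value)
            }
        }
    }
}

impl std::error::Error for ParseError {}

/// Parses newline-separated `key=value` pairs where each value is a `u32`.
pub fn parse_metrics(input: &str) -> Result<Vec<(String, u32)>, ParseError> {
    if input.trim().is_empty() {
        return Err(ParseError::Empty);
    }
    let mut metrics = Vec::new();
    for (index, raw) in input.lines().enumerate() {
        let line = index + 1; // 1-based line numbers for error messages
        let (key, value) = raw
            .split_once('=')
            .ok_or(ParseError::MissingSeparator { line })?;
        let value = value.trim();
        let parsed = value.parse::<u32>().map_err(|_| ParseError::InvalidValue {
            line,
            value: value.to_owned(),
        })?;
        metrics.push((key.trim().to_owned(), parsed));
    }
    Ok(metrics)
}
```

The caller decides what a failure means: it can log the `Display` text, map the variant to an HTTP status, or propagate it with `?`. The function itself never aborts the process.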
At 10,000 requests/second sustained:
| Component | Monthly Cost | Optimization Potential |
|---|---|---|
| ClickHouse (90-day retention) | $3,200 | 66% with 30-day retention |
| Redis (1-hour session cache) | $1,200 | Optimized |
| Data Transfer | $270 | 90% with 10% sampling |
| Backups (S3) | $700 | - |
| Total | $7,264/month | $2,200/month (optimized) |
Version: v0.2.0 (RFC Phase)
Status: 🎯 Design Complete - Ready for Implementation
- ✅ Complete RFC suite (7 documents)
- ✅ Multi-disciplinary review (10 expert perspectives)
- ✅ All critical blockers addressed
- ✅ Security hardening (authentication, replay protection)
- ✅ GDPR compliance (consent, DPA templates)
- ✅ Production readiness (health checks, disaster recovery)
- 🔨 Phase 1: Core infrastructure (Weeks 1-2)
- 🔒 Phase 2: Security features (Weeks 3-4)
- 🧪 Phase 3: SDK & enrichment (Weeks 5-6)
- 💾 Phase 4: Storage & reliability (Weeks 7-8)
- ✅ Phase 5: Testing & hardening (Weeks 9-10)
Timeline: 10 weeks to production-ready system
This is a private repository. Contributions are welcome from authorized collaborators.
- Follow TigerStyle guidelines
- Maintain >90% test coverage
- Document all public APIs
- No `unwrap()` or `panic!()` in production code
- Explicit error handling with context
Private & Proprietary
"The best defense is not to be invisible, but to be understood."
Scrybe doesn't just detect bots; it studies them. Every fingerprint, every behavioral anomaly, every timing quirk becomes part of a living knowledge base. The system learns, adapts, and evolves.
As its name suggests, Scrybe is both scribe (recorder of truth) and scryer (diviner of hidden meaning). It sees not just what browsers do, but what they are.
Built with Rust 🦀 | Powered by Curiosity 🦉 | Guided by TigerStyle 🐯
