Security: harden tile proxy against third-party abuse (#204)#205
Merged
Security: harden tile proxy against third-party abuse (#204)#205
Conversation
…d budget, and auth rate limits (#204) - Replace fixed-window rate limiter with sliding-window counter approximation to prevent boundary-batching attacks - Add authenticated user rate limiting by userId (default 2000/min) — previously authenticated users bypassed rate limiting entirely - Add outbound request budget (token-bucket at 2/sec, burst 4) to prevent cache-miss cascading from overwhelming upstream OSM - Add X-Content-Type-Options: nosniff on tile responses - Clean shutdown of outbound budget replenisher via IHostApplicationLifetime
…lity - CRITICAL: Add TileRateLimitAuthenticatedPerMinute to admin settings UI, controller tracking, and property assignment; fix incorrect "never limited" text - HIGH: Reduce outbound burst to 2 (OSM 2-connection policy), increase timeout to 10s for cold-cache resilience - HIGH: Fix CTS disposal race in StopReplenisher — cancel without disposing, let GC handle abandoned CTS; mark fields volatile for cross-thread visibility - HIGH: Bump default anonymous rate limit from 500 to 600 to compensate for stricter sliding-window algorithm - MEDIUM: Fall back to IP-based rate limiting when authenticated user lacks NameIdentifier claim instead of pooling into shared "unknown" bucket - MEDIUM: Update sliding-window jitter docs to reflect actual error bound (prevCount × ~0.5 during rotation, not "~1") - MEDIUM: Separate Stop() for production shutdown from ResetForTesting() to avoid unnecessary drain/refill during app shutdown - MEDIUM: Add test for outbound budget exhaustion returning false - LOW: Fix eviction _currentCacheSize undercount — decrement after SaveChangesAsync succeeds, not before - LOW: Move File.Exists check inside _cacheLock in eviction to fix TOCTOU - LOW: Coalesce concurrent cleanup runs with Interlocked flag; reduce cleanup threshold from 100k to 10k entries
Add hidden field fallback for TileRateLimitEnabled and IsRegistrationOpen checkboxes so unchecking them correctly posts "false" to the model binder. Without the hidden field, unchecked checkboxes send no value and ASP.NET falls back to the C# property default (true for TileRateLimitEnabled), making it impossible to disable rate limiting from the admin UI.
…d sweep, concurrency - CRITICAL: Reverse eviction order — commit DB deletions before file deletes to prevent orphaned DB records when SaveChangesAsync fails - MEDIUM: Replace shared static _cleanupInProgress with per-cache flag so independent caches (anonymous, authenticated, image proxy) don't block cleanup - MEDIUM: Add RateLimitCleanupJob (Quartz, every 5 min) to sweep expired entries from all rate limit caches, preventing unbounded memory growth below threshold - LOW: Reorder OutboundBudget.StopReplenisher to cancel-then-replace, eliminating brief window where two replenishers overlap and double-release tokens - LOW: Log warning when authenticated user lacks NameIdentifier claim and falls back to IP-based rate limiting (safe-side default, but diagnostically useful) - LOW: Validate X-Forwarded-For with IPAddress.TryParse before using as rate limit key to prevent arbitrary strings from creating unique buckets
…s, scoping, retry - CRITICAL: Remove global read-lock from tile cache — file reads no longer serialize through _cacheLock; catch IOException as cache miss instead. Writes and deletes retain the exclusive lock. - CRITICAL: Increase outbound budget burst capacity from 2 to 10. OSM's 2-connection policy now enforced at transport layer via SocketsHttpHandler.MaxConnectionsPerServer instead of semaphore capacity. - HIGH: Add eviction coalescing with Interlocked.CompareExchange guard — concurrent CacheTileAsync calls can no longer trigger simultaneous eviction runs. Add DbUpdateConcurrencyException handling in eviction. - HIGH: EvictDbTilesAsync now creates its own IServiceScope instead of using per-request _dbContext, preventing disposed-context failures. - HIGH: CacheTileAsync breaks immediately on outbound budget exhaustion instead of retrying 3x (up to 30s thread block). - MEDIUM: Add cross-field validation — authenticated rate limit must be >= anonymous rate limit in admin settings. - MEDIUM: Document rate limit strategy (auth/anon mutually exclusive).
…est robustness - MEDIUM: Split 304 path by zoom level — zoom >= 9 reads without lock (no sidecar write needed), consistent with other lock-free read paths. Zoom < 9 retains lock for atomic sidecar write + file read. - MEDIUM: PurgeAllCacheAsync now uses DB-first ordering (consistent with EvictDbTilesAsync) — batch DB deletions commit before file deletes. If DB fails, no files are deleted and cache stays consistent. - MEDIUM: Retry loop acquires outbound budget token only on first attempt. Retries on HTTP 5xx reuse the already-consumed token via skipBudget parameter, preventing upstream failures from exhausting burst budget. - MEDIUM: TilesControllerTests clears static rate limit caches in constructor and uses deterministic IPs instead of Random(). - MEDIUM: SingleScopeFactory creates independent DbContext per scope sharing the same in-memory DB name, mirroring production behavior where scoped contexts have separate change trackers.
…, entity tracking - HIGH: PurgeLRUCacheAsync restructured to DB-first ordering (consistent with EvictDbTilesAsync) — commit DB deletions before file deletes. Removed broken explicit transaction that caused retry failures on CommitAsync. - HIGH: PurgeAllCacheAsync orphan query replaced File.Exists (untranslatable to SQL) with client-side HashSet filtering of existing disk files. - MEDIUM: CacheTileAsync concurrency retry uses entry.ReloadAsync() instead of ToObject() to avoid creating an untracked entity that conflicts with the original tracked instance on the next Update() call. - MEDIUM: EvictDbTilesAsync loads candidates with AsNoTracking + re-fetches by ID before delete, so a concurrent LastAccessed update doesn't cause stale RowVersion conflicts that permanently block eviction. - MEDIUM: StopReplenisher captures CTS in local variable before assigning to Lazy lambda, preventing shared CTS between instances on rapid sequential calls.
- MEDIUM: PurgeBatchAsync only decrements _currentCacheSize for DB-tracked tiles (zoom >= 9, Meta != null). Previously decremented for all files including zoom 0-8, driving the counter negative after purge-all and permanently disabling eviction until app restart.
…rrency guards - PurgeBatchAsync/PurgeLRUCacheAsync: clear change tracker + re-fetch by ID inside retry lambda to prevent entity tracking conflicts on retry - PurgeAllCacheAsync orphan query: AsNoTracking + project only Id/TileFilePath to reduce memory footprint from full entity load - RateLimitHelper: swap rotation order to atomic capture-then-assign, eliminating undercount gap where concurrent increments were lost - OutboundBudget: add lock to StopReplenisher preventing concurrent CTS/Lazy interleaving that could orphan a replenisher - Document _dbContext lifecycle safety and Referer check design rationale
…te limit hardening - CRITICAL: Add unique composite index on TileCacheMetadata(Zoom, X, Y) — eliminates sequential scans on every tile request hot path - CRITICAL: PurgeAllCacheAsync bulk-loads metadata in one query instead of O(N) individual DB queries per cached PNG file - HIGH: Eviction _currentCacheSize now computed from re-fetched entity sizes instead of stale projected sizes, reducing drift - HIGH: Outbound budget AcquireTimeout reduced from 10s to 3s to prevent thread pool starvation under cold-cache load; CancellationToken threaded through Send*/Cache*/Retrieve* chain to HttpContext.RequestAborted - HIGH: Eviction and purge file deletion consolidated into single lock acquisition per batch, eliminating convoy effects from per-file locking - MEDIUM: X-Forwarded-For IPs normalized (IPv4-mapped IPv6 → IPv4) to prevent aliasing from creating separate rate-limit buckets - MEDIUM: Rate limit cache hard cap (50K entries) with oldest-entry eviction prevents unbounded memory growth from sustained low-rate attacks - MEDIUM: Periodic _currentCacheSize reconciliation via RateLimitCleanupJob corrects accumulated drift from non-atomic size tracking - MEDIUM: PurgeLRUCacheAsync chunked into 1000-ID batches to prevent PostgreSQL query plan explosion from large IN clauses - MEDIUM: Sliding-window documentation corrected — worst-case jitter is up to full prevCount during rotation, not ~0.5
…drift, revalidation cancellation - HIGH: Add per-IP outbound budget tracking (default 30 cache misses/min/IP) to prevent a single client from monopolizing global outbound tokens via uncached tile requests. Configurable in Admin Settings, 0 = disabled. - HIGH: Normalize IPv4-mapped IPv6 on the direct-IP return path in GetClientIpAddress — previously only normalized on the X-Forwarded-For path, allowing rate limit bypass via dual-stack aliasing. - MEDIUM: PurgeBatchAsync and PurgeLRUCacheAsync now capture actual sizes from re-fetched entities (not stale projected sizes) for _currentCacheSize decrement, consistent with EvictDbTilesAsync pattern. - MEDIUM-LOW: Revalidation coalescing now uses CancellationToken.None for the outbound HTTP request so a disconnecting first caller doesn't cancel the request for all coalesced waiters.
… revalidation budget - MEDIUM: PurgeAllCacheAsync replaces ToDictionaryAsync with foreach-based dictionary build to handle anomalous duplicate TileFilePath values gracefully (last-wins) instead of throwing ArgumentException at runtime. - MEDIUM: PurgeBatchAsync and PurgeLRUCacheAsync file deletion now uses chunked lock acquisition (100 files per lock) instead of holding _cacheLock for the entire batch. Allows CacheTileAsync writes to interleave during large purges. - MEDIUM: Add explicit AddHttpContextAccessor() in ConfigureServices() so TileCacheService's dependency is not fragile on implicit transitive registration. - MEDIUM: Per-IP outbound budget now works for coalesced revalidation flights. Client IP is captured eagerly in RetrieveTileAsync (where HttpContext is available) and threaded through RevalidateTileAsync → SendConditionalTileRequestAsync → SendTileRequestCoreAsync. Previously, HttpContext was null for coalesced tasks, silently bypassing the per-IP budget check.
- HIGH: CacheTileAsync now catches DbUpdateException on concurrent tile insert. Two requests for the same uncached tile both saw null and called Add; one hit the unique index constraint. Now handled as a benign race at debug log level. - MEDIUM: PurgeAllCacheAsync bulk-load projects only Id + TileFilePath with AsNoTracking instead of loading full entities (geometry, ETag, RowVersion). PurgeBatchAsync uses lightweight (int? MetaId, string, long) tuples. - MEDIUM: RetryOperationAsync catches only DbUpdateException instead of all exceptions. Non-transient errors propagate immediately. Retry log level changed from Error to Warning.
ProxyImageRateLimitEnabled and ProxyImageRateLimitPerMinute were defined in ApplicationSettings and consumed by TripViewerController/TripsController but never exposed in the admin settings UI or wired in SettingsController. Admins could not toggle or tune proxy image rate limiting. - Add checkbox + hidden-field fallback for ProxyImageRateLimitEnabled - Add numeric input for ProxyImageRateLimitPerMinute - Add audit tracking and property assignment in SettingsController.Update
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.