Fix/issue 1071#1256
Closed
rootp1 wants to merge 8 commits into GreedyBear-Project:main
…reedyBear-Project#1103) * implemented * new * copilot changes * thorugh frontend for /trending * formatting error * FeedsThrottle * copilot changes * qualified format check * different trending scope for throttle * indentation * adding new tests plus env boundaries * migration name fix * migration name change * remove frontend from pr * model mismatch * removing the block from pieline * refactored validation from settings * utils delocate * duplicate test leftovers * BucketRepo + SnapshotRepo style * feed throttling for trending * response Caching, no snapshots * snapshots cleanup * snapshot removed from models * batch size of upsert changed to 10k(optimization) * removing api files out of the pr * validation updated * task scheduled for cleanup of activity bucket * excluding invalid ips in trending * imprvements in trending cronjob * review 2 * later fix, (keeping pr scope maintained) * ruff formatted
…GreedyBear-Project#1224 (GreedyBear-Project#1239) * feat: add ?reason= param to feeds_share and token list endpoint - Accept optional ?reason= query parameter on GET /api/feeds/share. The value is persisted in ShareToken.reason (truncated to 256 chars). Existing tokens are not overwritten (get_or_create semantics). - Add GET /api/feeds/tokens/ (authenticated) — returns only the calling user's tokens with safe metadata: hash_prefix (12 hex chars), reason, created_at, revoked, revoked_at. The raw token is never exposed. - Add comprehensive tests covering both features (13 test cases). Closes GreedyBear-Project#1224 * fix: ensure idempotent feed token generation and improve logging and query efficiency in share views
…roject#1247) * convert hit to dict in _bucket_key_from_hit and use correct type hint * adapt tests
…ject#1181 (GreedyBear-Project#1197) * feat: add ISO country code mapping utility and update statistics API to include country codes * fix: make ISO code lookup case-insensitive in getStandardMapName and update test expectations * feat: add Antarctica, N. Cyprus, Somaliland, and Kosovo to ISO mapping and tests * test: update test data structures to include country codes and expected interaction fields * refactor: reformat migration files for consistent style and readability * test: add expected_interactions parameter to mock data in tests/__init__.py * test: update country code filter test values to IT and FR * refactor: replace custom country name normalization with ISO-3166-1 alpha-2 numeric lookups for map rendering and store aggregation
GreedyBear-Project#1209) * Fix include_similar returning invalid sessions (duration <= 0) and add regression test * Fix queryset union issue causing CI failure * Format files with Ruff * Keep only duration filter in related sessions
…reedyBear-Project#1056 (GreedyBear-Project#1244) * add security hardening settings to settings.py * add tests for security settings * test: verify cookie security matches environment * fix trailing whitespace in test file
Pull request overview
Adds support for “trending attackers” via hourly aggregated activity buckets, tightens a few extraction/Elasticsearch behaviors, and updates feeds sharing + country statistics to include ISO country codes for frontend map rendering.
Changes:
- Introduce `AttackerActivityBucket` storage + repository utilities, bucket update logic in extraction, and a scheduled cleanup cronjob.
- Extend feeds sharing with an optional `reason` label and add an authenticated endpoint to list a user’s share tokens.
- Update attacker-country statistics to return ISO alpha-2 codes; refactor frontend country aggregation + map coloring to use ISO codes (dropping name-normalization).
Reviewed changes
Copilot reviewed 42 out of 43 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/test_utils.py | Adds unit tests for is_non_global_ip() helper. |
| tests/test_trending_bucket_repository.py | Adds tests for trending bucket repository query/params and DB behaviors. |
| tests/test_tasks.py | Updates task wrapper list to include trending bucket cleanup task. |
| tests/test_elastic_repository.py | Adds regression test for .source() reassignment and asserts required fields include type. |
| tests/greedybear/test_security_settings.py | Adds settings-level security assertions (nosniff, X-Frame-Options, cookie flags). |
| tests/greedybear/management/test_setup_schedules.py | Validates schedule creation includes trending bucket cleanup cron entry. |
| tests/greedybear/cronjobs/test_trending.py | Adds tests for trending helpers, bucket updates, and cleanup cron validation. |
| tests/greedybear/cronjobs/test_extraction_pipeline_edge_cases.py | Ensures bucket updates don’t abort extraction and cache invalidation occurs on updates. |
| tests/api/views/test_statistics_view.py | Extends countries stats test to validate ISO country codes returned. |
| tests/api/views/test_feeds_throttle.py | Removes legacy feed throttle tests. |
| tests/api/views/test_feeds_share_view.py | Adds tests for share-token reason and new /api/feeds/tokens/ endpoint behavior. |
| tests/api/views/test_feeds_advanced_view.py | Adjusts country_code filter test data values. |
| tests/api/views/test_cowrie_session_view.py | Adds coverage for excluding zero-duration sessions from “similar” sessions results. |
| tests/__init__.py | Updates shared IOC fixtures to include attacker_country_code. |
| greedybear/utils.py | Introduces is_non_global_ip() helper used by extraction/bucketing. |
| greedybear/tasks.py | Adds clean_up_trending_buckets task entrypoint. |
| greedybear/settings.py | Adds security headers/cookie flags and trending settings defaults; adds shared-feed throttle rate. |
| greedybear/models.py | Adds AttackerActivityBucket model. |
| greedybear/migrations/0050_attackeractivitybucket.py | Creates DB table/indexes/constraint for attacker activity buckets. |
| greedybear/cronjobs/trending.py | Adds ranking/growth helper functions for trending calculations. |
| greedybear/cronjobs/schedules.py | Registers hourly trending bucket cleanup schedule. |
| greedybear/cronjobs/repositories/trending_bucket.py | Implements bulk upsert, windowed aggregation, and retention deletion for buckets. |
| greedybear/cronjobs/repositories/elastic.py | Fixes .source() call to use the returned Search object. |
| greedybear/cronjobs/repositories/__init__.py | Exposes trending bucket repository via wildcard import. |
| greedybear/cronjobs/extraction/utils.py | Uses is_non_global_ip() for non-global filtering in IOC extraction. |
| greedybear/cronjobs/extraction/pipeline.py | Updates extraction pipeline to upsert buckets per chunk + invalidate trending cache version. |
| greedybear/cronjobs/bucket_utils.py | Adds hit→bucket key logic and DB upsert wrapper for per-chunk activity aggregation. |
| greedybear/cronjobs/bucket_cleanup.py | Adds cronjob to enforce bucket retention window. |
| greedybear/consts.py | Adds type to REQUIRED_FIELDS for trending bucketing. |
| frontend/tests/utils/country.test.js | Removes country name normalization unit tests (name-normalization removed). |
| frontend/tests/stores/useAttackerCountriesStore.test.jsx | Updates tests for code-based aggregation and codeless skipping. |
| frontend/tests/components/dashboard/AttackOriginMap.test.jsx | Updates map tests to use numeric geo ids → alpha-2 lookup and code-keyed data. |
| frontend/tests/components/dashboard/AttackOriginCountriesChart.test.jsx | Updates chart tests to include ISO codes and new store state shape. |
| frontend/src/utils/country.js | Removes country name normalization mapping utility. |
| frontend/src/stores/useAttackerCountriesStore.jsx | Aggregates attacker-country counts by ISO alpha-2 code and builds sorted list for charts. |
| frontend/src/components/dashboard/AttackOriginMap.jsx | Colors map using geo numeric id → alpha-2 conversion (no name matching table). |
| frontend/package.json | Adds i18n-iso-countries dependency. |
| frontend/package-lock.json | Locks i18n-iso-countries and transitive diacritics. |
| docker/env_file_template | Documents new shared-feed throttle rate and trending settings env vars. |
| api/views/statistics.py | Changes countries stats response to include code (alpha-2) alongside country and count. |
| api/views/feeds.py | Adds share reason persistence, safer share logging, and new /api/feeds/tokens/ endpoint. |
| api/views/cowrie_session.py | Filters related “similar” sessions to duration__gt=0. |
| api/urls.py | Adds route for /api/feeds/tokens/. |
Files not reviewed (1)
- frontend/package-lock.json: Language not supported
Comment on lines +73 to +78

```python
unique_keys = update_activity_buckets_from_hits(
    [
        Hit({"_source": {"src_ip": "1.1.1.1", "type": "Cowrie", "@timestamp": "2026-03-20T09:15:00"}}),
        Hit({"_source": {"src_ip": "1.1.1.1", "type": "cowrie", "@timestamp": "2026-03-20T09:50:00"}}),
        Hit({"_source": {"src_ip": "2.2.2.2", "type": "Heralding", "@timestamp": "2026-03-20T09:10:00"}}),
    ]
```
Comment on lines +23 to +42

```python
def _bucket_key_from_hit(hit: Hit) -> BucketKey | None:
    hit_dict = hit.to_dict()
    attacker_ip = hit_dict.get("src_ip")
    feed_type = hit_dict.get("type")
    timestamp = hit_dict.get("@timestamp")
    if not attacker_ip or not feed_type or not timestamp:
        return None

    normalized_ip = str(attacker_ip)
    try:
        parsed_ip = ip_address(normalized_ip)
    except ValueError:
        return None

    if is_non_global_ip(parsed_ip):
        return None

    try:
        return normalized_ip, str(feed_type).lower(), _bucket_start(timestamp)
    except Exception:
```
Comment on lines +9 to +11

```python
from greedybear.cronjobs.extraction.utils import parse_timestamp
from greedybear.cronjobs.repositories import TrendingBucketRepository
from greedybear.utils import is_non_global_ip
```
Comment on lines +38 to +40

```python
        bool: True for loopback/private/multicast/link-local/reserved addresses.
    """
    return value.is_loopback or value.is_private or value.is_multicast or value.is_link_local or value.is_reserved
```
Comment on lines 136 to 139

```diff
 for ip, hits in hits_by_ip.items():
     extracted_ip = ip_address(ip)
-    if extracted_ip.is_loopback or extracted_ip.is_private or extracted_ip.is_multicast or extracted_ip.is_link_local or extracted_ip.is_reserved:
+    if is_non_global_ip(extracted_ip):
         continue
```
Comment on lines 92 to +107

```diff
 qs = (
     IOC.objects.filter(last_seen__gte=delta)
     .exclude(attacker_country="")
     .filter(honeypots__active=True)
-    .values("attacker_country")
+    .values("attacker_country", "attacker_country_code")
     .annotate(count=Count("id", distinct=True))
     .order_by("-count")
 )
-data = [{"country": item["attacker_country"], "count": item["count"]} for item in qs]
+data = [
+    {
+        "country": item["attacker_country"],
+        "code": item["attacker_country_code"],
+        "count": item["count"],
+    }
+    for item in qs
+]
```
Comment on lines +51 to +61

```python
def test_is_non_global_ip(self):
    self.assertTrue(is_non_global_ip(ip_address("127.0.0.1")))
    self.assertTrue(is_non_global_ip(ip_address("10.0.0.1")))
    self.assertTrue(is_non_global_ip(ip_address("169.254.1.1")))
    self.assertTrue(is_non_global_ip(ip_address("224.0.0.1")))
    self.assertTrue(is_non_global_ip(ip_address("240.0.0.1")))
    self.assertTrue(is_non_global_ip(ip_address("::1")))
    self.assertTrue(is_non_global_ip(ip_address("fc00::1")))

    self.assertFalse(is_non_global_ip(ip_address("8.8.8.8")))
    self.assertFalse(is_non_global_ip(ip_address("2001:4860:4860::8888")))
```
Findings
greedybear/cronjobs/bucket_utils.py (line 23) still treats src_ip and type differently from the extraction pipeline: it does not strip whitespace before parsing or normalizing. That means hits like " 8.8.8.8 " or "Cowrie " can be accepted by extraction but silently dropped or bucketed under the wrong feed_type in trending. This is the closest match to the kind of runtime inconsistency that caused #1246.
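A minimal sketch of the kind of normalization this finding points at, assuming the fix is simply to strip whitespace before parsing. `normalize_hit_fields` is a hypothetical helper name, not something in the PR, and it only covers the IP and feed-type fields:

```python
from ipaddress import ip_address


def normalize_hit_fields(src_ip, feed_type):
    """Hypothetical helper: strip whitespace before parsing and lowercasing.

    Returns (ip_text, feed_type_text), or None when either field is
    missing, empty after stripping, or not a valid IP address.
    """
    if src_ip is None or feed_type is None:
        return None
    ip_text = str(src_ip).strip()
    type_text = str(feed_type).strip().lower()
    if not ip_text or not type_text:
        return None
    try:
        # ip_address() rejects surrounding whitespace, so parse the stripped text.
        ip_address(ip_text)
    except ValueError:
        return None
    return ip_text, type_text
```

With this shape, `" 8.8.8.8 "` and `"Cowrie "` would land in the same bucket as `"8.8.8.8"` and `"cowrie"` instead of being dropped or split across feed types. Whether the extraction pipeline normalizes in exactly this way should be checked against its code before reusing the helper.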
Open Questions / Residual Risks
greedybear/utils.py (line 30) uses a hand-rolled is_non_global_ip() check instead of not value.is_global. That likely works for the currently tested cases, but it can miss edge categories like unspecified addresses. I’d treat this as a smaller correctness risk, not the primary bug.
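To make the difference concrete, here is a small comparison sketch using only the standard library. The function names are illustrative, and the hand-rolled check is copied from the expression shown in the diff above. One case where the two definitions are documented to differ is the shared address space `100.64.0.0/10`, which Python's `ipaddress` reports as neither private nor global:

```python
from ipaddress import ip_address


def hand_rolled_non_global(value):
    # Same expression as the is_non_global_ip() body quoted in this review.
    return (
        value.is_loopback
        or value.is_private
        or value.is_multicast
        or value.is_link_local
        or value.is_reserved
    )


def non_global_via_is_global(value):
    # The alternative suggested in this finding.
    return not value.is_global


shared_space = ip_address("100.64.0.1")  # RFC 6598 shared address space
hand_rolled_result = hand_rolled_non_global(shared_space)
is_global_result = non_global_via_is_global(shared_space)
```

For `100.64.0.1` the hand-rolled check treats the address as global while `not value.is_global` does not. The two definitions need not agree in the other direction either (multicast is worth checking explicitly), so a swap to `is_global` should be validated against the existing test cases rather than assumed to be a strict superset.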
api/views/feeds.py (line 271) uses DB cache for trending responses. I previously hit a naive/aware datetime comparison path while probing this through a patched test shell. I wouldn’t call that a confirmed production bug yet, but it’s worth keeping in mind if cache/timezone behavior starts acting strangely.
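For reference, this is the general failure mode being described, reproduced with the standard library only. It is not taken from the PR code: `as_aware_utc` is a hypothetical helper, and treating naive values as UTC is an assumption that may not match how the project stores timestamps:

```python
from datetime import datetime, timezone


def as_aware_utc(value):
    """Hypothetical helper: return an aware UTC datetime.

    Assumption: naive datetimes are interpreted as already being in UTC.
    """
    if value.tzinfo is None or value.tzinfo.utcoffset(value) is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


naive = datetime(2026, 3, 20, 9, 15)
aware = datetime(2026, 3, 20, 9, 50, tzinfo=timezone.utc)

try:
    naive < aware  # ordering naive against aware raises TypeError
    comparison_failed = False
except TypeError:
    comparison_failed = True

# After normalizing both sides, the comparison is well defined.
normalized_ok = as_aware_utc(naive) < as_aware_utc(aware)
```

If cached values round-trip through a backend that drops timezone information, normalizing at the comparison site like this is one way to avoid the error; the right default timezone depends on the project's settings.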