Require project_id on list endpoints to prevent full table scans#1133
Require project_id on list endpoints to prevent full table scans#1133
Conversation
Add `require_project_for_list = True` as the default in ProjectMixin. All ViewSets using the mixin now return 400 on list actions if no project_id query parameter is provided, preventing unfiltered queries that cause full table scans and multi-minute response times. Opt out global entities that aren't project-scoped: - TaxonViewSet (taxonomy is global) - TaxaListViewSet (taxa lists are global) - TagViewSet (includes global tags) Also set require_project = True on SummaryView and remove the unfiltered else branch that returned all-projects counts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
✅ Deploy Preview for antenna-preview canceled.
|
✅ Deploy Preview for antenna-ssec canceled.
|
📝 WalkthroughWalkthroughThis change introduces a new Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR hardens the API against expensive unfiltered list queries by making project_id required by default for list actions in ProjectMixin, and by removing the unfiltered code path for the status summary endpoint, while explicitly opting out global resources (taxa, taxa lists, tags).
Changes:
- Default
ProjectMixin.require_project_for_list = Trueand compute “required” based onaction == "list". - Make
SummaryViewrequire a project and delete the unfiltered aggregation branch. - Opt out
TaxonViewSet,TaxaListViewSet, andTagViewSetfrom the new list requirement.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
ami/base/views.py |
Adds list-action project enforcement logic to ProjectMixin.get_active_project() via require_project_for_list. |
ami/main/api/views.py |
Opts out global list endpoints and makes SummaryView require project_id, removing the unfiltered summary path. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| require_project = False # Project required for all actions | ||
| require_project_for_list = True # Project required for list actions |
There was a problem hiding this comment.
The inline comment on require_project is misleading: the attribute is False (project optional), but the comment says it's required for all actions. Please update the comment (or the default) so it matches the actual behavior to avoid confusion when configuring views.
| require_project = False # Project required for all actions | |
| require_project_for_list = True # Project required for list actions | |
| require_project = False # If True, project is required for all actions; otherwise project is optional | |
| require_project_for_list = True # If True, project is required for list actions |
| def get_active_project(self) -> Project | None: | ||
| """ | ||
| Instance method wrapper around the standalone get_active_project function. | ||
| Raises Http404 if project_id is required but not provided or the project does not exist. | ||
|
|
||
| Enforcement logic: | ||
| - If require_project is True, project_id is required for all actions. | ||
| - If require_project_for_list is True (default), project_id is required for list actions. | ||
| - Missing project_id when required raises ValidationError (400). | ||
| - Valid project_id pointing to nonexistent project raises Http404 (404). | ||
| """ | ||
| action = getattr(self, "action", None) | ||
| required = self.require_project or (self.require_project_for_list and action == "list") | ||
|
|
||
| project = get_active_project( |
There was a problem hiding this comment.
ProjectMixin.get_active_project() docstring/behavior suggests require_project_for_list=True enforces project_id for list actions, but that enforcement only happens if the view actually calls self.get_active_project() during request handling. For example, ProjectViewSet inherits ProjectMixin but its list/get_queryset path never calls get_active_project(), so no 400 will be raised. Consider either (a) clarifying the docstring to state this is enforced only when get_active_project() is invoked, or (b) moving the enforcement into a hook that always runs for list actions (e.g., initial() / get_queryset() in the mixin) if the intent is truly "all list endpoints".
| class SummaryView(GenericAPIView, ProjectMixin): | ||
| permission_classes = [IsActiveStaffOrReadOnly] | ||
| require_project = True # Unfiltered summary queries are too expensive | ||
|
|
||
| @extend_schema(parameters=[project_id_doc_param]) | ||
| def get(self, request): |
There was a problem hiding this comment.
SummaryView now requires project_id (require_project = True), but the OpenAPI parameter used here (project_id_doc_param) is defined with required=False, so the generated docs will still show the query param as optional. Consider overriding the schema for this endpoint with a required=True parameter (or introducing a separate required_project_id_doc_param) so the docs match the new 400 behavior.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.
In `@ami/base/views.py`:
- Around line 82-94: Update the get_active_project function docstring to match
the actual parameter name: change any occurrences of "require_project" to
"required" so the docstring accurately documents the boolean parameter used by
get_active_project; verify the description refers to the purpose of the
"required" flag and adjust any examples or parameter lists to use the exact
identifier "required".
In `@ami/main/api/views.py`:
- Around line 1682-1684: ProjectViewSet is broken by the new default
require_project_for_list = True causing
LimitOffsetPaginationWithPermissions._get_project() to call
get_active_project(required=True) and raise Http404 for root-level project
listing; to fix, explicitly set require_project_for_list = False on the
ProjectViewSet class (same pattern as TaxonViewSet/TaxaListViewSet/TagViewSet)
so list actions do not force project extraction, and ensure pagination via
LimitOffsetPaginationWithPermissions._get_project() and get_paginated_response()
will skip required project resolution for this view.
🧹 Nitpick comments (1)
🤖 Fix all nitpicks with AI agents
Verify each finding against the current code and only fix it if needed. In `@ami/base/views.py`: - Around line 82-94: Update the get_active_project function docstring to match the actual parameter name: change any occurrences of "require_project" to "required" so the docstring accurately documents the boolean parameter used by get_active_project; verify the description refers to the purpose of the "required" flag and adjust any examples or parameter lists to use the exact identifier "required".ami/base/views.py (1)
82-94: Logic is sound; minor docstring param-name mismatch on the standalone function.The computed
requiredflag and theHttp404for "valid ID but nonexistent project" are correct. One nit: the docstring for the standaloneget_active_project(line 23) refers to the parameter asrequire_project, but the actual parameter name isrequired.Fix docstring param name
Args: request: The DRF request object kwargs: URL kwargs (optional, used to extract pk from URL paths) - require_project: Whether project_id is required (raises validation error if True and not found) + required: Whether project_id is required (raises validation error if True and not found)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ami/base/views.py` around lines 82 - 94, Update the get_active_project function docstring to match the actual parameter name: change any occurrences of "require_project" to "required" so the docstring accurately documents the boolean parameter used by get_active_project; verify the description refers to the purpose of the "required" flag and adjust any examples or parameter lists to use the exact identifier "required".
| class SummaryView(GenericAPIView, ProjectMixin): | ||
| permission_classes = [IsActiveStaffOrReadOnly] | ||
| require_project = True # Unfiltered summary queries are too expensive |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if ProjectViewSet or its parent classes call get_active_project in any list-related code path
echo "=== ProjectViewSet definition ==="
ast-grep --pattern 'class ProjectViewSet($_$$) {
$$$
}'
echo ""
echo "=== Search for get_active_project calls in views.py ==="
rg -n 'get_active_project' ami/main/api/views.py
echo ""
echo "=== Check if pagination or other mixins call get_active_project ==="
rg -n 'get_active_project' ami/base/Repository: RolnickLab/antenna
Length of output: 1472
🏁 Script executed:
#!/bin/bash
echo "=== Find ProjectViewSet class definition ==="
rg -n 'class ProjectViewSet' ami/main/api/views.py -A 5
echo ""
echo "=== Examine pagination code that calls get_active_project ==="
sed -n '25,35p' ami/base/pagination.py
echo ""
echo "=== Look for require_project_for_list in views.py ==="
rg -n 'require_project' ami/main/api/views.py | head -20
echo ""
echo "=== Check ProjectViewSet's parent classes and methods ==="
rg -n 'class ProjectViewSet' ami/main/api/views.py -A 20Repository: RolnickLab/antenna
Length of output: 2070
🏁 Script executed:
#!/bin/bash
echo "=== Check DefaultViewSet definition ==="
rg -n 'class DefaultViewSet' ami/main/api/views.py -A 10
echo ""
echo "=== Check ProjectMixin and its get_active_project method ==="
rg -n 'class ProjectMixin' ami/base/views.py -A 20
echo ""
echo "=== Look at ProjectPagination ==="
rg -n 'class ProjectPagination' ami/base/pagination.py -B 2 -A 15
echo ""
echo "=== Check the full pagination _get_project method ==="
sed -n '27,35p' ami/base/pagination.py
echo ""
echo "=== Search for require_project usage pattern in the code ==="
rg -n 'require_project' ami/base/views.pyRepository: RolnickLab/antenna
Length of output: 2993
🏁 Script executed:
#!/bin/bash
echo "=== Look for ProjectViewSet URL routing ==="
rg -n 'ProjectViewSet' ami/main/api/ | grep -E '(router|urlpatterns|register)'
echo ""
echo "=== Check URL patterns file ==="
fd -e py 'url' ami/main/api/ | head -5
echo ""
echo "=== Check the get_active_project full implementation ==="
sed -n '72,100p' ami/base/views.py
echo ""
echo "=== Look at how require_project_for_list is used ==="
sed -n '67,95p' ami/base/views.pyRepository: RolnickLab/antenna
Length of output: 2710
🏁 Script executed:
#!/bin/bash
echo "=== Check how ProjectViewSet is registered in router ==="
fd -type f -name '*.py' ami/main/api/ -exec grep -l 'ProjectViewSet' {} \;
echo ""
echo "=== Look for router registration ==="
fd -type f -name 'urls.py' ami/ 2>/dev/null
echo ""
echo "=== Check if there's an api router file ==="
find ami/main/api -name '*.py' -type f | head -10Repository: RolnickLab/antenna
Length of output: 463
🏁 Script executed:
#!/bin/bash
echo "=== Check if ProjectViewSet sets require_project or require_project_for_list ==="
sed -n '143,170p' ami/main/api/views.py
echo ""
echo "=== Check if there are any Django URL patterns ==="
rg -n 'path|url' ami/main/api/views.py | head -10
echo ""
echo "=== Check the ami/main/urls.py ==="
cat ami/main/urls.py
echo ""
echo "=== Check the api/schemas file for routing ==="
rg -n 'ProjectViewSet' ami/main/api/schemas.pyRepository: RolnickLab/antenna
Length of output: 1554
🏁 Script executed:
#!/bin/bash
echo "=== Find urls.py files ==="
find ami -name 'urls.py' -type f
echo ""
echo "=== Check if ProjectViewSet overrides require_project_for_list ==="
sed -n '143,180p' ami/main/api/views.py
echo ""
echo "=== Look for RestRouter in schemas ==="
rg -n 'Router\|ProjectViewSet' ami/main/api/schemas.py -A 2
echo ""
echo "=== Check if ProjectViewSet calls list action ==="
sed -n '143,200p' ami/main/api/views.py | grep -E 'def list|require_project'Repository: RolnickLab/antenna
Length of output: 1544
🏁 Script executed:
#!/bin/bash
echo "=== Check if ProjectViewSet is at project-level endpoint or root endpoint ==="
rg -n 'register.*ProjectViewSet' ami/ -r '' 2>/dev/null || echo "Not found in rg"
echo ""
echo "=== Look at how DRF routes work - check if ProjectViewSet is nested under project ==="
cat ami/main/api/schemas.py | head -50
echo ""
echo "=== Check pagination more carefully - when does _get_project get called? ==="
sed -n '1,40p' ami/base/pagination.py
echo ""
echo "=== Look for list method in ProjectViewSet or DefaultViewSet ==="
sed -n '103,180p' ami/main/api/views.pyRepository: RolnickLab/antenna
Length of output: 4246
🏁 Script executed:
#!/bin/bash
echo "=== Check the get_active_project logic when required=True ==="
sed -n '12,60p' ami/base/views.py
echo ""
echo "=== Check if ProjectViewSet overrides require_project_for_list anywhere ==="
sed -n '143,250p' ami/main/api/views.py | grep -A 5 -B 5 'require_project'
echo ""
echo "=== Confirm how the kwargs are used for project_id extraction ==="
sed -n '12,50p' ami/base/views.py | head -40Repository: RolnickLab/antenna
Length of output: 3557
🏁 Script executed:
#!/bin/bash
echo "=== Check if ProjectViewSet needs to override require_project_for_list ==="
echo "Looking at the actual behavior - does ProjectViewSet list need project_id?"
echo ""
echo "Show ProjectViewSet context again with more lines:"
sed -n '143,180p' ami/main/api/views.py
echo ""
echo "=== Check if there are any tests for ProjectViewSet list ==="
find . -name '*test*' -type f -name '*.py' -exec grep -l 'ProjectViewSet' {} \; 2>/dev/null | head -3Repository: RolnickLab/antenna
Length of output: 1604
🏁 Script executed:
#!/bin/bash
echo "=== Look at similar root-level viewsets that set require_project_for_list = False ==="
rg -B 5 'require_project_for_list = False' ami/main/api/views.py
echo ""
echo "=== Understand the difference - where are these viewsets used? ==="
rg -B 15 'require_project_for_list = False' ami/main/api/views.py | grep 'class ' | head -5
echo ""
echo "=== Check if ProjectViewSet should also override this ==="
echo "ProjectViewSet doesn't override require_project_for_list, so it inherits the default."
echo "Let's verify pagination actually gets called during list:"
sed -n '10,15p' ami/base/pagination.pyRepository: RolnickLab/antenna
Length of output: 1495
ProjectViewSet is broken by the new require_project_for_list = True default and requires explicit override.
While ProjectViewSet's get_queryset() doesn't directly call get_active_project(), the pagination code does—LimitOffsetPaginationWithPermissions._get_project() explicitly calls it during get_paginated_response(). During a list action with the new default, required=True is passed to get_active_project(). Since ProjectViewSet is a root-level endpoint (not nested under a project_id), the project extraction fails and raises Http404.
ProjectViewSet should set require_project_for_list = False (like TaxonViewSet, TaxaListViewSet, and TagViewSet do) since it lists all projects globally, not per-project resources.
🧰 Tools
🪛 Ruff (0.15.0)
[warning] 1683-1683: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ami/main/api/views.py` around lines 1682 - 1684, ProjectViewSet is broken by
the new default require_project_for_list = True causing
LimitOffsetPaginationWithPermissions._get_project() to call
get_active_project(required=True) and raise Http404 for root-level project
listing; to fix, explicitly set require_project_for_list = False on the
ProjectViewSet class (same pattern as TaxonViewSet/TaxaListViewSet/TagViewSet)
so list actions do not force project extraction, and ensure pagination via
LimitOffsetPaginationWithPermissions._get_project() and get_paginated_response()
will skip required project resolution for this view.
Interaction with #1094 (Configurable taxa lists)PR #1094 currently sets When merging after #1094: There will be a minor conflict on Recommended merge order: #1094 first, then this PR. |
Summary
require_project_for_list = Trueas the default inProjectMixin, so all list endpoints return 400 if noproject_idis provided. This prevents unfiltered queries that cause full table scans (e.g.COUNT(*)on classifications took 3.6 min per request).require_project = TrueonSummaryViewand removes the unfiltered code path.TaxonViewSet,TaxaListViewSet,TagViewSet.Context
On 2026-02-16, Facebook's crawler (
meta-externalagent) hit/api/v2/classifications/13K+ times in a day without a project filter, causing recurring latency spikes (P99 > 2 min, throughput drops from 2700 to 140 req/15min). The crawler is now blocked at the nginx layer (ami-devops@5a29503), but unfiltered list queries from any client remain a risk.Changes
ami/base/views.pyProjectMixin.require_project_for_list = Truedefault;get_active_project()enforces onaction == "list"ami/main/api/views.pyTaxonViewSet,TaxaListViewSet,TagViewSet: opt out withrequire_project_for_list = Falseami/main/api/views.pySummaryView:require_project = True, remove unfiltered else branchBehavior
GET /api/v2/classifications/→ 400{"project_id": ["This field is required."]}GET /api/v2/classifications/?project_id=1→ 200 (filtered, fast)GET /api/v2/classifications/123/→ 200 (detail view, unaffected)GET /api/v2/taxa/→ 200 (opted out, global taxonomy)GET /api/v2/status/summary/→ 400 (now requires project_id)Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Changes