Implement null detections #1093

Open

vanessavmac wants to merge 5 commits into main from
484-make-it-clear-what-images-have-not-been-processed

Conversation

@vanessavmac (Collaborator) commented Jan 20, 2026

Summary

If a pipeline successfully processed an image but returned no detections, the image is saved with a "null" detection (a detection with an empty bbox), so that "processed, nothing found" can be distinguished from "not yet processed".
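For illustration, here is a minimal sketch of how such a record could be built at the schema level. This is not the PR's code: the helper name make_null_detection, the import path, and the algorithm/timestamp field names are assumptions; only the optional bbox and the timezone-aware timestamp are taken from the review discussion below.

from django.utils.timezone import now  # timezone-aware, as recommended in the review below

from ami.ml.schemas import AlgorithmReference, DetectionResponse  # assumed import path


def make_null_detection(source_image_id: str, detector: AlgorithmReference) -> DetectionResponse:
    """Record that an image was processed even though nothing was detected (sketch only)."""
    return DetectionResponse(
        source_image_id=source_image_id,
        bbox=None,           # bbox is optional as of this PR; None marks a null detection
        algorithm=detector,  # assumed field name for the detector reference
        timestamp=now(),
    )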

List of Changes

  • Add null detections test
  • Update save_results
  • Display timeline of processed images
  • Display updated collection stats

Related Issues

Closes #484

How to Test the Changes

docker compose run --rm django python manage.py test -k test_image_with_null_detection --failfast --pdb

Screenshots

Example of non-null detection

[screenshot]

Example of null detection

[screenshot]

Example of no processing/detections

[screenshot]

Deployment Notes

Include instructions if this PR requires specific steps for its deployment (database migrations, config changes, etc.)

Checklist

  • I have tested these changes appropriately.
  • I have added and/or modified relevant tests.
  • I updated relevant documentation or comments.
  • I have verified that this PR follows the project's coding standards.
  • Any dependent changes have already been merged to main.

Summary by CodeRabbit

Release Notes

  • New Features

    • Image processing status now tracked throughout the system
    • New "Was processed" data series added to activity plot, visualizing which images have been analyzed
    • New "Total Processed Captures" column in capture sets view, displaying processed image counts and percentages
    • Processing status information now available in timeline data
  • Bug Fixes

    • Detections with invalid or empty bounding boxes are now properly excluded from processing status calculations and detection counts

netlify bot commented Jan 20, 2026

Deploy Preview for antenna-preview failed. Why did it fail? →

🔨 Latest commit: fd6d22c
🔍 Latest deploy log: https://app.netlify.com/projects/antenna-preview/deploys/69967a7ab83a2800086aae55

netlify bot commented Jan 20, 2026

Deploy Preview for antenna-ssec failed. Why did it fail? →

🔨 Latest commit: fd6d22c
🔍 Latest deploy log: https://app.netlify.com/projects/antenna-ssec/deploys/69967a7a3eda1c0008808c5a

coderabbitai bot (Contributor) commented Jan 20, 2026

📝 Walkthrough

Added processing status tracking for source images across database, API, and UI layers. Introduces queryset methods to annotate processed state, creates null detections for unprocessed images, and exposes processed image counts via API serialization and timeline visualizations.
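As a rough illustration of that annotation pattern, a Django sketch follows. It is illustrative only: the model and relation names (Detection, source_image, images__detections) are inferred from this summary rather than copied from the code, and the PR's exact filtering of null/empty bounding boxes may differ.

from django.db import models
from django.db.models import Count, Exists, OuterRef, Q

# "Detection" is assumed to be the detection model defined in the same module.


class SourceImageQuerySet(models.QuerySet):
    def with_was_processed(self):
        # An image counts as processed when any detection row exists for it,
        # including the null detections this PR creates, which encode
        # "processed, nothing found".
        return self.annotate(
            was_processed=Exists(Detection.objects.filter(source_image=OuterRef("pk")))
        )


class SourceImageCollectionQuerySet(models.QuerySet):
    def with_source_images_processed_count(self):
        # Count distinct member images that have at least one detection row.
        return self.annotate(
            source_images_processed_count=Count(
                "images",
                filter=Q(images__detections__isnull=False),
                distinct=True,
            )
        )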

Changes

  • Backend Models & Querysets (ami/main/models.py): Added queryset methods with_was_processed() and with_source_images_processed_count() for annotation; extended SourceImage.get_detections_count() and SourceImage.get_was_processed() to exclude null/empty bounding boxes; updated collection-level detection counting to filter out invalid detections.
  • Admin Interface (ami/main/admin.py): Added get_was_processed to SourceImageAdmin.list_display for visibility in the admin list view.
  • API Serializers & Views (ami/main/api/serializers.py, ami/main/api/views.py): Exposed source_images_processed_count in SourceImageCollectionSerializer and ordering fields; added was_processed to EventTimelineIntervalSerializer; integrated with_was_processed() and with_source_images_processed_count() in viewsets; filtered the DetectionViewSet queryset to exclude invalid bounding boxes.
  • ML Pipeline & Schemas (ami/ml/schemas.py, ami/ml/models/pipeline.py): Made bbox optional in DetectionRequest and DetectionResponse (changed to BoundingBox | None = None); introduced the create_null_detections_for_undetected_images() helper to generate null detections for images without processing results; integrated null-detection creation into save_results(). A schema sketch follows this list.
  • ML Pipeline Tests (ami/ml/tests.py): Added test_image_with_null_detection() to validate null-detection handling and per-algorithm processing-status filtering.
  • Frontend Data Models (ui/src/data-services/models/timeline-tick.ts, ui/src/data-services/models/capture-set.ts): Added a was_processed: boolean field to the ServerTimelineTick type with a wasProcessed getter; added numImagesProcessed and numImagesProcessedLabel getters to CaptureSet for processed-image metrics.
  • Frontend UI Components (ui/src/pages/session-details/playback/activity-plot/activity-plot.tsx, ui/src/pages/project/capture-sets/capture-set-columns.tsx, ui/src/pages/project/capture-sets/capture-sets.tsx): Added a "Was processed" series to the activity plot with a y2 axis and hover text; added a "Total Processed Captures" column to the capture-set table; enabled the column by default in the capture-sets view.
  • Frontend Localization (ui/src/utils/language.ts): Added a FIELD_LABEL_TOTAL_PROCESSED_CAPTURES enum entry and a translation mapping to "Total Processed Captures".
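To picture the schema change in ami/ml/schemas.py, here is a short Pydantic sketch. Only the optional bbox with a None default is taken from the summary above; the BoundingBox fields and the overall shape of DetectionResponse are assumptions for illustration.

from pydantic import BaseModel


class BoundingBox(BaseModel):
    # Field names are assumed for illustration.
    x1: float
    y1: float
    x2: float
    y2: float


class DetectionResponse(BaseModel):
    source_image_id: str
    bbox: BoundingBox | None = None  # None now means "processed, nothing detected"

Because the field defaults to None, pipelines that always send a bbox keep validating unchanged, while a response without one round-trips cleanly as a null detection.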

Sequence Diagram

sequenceDiagram
    participant Pipeline as ML Pipeline
    participant SaveResults as save_results()
    participant DB as Database<br/>(Models)
    participant API as API Layer<br/>(Views/Serializers)
    participant UI as Frontend UI

    Pipeline->>SaveResults: Returns detection results
    SaveResults->>SaveResults: Create null detections<br/>for unprocessed images
    SaveResults->>DB: Save all detections<br/>(including null bbox)
    DB->>DB: Annotate was_processed<br/>via Detection.Exists()
    DB->>DB: Count processed images<br/>distinct by image
    API->>DB: Query with annotations
    DB-->>API: Return annotated data<br/>(was_processed, counts)
    API->>API: Serialize processed_count,<br/>was_processed fields
    API-->>UI: Expose in JSON responses
    UI->>UI: Render activity plot<br/>with processed series
    UI->>UI: Display processed counts<br/>in capture-set table

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Suggested Reviewers

  • mihow

Poem

🐰 Hops through pipelines with glee,
Tracking which captures processed be!
Null detections, counts so bright,
From database down to UI's sight—
Progress made, no image forgotten!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 60.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Title check ✅ Passed: Title "Implement null detections" clearly summarizes the main change: adding handling for null detections when pipelines process images without returning detections.
  • Description check ✅ Passed: The PR description includes all required template sections: Summary, List of Changes, Related Issues (#484), How to Test, Screenshots, Deployment Notes, and Checklist.
  • Linked Issues check ✅ Passed: The code changes comprehensively implement issue #484's objectives: track processed images via was_processed, display them in the timeline with a new visualization, show collection stats including a processed count, and add test coverage.
  • Out of Scope Changes check ✅ Passed: All code changes directly support the null-detections implementation and issue #484's objectives; no unrelated modifications were detected across backend models, API, or UI components.


@vanessavmac vanessavmac changed the title Add null detections test; update save_results Implement null detections Feb 11, 2026
@vanessavmac vanessavmac marked this pull request as ready for review February 11, 2026 16:27
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ami/main/api/views.py (1)

378-439: ⚠️ Potential issue | 🟠 Major

Fix was_processed for empty intervals (and avoid stale values).

Line 439 uses image after the loop; for intervals with no captures (or events with zero captures) this either reuses the prior interval’s image or can crash. Accumulate during the loop and keep False when no captures.

🐛 Suggested fix
@@
             while image_index < len(source_images) and source_images[image_index]["timestamp"] <= interval_end:
                 image = source_images[image_index]
@@
                 if image["detections_count"] >= max(interval_data["detection_counts"]):
                     interval_data["top_capture"] = SourceImage(pk=image["id"])
+                interval_data["was_processed"] = interval_data["was_processed"] or image["was_processed"]
                 image_index += 1
@@
-            interval_data["was_processed"] = image["was_processed"]
+            # was_processed already accumulated; remain False when no captures
🤖 Fix all issues with AI agents
In `@ami/main/models.py`:
- Around line 1839-1841: Remove the redundant Q(bbox=None) in
get_detections_count() (keep Q(bbox__isnull=True) and the empty-list check
Q(bbox=[])) and then make update_detection_counts() use the same exclusion
filter as get_detections_count() so the cached detections_count is computed
consistently; specifically, update the counting logic in
update_detection_counts() to apply .exclude(Q(bbox__isnull=True) | Q(bbox=[]))
on self.detections before calling .count() so both methods return the same
value.

In `@ami/ml/models/pipeline.py`:
- Around line 882-888: The loop that assigns detector_algorithm_reference over
algorithms_known currently overwrites when multiple detection algorithms exist;
update the logic in the section that iterates algorithms_known to make selection
deterministic or to emit a null detection per detector: either pick one
deterministic algorithm (e.g., lowest key or sorted by name) and log/warn when
multiple AlgorithmTaskType.DETECTION entries are present, or instead create an
AlgorithmReference for each known_algorithm with task_type DETECTION (rather
than a single detector_algorithm_reference) so downstream checks use
per-detector null detections; modify the code around
detector_algorithm_reference, algorithms_known and AlgorithmReference
accordingly and add a warning using the existing logger when multiple detectors
are found.
- Around line 896-902: The appended null detection currently uses a naive
timestamp via datetime.datetime.now(); update the DetectionResponse creation
inside the null_detections_to_add.append call to use the timezone-aware now()
function (already imported and used elsewhere) instead of
datetime.datetime.now(), ensuring the timestamp field for DetectionResponse is
timezone-aware and consistent with other parts of the pipeline.

In `@ui/src/data-services/models/collection.ts`:
- Around line 93-101: Rename the getter numImagesProccessed to
numImagesProcessedLabel in the Collection model so the name is spelled correctly
and follows the "*Label" convention; update the getter implementation (currently
computing numProcessed, pct and returning the formatted string) to the new name,
then find and replace all usages (e.g., any occurrences of
item.numImagesProccessed such as in collection-columns.tsx) to
item.numImagesProcessedLabel, and update any type declarations/exports or tests
that reference the old symbol to the new one to ensure no unresolved references
remain.

In `@ui/src/pages/session-details/playback/activity-plot/activity-plot.tsx`:
- Around line 71-96: The "Was processed" trace currently maps wasProcessed→0/1
onto yaxis 'y2' (shared with detectionsMaxCount) so it becomes invisible when
detectionsMaxCount is large and the mapping is inverted; fix by either (A)
giving this trace its own y-axis (e.g., set yaxis: 'y3' and add a yaxis config
with fixed range [0,1]) and keep y as wasProcessed ? 0 : 1 but rename the trace
to "Not processed" (or add a clarifying comment about the inversion), or (B)
scale the y values to the detections axis by mapping y: timeline.map(t =>
t.numCaptures>0 ? (t.wasProcessed ? 0 : detectionsMaxCount) : 0) and keep the
name "Not processed" (or invert the boolean mapping and rename to "Was
processed" accordingly); update hover/customdata text to match the chosen
semantics.
🧹 Nitpick comments (3)
ui/src/data-services/models/timeline-tick.ts (1)

35-37: Consider adding a nullish fallback for wasProcessed, consistent with other getters.

Other getters in this class (e.g., numDetections, numCaptures) use ?? 0 / ?? false to guard against missing server data. If was_processed is absent from the API response (e.g., during rollout or with an older API version), this will return undefined instead of false.

Suggested fix
   get wasProcessed(): boolean {
-    return this._timelineTick.was_processed
+    return this._timelineTick.was_processed ?? false
   }
ami/main/admin.py (1)

256-269: Reduce admin list N+1 by annotating was_processed.

get_was_processed in list_display can trigger a query per row. Consider annotating in get_queryset and reading the annotation in an admin method.

♻️ Suggested adjustment
@@
     def get_queryset(self, request: HttpRequest) -> QuerySet[Any]:
-        return super().get_queryset(request).select_related("event", "deployment", "deployment__data_source")
+        return (
+            super()
+            .get_queryset(request)
+            .select_related("event", "deployment", "deployment__data_source")
+            .with_was_processed()
+        )
+
+    @admin.display(boolean=True, ordering="was_processed", description="Was processed")
+    def get_was_processed(self, obj: SourceImage) -> bool:
+        return getattr(obj, "was_processed", obj.get_was_processed())
ami/main/models.py (1)

3725-3734: Same bbox=None / bbox__isnull=True redundancy as in get_detections_count.

~models.Q(images__detections__bbox__isnull=True) and ~models.Q(images__detections__bbox=None) produce identical SQL. One can be removed for clarity.

Simplify the filter
             source_images_with_detections_count=models.Count(
                 "images",
                 filter=(
                     models.Q(images__detections__isnull=False)
                     & ~models.Q(images__detections__bbox__isnull=True)
-                    & ~models.Q(images__detections__bbox=None)
                     & ~models.Q(images__detections__bbox=[])
                 ),
                 distinct=True,
             )

Comment on lines 882 to 888

coderabbitai bot (Contributor) commented:

⚠️ Potential issue | 🟡 Minor

Make detector selection deterministic (or create one null detection per detector).

If a pipeline has multiple detection algorithms, the loop currently overwrites the reference, which can make algorithm-specific processed checks unreliable. Consider selecting deterministically and warning, or emitting one null detection per detector.

🛠️ Example deterministic selection
-            detector_algorithm_reference = None
-            for known_algorithm in algorithms_known.values():
-                if known_algorithm.task_type == AlgorithmTaskType.DETECTION:
-                    detector_algorithm_reference = AlgorithmReference(
-                        name=known_algorithm.name, key=known_algorithm.key
-                    )
+            detector_algorithms = [
+                algo for algo in algorithms_known.values() if algo.task_type == AlgorithmTaskType.DETECTION
+            ]
+            if len(detector_algorithms) > 1:
+                job_logger.warning(
+                    "Multiple detection algorithms found; using the first for null detections."
+                )
+            detector_algorithm_reference = (
+                AlgorithmReference(name=detector_algorithms[0].name, key=detector_algorithms[0].key)
+                if detector_algorithms
+                else None
+            )

Collaborator commented:

This loop over known_algorithms doesn't need to happen per-image and could be pulled out, i.e. you could find the known_detection_algorithm before looping over the images. Do pipelines allow more than one detection algorithm? If so, then I agree with the coderabbit comment below. If not, you could assert here that only one detection algo is found.

Collaborator (Author) commented:

> Do pipelines allow more than one detection algorithm?

I think at the moment we only have one detection algorithm per pipeline. But maybe @mihow could confirm and comment here?
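A minimal sketch of the hoisting suggested above, reusing names from the diff under review and assuming, per the author's reply, exactly one detection algorithm per pipeline:

# Resolve the detector once, before iterating the images (sketch only).
detector_algorithms = [
    algo
    for algo in algorithms_known.values()
    if algo.task_type == AlgorithmTaskType.DETECTION
]
assert len(detector_algorithms) == 1, "expected exactly one detection algorithm per pipeline"
detector_algorithm_reference = AlgorithmReference(
    name=detector_algorithms[0].name,
    key=detector_algorithms[0].key,
)

for source_img in results.source_images:
    if str(source_img.id) not in source_images_with_detections:
        # Build the null detection with the precomputed reference; no per-image
        # scan of algorithms_known is needed.
        ...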

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ami/ml/models/pipeline.py (1)

939-944: ⚠️ Potential issue | 🟠 Major

Null detections incorrectly create empty Occurrence records.

create_and_update_occurrences_for_detections receives the full detections list, which includes null Detection objects (where bbox=None). For each null detection without a pre-existing occurrence, it bulk-creates an Occurrence with event, deployment, and project set but no taxon or determination. These phantom occurrences are semantically wrong — a null detection means "no animal found" — and persist in the Occurrence table. Some API endpoints filter out these empty occurrences (checking determination__isnull=False), but the core queryset .valid() only excludes occurrences with zero detections, allowing these to inflate counts in certain views. Filter out null detections before creating occurrences:

🛠️ Proposed fix in save_results
+    real_detections = [d for d in detections if d.bbox is not None]
     create_and_update_occurrences_for_detections(
-        detections=detections,
+        detections=real_detections,
         logger=job_logger,
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ami/ml/models/pipeline.py` around lines 939 - 944, The code passes the full
detections list (including null detections with bbox=None) into
create_and_update_occurrences_for_detections which creates empty Occurrence
records; filter the detections before calling that function (e.g., in
save_results) by keeping only detections where bbox is not None (or detection is
truthy) and then call create_and_update_occurrences_for_detections with that
filtered list so null detections are ignored and no phantom Occurrence rows are
created.
🧹 Nitpick comments (1)
ui/src/data-services/models/capture-set.ts (1)

82-101: numImagesWithDetectionsLabel can render "undefined (0%)" — worth aligning with the new pattern.

The new numImagesProcessedLabel correctly defaults numProcessed to 0 via ?? 0, so it never emits the string "undefined". The adjacent, pre-existing numImagesWithDetectionsLabel uses this.numImagesWithDetections?.toLocaleString() without a fallback; when numImagesWithDetections is undefined, the template literal produces the literal string "undefined (0%)" in the rendered cell.

Since the new getter sets the correct precedent, consider aligning numImagesWithDetectionsLabel for consistency:

♻️ Proposed alignment for numImagesWithDetectionsLabel
 get numImagesWithDetectionsLabel(): string {
+  const numWithDetections = this.numImagesWithDetections ?? 0
   const pct =
-    this.numImagesWithDetections && this.numImages
-      ? (this.numImagesWithDetections / this.numImages) * 100
+    this.numImages && this.numImages > 0
+      ? (numWithDetections / this.numImages) * 100
       : 0

-  return `${this.numImagesWithDetections?.toLocaleString()} (${pct.toFixed(
-    0
-  )}%)`
+  return `${numWithDetections.toLocaleString()} (${pct.toFixed(0)}%)`
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/data-services/models/capture-set.ts` around lines 82 - 101, The getter
numImagesWithDetectionsLabel can produce "undefined (0%)" because it calls
this.numImagesWithDetections?.toLocaleString() without a fallback; update it to
mirror numImagesProcessedLabel by defaulting numImagesWithDetections to 0 (use
the nullish coalescing operator, e.g. const numWithDetections =
this.numImagesWithDetections ?? 0), compute pct from numWithDetections and
this.numImages, and use numWithDetections.toLocaleString() in the returned
template so the label never renders "undefined".

Comment on lines +3741 to +3744

    models.Q(images__detections__isnull=False)
    & ~models.Q(images__detections__bbox__isnull=True)
    & ~models.Q(images__detections__bbox=None)
    & ~models.Q(images__detections__bbox=[])

coderabbitai bot (Contributor) commented:

⚠️ Potential issue | 🟡 Minor

~Q(images__detections__bbox=None) is redundant — same issue patched elsewhere in this PR.

In Django ORM, Q(bbox__isnull=True) and Q(bbox=None) are identical for JSONField: both emit bbox IS NULL. Lines 3742 and 3743 therefore negate the exact same SQL predicate, so the second negation is a no-op. The same redundancy was already corrected in get_detections_count() as part of this PR, but was carried over to this new method.

🛠️ Proposed fix
     def with_source_images_with_detections_count(self):
         return self.annotate(
             source_images_with_detections_count=models.Count(
                 "images",
                 filter=(
-                    models.Q(images__detections__isnull=False)
-                    & ~models.Q(images__detections__bbox__isnull=True)
-                    & ~models.Q(images__detections__bbox=None)
-                    & ~models.Q(images__detections__bbox=[])
+                    ~models.Q(images__detections__bbox__isnull=True)
+                    & ~models.Q(images__detections__bbox=[])
                 ),
                 distinct=True,
             )
         )

Q(images__detections__isnull=False) is also redundant once ~Q(bbox__isnull=True) is in place (a non-null bbox already implies a detection row exists).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ami/main/models.py` around lines 3741 - 3744, Remove the redundant
null-checks in the queryset: drop the negated
Q(models.Q(images__detections__bbox=None)) since Q(...__bbox__isnull=True) and
Q(...__bbox=None) are equivalent for JSONField, and also remove the
now-unnecessary models.Q(images__detections__isnull=False) because excluding
bbox IS NULL already guarantees a detection row; update the queryset fragment
that currently contains models.Q(images__detections__isnull=False) &
~models.Q(images__detections__bbox__isnull=True) &
~models.Q(images__detections__bbox=None) &
~models.Q(images__detections__bbox=[]) to only keep the meaningful predicates
(e.g., ~models.Q(images__detections__bbox__isnull=True) and
~models.Q(images__detections__bbox=[])).

Comment on lines +830 to +834

    source_images_with_detections = {int(detection.source_image_id) for detection in results.detections}
    null_detections_to_add = []

    for source_img in results.source_images:
        if int(source_img.id) not in source_images_with_detections:

coderabbitai bot (Contributor) commented:

⚠️ Potential issue | 🟡 Minor

int() cast can raise ValueError on non-integer source image IDs.

Both int(detection.source_image_id) (line 830) and int(source_img.id) (line 834) will throw if the ID is ever non-numeric (e.g., UUID-based pipelines). The comparison should be done in string space to stay consistent with the schema types, or the cast should be guarded.

🛠️ Proposed fix — compare as strings
-    source_images_with_detections = {int(detection.source_image_id) for detection in results.detections}
+    source_images_with_detections = {str(detection.source_image_id) for detection in results.detections}

     for source_img in results.source_images:
-        if int(source_img.id) not in source_images_with_detections:
+        if str(source_img.id) not in source_images_with_detections:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ami/ml/models/pipeline.py` around lines 830 - 834, The int() casts on
detection.source_image_id and source_img.id can raise ValueError for non-numeric
IDs; change the membership check to use string comparison instead of int
conversion by building source_images_with_detections from
str(detection.source_image_id) (iterating results.detections) and comparing
against str(source_img.id) when iterating results.source_images (or otherwise
guard the cast with a try/except and fall back to string comparison) so
comparisons are robust for UUIDs and other non-integer IDs.

Comment on lines +917 to +924

    # Ensure all images have detections
    # if not, add a NULL detection (empty bbox) to the results
    null_detections = create_null_detections_for_undetected_images(
        results=results,
        algorithms_known=algorithms_known,
        logger=job_logger,
    )
    results.detections = results.detections + null_detections

coderabbitai bot (Contributor) commented:

⚠️ Potential issue | 🟠 Major

Null detections will cause images to be re-queued on every subsequent pipeline run.

filter_processed_images (line 87) yields an image if any existing detection lacks classifications:

elif existing_detections.filter(classifications__isnull=True).exists():
    yield image

A null detection (bbox=None) has no classifications, so this condition is always True for it. Every image that was processed and returned no detections will be re-sent on the next job run, defeating the deduplication logic. filter_processed_images needs a new branch before the classifications__isnull check to recognise null-only detections as "processed, no animal found — skip".

🛠️ Proposed fix in filter_processed_images (lines 81–115)
     for image in images:
         existing_detections = image.detections.filter(detection_algorithm__in=pipeline_algorithms)
+        real_detections = existing_detections.filter(bbox__isnull=False)
+        null_only = existing_detections.exists() and not real_detections.exists()
+
         if not existing_detections.exists():
             task_logger.debug(f"Image {image} needs processing: has no existing detections from pipeline's detector")
             yield image
-        elif existing_detections.filter(classifications__isnull=True).exists():
+        elif null_only:
+            task_logger.debug(f"Image {image} was processed with no detections (null detection exists), skipping")
+            continue
+        elif real_detections.filter(classifications__isnull=True).exists():
             task_logger.debug(...)
             yield image
         else:
             pipeline_algorithm_ids = set(classification_algorithms.values_list("id", flat=True))
-            detection_algorithm_ids = set(existing_detections.values_list("classifications__algorithm_id", flat=True))
+            detection_algorithm_ids = set(real_detections.values_list("classifications__algorithm_id", flat=True))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ami/ml/models/pipeline.py` around lines 917 - 924, filter_processed_images is
incorrectly treating the null placeholder detections (created by
create_null_detections_for_undetected_images, which use detections with
bbox=None) as "unprocessed" because they have no classifications; add a branch
in filter_processed_images before the existing classifications__isnull check
that detects when existing_detections consists only of null detections (e.g.,
all detections have bbox==None or an equivalent null marker) and treat that case
as "processed - no animal found" so the image is skipped on subsequent runs;
update any comments/tests to reflect that null-only detection sets should not
cause re-queuing.

Comment on lines +113 to +114

    renderCell: (item: Collection) => (
      <BasicTableCell value={item.numImagesProcessedLabel} />

coderabbitai bot (Contributor) commented:

⚠️ Potential issue | 🔴 Critical

Collection is not defined or imported — TypeScript compilation will fail.

Line 113 uses item: Collection as the renderCell parameter type, but Collection is never imported in this file. Every other renderCell in the same array uses item: CaptureSet. This is a copy-paste error that will produce a TypeScript compile error.

🐛 Proposed fix
-    renderCell: (item: Collection) => (
+    renderCell: (item: CaptureSet) => (
       <BasicTableCell value={item.numImagesProcessedLabel} />
     ),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/pages/project/capture-sets/capture-set-columns.tsx` around lines 113 -
114, The renderCell parameter is typed as Collection but that type isn't
imported—replace the incorrect type with CaptureSet (i.e., change the renderCell
signature from item: Collection to item: CaptureSet) to match the other
renderCell entries and ensure CaptureSet is imported at the top of the file so
the BasicTableCell usage of item.numImagesProcessedLabel compiles.

Development

Successfully merging this pull request may close these issues:

  • Make it clear what images have not been processed

2 participants