Semantic ReID extension#997

Closed
saratpoluri wants to merge 38 commits into release-2026.0 from feature/semantic-reid

Conversation

@saratpoluri
Contributor

@saratpoluri saratpoluri commented Feb 10, 2026

📝 Description

This pull request implements a Semantic ReID Extension that adds 2-tier hybrid search capabilities to the Scene Controller's Re-ID system. The architecture combines TIER 1 metadata filtering (semantic attributes like age, gender) with TIER 2 vector similarity search for more accurate object re-identification.
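The two tiers described above can be sketched in plain Python. This is only an illustration of the filter-then-rank idea, not the PR's VDMS adapter: the `hybrid_search` function, the entry layout, and the use of cosine similarity are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(entries, query_vector, filters, top_k=3):
    """Hypothetical helper: entries are dicts with 'uuid', 'vector', 'metadata'."""
    # TIER 1: keep only entries whose metadata matches every requested attribute.
    candidates = [
        e for e in entries
        if all(e["metadata"].get(key) == value for key, value in filters.items())
    ]
    # TIER 2: rank the surviving candidates by vector similarity to the query.
    ranked = sorted(
        candidates,
        key=lambda e: cosine_similarity(e["vector"], query_vector),
        reverse=True,
    )
    return [e["uuid"] for e in ranked[:top_k]]


gallery = [
    {"uuid": "a", "vector": [1.0, 0.0], "metadata": {"gender": "female"}},
    {"uuid": "b", "vector": [0.9, 0.1], "metadata": {"gender": "male"}},
    {"uuid": "c", "vector": [0.0, 1.0], "metadata": {"gender": "female"}},
]
matches = hybrid_search(gallery, [1.0, 0.0], {"gender": "female"})
```

Entry "b" is close in vector space but is dropped in TIER 1 because its metadata does not match, which is the behavior the hybrid search is meant to provide.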

✨ Type of Change

Select the type of change your PR introduces:

  • 🐞 Bug fix – Non-breaking change which fixes an issue
  • 🚀 New feature – Non-breaking change which adds functionality
  • 🔨 Refactor – Non-breaking change which refactors the code base
  • 💥 Breaking change – Changes that break existing functionality
  • 📚 Documentation update
  • 🔒 Security update
  • 🧪 Tests
  • 🚂 CI

🧪 Testing Scenarios

Describe how the changes were tested and how reviewers can test them too:

  • ✅ Tested manually
  • 🤖 Ran automated end-to-end tests

✅ Checklist

Before submitting the PR, ensure the following:

  • 🔍 PR title is clear and descriptive
  • 📝 For internal contributors: If applicable, include the JIRA ticket number (e.g., ITEP-123456) in the PR title. Do not include full URLs
  • 💬 I have commented my code, especially in hard-to-understand areas
  • 📄 I have made corresponding changes to the documentation
  • ✅ I have added tests that prove my fix is effective or my feature works

@saratpoluri saratpoluri marked this pull request as draft February 10, 2026 21:53
@saratpoluri saratpoluri force-pushed the feature/semantic-reid branch from e319b42 to 9f4c5ae Compare February 11, 2026 03:03
Todo:

  • reid-unique-count results with the pure ReID model after detection must match the results from before the change
  • Add a functional test to verify that the semantic metadata logic is working
Comment on lines 57 to 60
if name and name != 'detection' and ('reid' not in name and 'embedding' not in name):
  categories[name] = {'label': tensor.get('label',''),
                      'confidence': tensor.get('confidence', 100.0),
                      'model_name': tensor.get('model_name', '')}
Contributor

There is an edge case in this code: we assume that the name we get is unique, but in experiments with the new models the names are not unique (they are all named classification_layer_name:logits).
When names collide, later tensors overwrite earlier ones in the dict, causing data loss.

We either have to:

  • ensure uniqueness of the name key in our code (e.g. by adding a suffix in case of name clash)
  • ensure uniqueness of the name key from the DLSPS metadata output
  • use an array, not a dict
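The first option (adding a suffix on a name clash) could look roughly like the sketch below. The `unique_key` helper and the `#N` suffix format are hypothetical and not part of the PR; the sample tensors only mimic the colliding names described above.

```python
def unique_key(existing, name):
    """Return name, or name with a numeric suffix if it is already taken.

    Hypothetical helper: the '#N' suffix format is an assumption.
    """
    if name not in existing:
        return name
    suffix = 1
    while f"{name}#{suffix}" in existing:
        suffix += 1
    return f"{name}#{suffix}"


# Two tensors from different models that report the same layer name.
tensors = [
    {"name": "classification_layer_name:logits", "label": "adult", "model_name": "age"},
    {"name": "classification_layer_name:logits", "label": "female", "model_name": "gender"},
]

categories = {}
for tensor in tensors:
    key = unique_key(categories, tensor["name"])
    categories[key] = {
        "label": tensor.get("label", ""),
        "confidence": tensor.get("confidence", 100.0),
        "model_name": tensor.get("model_name", ""),
    }
```

With the suffix in place, both classification results survive in the dict, and `model_name` still records which model produced each entry.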

Contributor Author

Using an array instead of a dict will fail eventually. What we want is backward traceability of what created the metadata. Whether uniqueness is ensured internally by DLSPS or in our sscape_policies only matters for long-term maintainability. I believe this is a universal problem rather than one specific to SceneScape, so my preference is to handle it inside DLSPS. @jakubsikorski, could you file a JIRA ticket in the DLS project and add me and Dorau as watchers?

Thank you for being proactive and highlighting this.

@saratpoluri saratpoluri force-pushed the feature/semantic-reid branch 2 times, most recently from 4c06e34 to 06601e4 Compare February 16, 2026 22:07
@saratpoluri saratpoluri marked this pull request as ready for review February 17, 2026 00:07
networks:
scenescape-test:

secrets:
Contributor Author

With secrets added, the VDMS server gets stuck at initialization. The reid-unique-count test was passing prior to this change because the failure threshold was too generous. I will work with the VDMS team to resolve this issue.

Contributor

Copilot AI left a comment

Pull request overview

This pull request implements a Semantic ReID Extension that adds 2-tier hybrid search capabilities to the Scene Controller's Re-ID system. The architecture combines TIER 1 metadata filtering (semantic attributes like age, gender) with TIER 2 vector similarity search for more accurate object re-identification.

Changes:

  • Adds semantic metadata support with confidence-based constraint routing (>=0.8 for AND, <0.8 for OR)
  • Implements VDMS adapter for storing/querying reid vectors with metadata
  • Updates data structures to use dict format for reid (embedding_vector + model_name)
  • Includes comprehensive unit tests (841 lines for VDMS adapter, 415 for UUID manager) and functional integration tests
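The confidence-based routing in the first bullet can be illustrated with a small sketch. The `route_constraints` function and the metadata layout are assumptions for this example, and confidence is assumed to be on a 0 to 1 scale; the real adapter may structure its constraints differently.

```python
AND_THRESHOLD = 0.8  # from the summary above: >=0.8 goes to AND, <0.8 goes to OR


def route_constraints(semantic_metadata, threshold=AND_THRESHOLD):
    """Split attributes into AND (must match) and OR (may match) constraints.

    Hypothetical helper: semantic_metadata maps an attribute name to a dict
    with 'label' and 'confidence' (assumed to be in the range 0..1).
    """
    and_constraints, or_constraints = {}, {}
    for name, attribute in semantic_metadata.items():
        if attribute["confidence"] >= threshold:
            and_constraints[name] = attribute["label"]
        else:
            or_constraints[name] = attribute["label"]
    return and_constraints, or_constraints


must_match, may_match = route_constraints({
    "gender": {"label": "female", "confidence": 0.93},
    "age": {"label": "adult", "confidence": 0.61},
})
```

Here the high-confidence gender attribute becomes a hard filter, while the low-confidence age attribute only contributes as an optional match.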

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 8 comments.

File | Description
tests/sscape_tests/vdms_adapter/test_vdms_adapter.py | Comprehensive unit tests for VDMS adapter with interface, initialization, entry addition, matching, and constraint building tests
tests/sscape_tests/uuid_manager/test_uuid_manager.py | Unit tests for UUID manager covering reid extraction, semantic metadata extraction, and tracker ID management
tests/functional/tc_reid_data_flow.py | End-to-end integration test validating 4 scenarios: no metadata, reid-only, semantic-only, and combined reid+semantic
tests/functional/tc_reid_semantic_unique_count.py | Functional test verifying unique detection counts with semantic classification enabled
controller/src/controller/vdms_adapter.py | Core VDMS database adapter implementing 2-tier hybrid search with metadata filtering and vector similarity
controller/src/controller/uuid_manager.py | UUID manager updates for semantic metadata extraction and feature gathering with timeout-based flushing
controller/src/controller/reid.py | ReID interface updates adding metadata parameter to addEntry and renaming findSimilarityScores to findMatches
controller/src/controller/moving_object.py | MovingObject changes to support reid dict format and metadata extraction from detection messages
controller/src/controller/scene.py | Scene controller updates to deserialize reid from metadata structure
controller/src/controller/detections_builder.py | Detection builder updates to output reid in metadata structure
controller/src/schema/metadata.schema.json | JSON schema updates adding semantic_metadata and semantic_metadata_attribute definitions
dlstreamer-pipeline-server/user_scripts/gvapython/sscape/sscape_policies.py | DLStreamer policy updates to extract reid and classification metadata into new format
controller/docs/user-guide/Extended-ReID.md | New documentation explaining 2-tier hybrid search implementation and confidence-based routing
tests/Makefile.sscape | Makefile updates adding uuid-manager-unit and vdms-adapter-unit test targets
tests/Makefile.functional | Functional test targets for reid-data-flow and reid-semantic-unique-count
tests/compose/vdms.yml | VDMS compose configuration with TLS removed (non-TLS mode)
tests/compose/scene_reid.yml | Scene compose configuration with VDMS client cert secrets removed
dlstreamer-pipeline-server/queuing-config-reid-semantic.json | DLStreamer pipeline configuration for reid with semantic classification

saratpoluri and others added 2 commits February 16, 2026 21:41
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@dmytroye dmytroye enabled auto-merge (squash) February 20, 2026 11:25
Contributor

@tdorauintc tdorauintc left a comment

Not fully reviewed yet

return 0


def test_reid_data_flow_end_to_end(params, record_xml_attribute):
Contributor

[may be done in another PR] I suggest refactoring the function and splitting it into separate functions, or into a test class with state and separate methods for test setup, clean-up, and each scenario, for better maintainability.

Contributor

Shouldn't we make the change (apply ReID on the ROI from the detection stage) in the original dlstreamer-pipeline-server/{queuing|retail}-config-reid.json files and keep a single file for both legacy reid and semantic-reid?

Contributor

The earlier comment is no longer valid; I realized that this file contains an additional age-gender classification model. I would advise adding a similar config file for the retail scene and documenting it in the user guide.

obj.rotation = obj_data.get('rotation')
obj.reidVector = obj_data.get('reid')
# Extract reid from metadata if present
metadata = obj_data.get('metadata', {})
Contributor

I personally prefer a separate reid_config.json over introducing tagging in the VA pipeline. Reason: the second solution (reid config) says "I tell the component doing ReID how it should interpret its input", while the first says "I tell the component doing VA how it should tell the ReID component how it should interpret my metadata". The first also makes sense, but it has the downside of introducing another coupling between the VA and ReID implementations.

@param record_xml_attribute Pytest fixture recording the test name.
@return exit_code Indicates test success or failure.
"""
TEST_NAME = "NEX-T10539"
Contributor

tc_reid_unique_count.py (NEX-T10539): the implementation does not check the objective of the Zephyr test.
This test only checks counter integrity. It does not measure unique‑detection counts in two configurations (ReID disabled vs ReID enabled) and does not verify that counts are reduced by at least 50% when Re‑ID is enabled, as required by the Zephyr test steps.
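The missing check could be expressed along these lines. This is a sketch only: the helper names are hypothetical, and obtaining the two counts (one run with ReID disabled, one with it enabled) is left to the test harness.

```python
def reid_reduction(count_without_reid, count_with_reid):
    """Fraction by which the unique-detection count drops when ReID is enabled."""
    if count_without_reid <= 0:
        raise ValueError("baseline count must be positive")
    return 1.0 - count_with_reid / count_without_reid


def reid_meets_objective(count_without_reid, count_with_reid, min_reduction=0.5):
    """True if enabling ReID reduced the count by at least min_reduction (50% here)."""
    return reid_reduction(count_without_reid, count_with_reid) >= min_reduction
```

A functional test would then assert `reid_meets_objective(baseline_count, reid_count)` after collecting both counts.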

Contributor

The fixtures mock_vdms and mock_log are not used by any test and should be removed. test_uuid_manager.py uses direct @patch calls.

Contributor

Again, neither of the fixtures is used in the test.

@saratpoluri saratpoluri changed the base branch from main to release-2026.0 February 24, 2026 04:53
name: {{ .Release.Name }}-tracker-config
- name: reid-config
configMap:
name: {{ .Release.Name }}-tracker-config
Contributor

Suggested change
name: {{ .Release.Name }}-tracker-config
name: {{ .Release.Name }}-reid-config

Comment on lines +240 to +242
log.info("Waiting for scene controller and VDMS to initialize...")
time.sleep(40)
log.info("Scene controller initialization wait complete")
Contributor

[To be fixed in future PR]
This might be a root cause of test flakiness; it's better to poll for component readiness (assuming we have it implemented).
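A polling loop of the kind suggested here might look like the sketch below. The `wait_until` helper is hypothetical, and the readiness predicate is a stand-in: how the scene controller or VDMS actually reports readiness is not shown in this PR.

```python
import time


def wait_until(is_ready, timeout=40.0, interval=0.5):
    """Poll is_ready() until it returns True or the timeout (seconds) expires."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in predicate for the example: reports ready on the third poll.
polls = {"count": 0}


def ready_on_third_poll():
    polls["count"] += 1
    return polls["count"] >= 3


became_ready = wait_until(ready_on_third_poll, timeout=2.0, interval=0.01)
```

Compared with a fixed `time.sleep(40)`, this returns as soon as the components are ready and fails explicitly when they are not.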

Comment on lines +56 to +58
self.stale_feature_timer = threading.Timer(1.0, check_stale_features)
self.stale_feature_timer.daemon = True
self.stale_feature_timer.start()
Contributor

You can avoid the copy-paste here by calling the "parent" function.

Suggested change
self.stale_feature_timer = threading.Timer(1.0, check_stale_features)
self.stale_feature_timer.daemon = True
self.stale_feature_timer.start()
self._start_stale_feature_timer()

self.stale_feature_timer.daemon = True
self.stale_feature_timer.start()

self.stale_feature_timer = threading.Timer(1.0, check_stale_features)
Contributor

Could we create a constant for this magic number?
Should it be configurable? If so, I suggest postponing that to a follow-up PR.


self.stale_feature_timer = threading.Timer(1.0, check_stale_features)
self.stale_feature_timer.daemon = True
self.stale_feature_timer.start()
Contributor

We should stop the timer gracefully: the stop() method should be called wherever the UUIDManager is destroyed.
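One way to get a repeating timer that can be stopped cleanly is sketched below. The `StaleFeatureChecker` class and the `STALE_FEATURE_CHECK_INTERVAL_S` constant are hypothetical names for illustration, not the PR's UUIDManager code; only `threading.Timer` and `threading.Event` are standard library.

```python
import threading
import time

STALE_FEATURE_CHECK_INTERVAL_S = 1.0  # hypothetical named constant for the 1.0 literal


class StaleFeatureChecker:
    """Runs check() periodically on a daemon timer until stop() is called."""

    def __init__(self, check, interval=STALE_FEATURE_CHECK_INTERVAL_S):
        self._check = check
        self._interval = interval
        self._stopped = threading.Event()
        self._lock = threading.Lock()
        self._timer = None
        self._start_timer()

    def _start_timer(self):
        # Single place that creates the timer, so re-arming does not copy code.
        with self._lock:
            if self._stopped.is_set():
                return
            self._timer = threading.Timer(self._interval, self._run)
            self._timer.daemon = True
            self._timer.start()

    def _run(self):
        self._check()
        self._start_timer()

    def stop(self):
        """Cancel the pending timer and wait for a running check to finish."""
        self._stopped.set()
        with self._lock:
            timer = self._timer
        if timer is not None:
            timer.cancel()
            if timer is not threading.current_thread():
                timer.join()


ticks = []
checker = StaleFeatureChecker(lambda: ticks.append(1), interval=0.01)
time.sleep(0.2)
checker.stop()
ticks_at_stop = len(ticks)
time.sleep(0.05)  # no further ticks should arrive after stop()
```

The owner of such an object would call `stop()` from its own shutdown path, which is the point raised in the comment above.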

Contributor

@tdorauintc tdorauintc left a comment

Comments starting with [future] are not blocking the PR and may be addressed in another PR.

Comment on lines 70 to 73
- source: vdms-client-key
target: certs/scenescape-vdms-c.key
- source: vdms-client-cert
target: certs/scenescape-vdms-c.crt
Contributor

To be removed. When running the out-of-box demo I get the error: service "scene" refers to undefined secret vdms-client-key: invalid compose project

Comment on lines +317 to +328
log.debug(f"[VDMS] findMatches returned {len(result)} result(s) from {len(reid_vectors)} vector(s)")
for idx, entities in enumerate(result):
if entities:
log.debug(f"[VDMS] Vector {idx}: found {len(entities)} matches")
# Log distance scores for debugging
for match_idx, entity in enumerate(entities[:3]): # Show first 3 matches
distance = entity.get('_distance', 'unknown')
uuid = entity.get('uuid', 'unknown')
rvid = entity.get('rvid', 'unknown')
log.debug(f"[VDMS] Match {match_idx}: uuid={uuid}, rvid={rvid}, distance={distance}")
return result
log.debug("[VDMS] findMatches returned None (no response from VDMS)")
Contributor

[future] This loop is only for debug purposes. For performance reasons it would make sense to execute it conditionally, e.g.:

if log.logger.isEnabledFor(logging.DEBUG):
  ...
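A self-contained version of that guard, using only the standard `logging` module, is sketched below. The `log_top_matches` function and the logger name are assumptions; the project's `log` wrapper (accessed as `log.logger` in the snippet above) may expose the check differently.

```python
import logging

log = logging.getLogger("vdms_sketch")  # hypothetical logger name


def log_top_matches(result, limit=3):
    """Log the first few matches per vector; returns how many lines were logged."""
    # Skip the whole loop unless DEBUG output is actually enabled.
    if not log.isEnabledFor(logging.DEBUG):
        return 0
    emitted = 0
    for idx, entities in enumerate(result):
        for match_idx, entity in enumerate(entities[:limit]):
            log.debug("Vector %d match %d: uuid=%s, distance=%s",
                      idx, match_idx,
                      entity.get("uuid", "unknown"),
                      entity.get("_distance", "unknown"))
            emitted += 1
    return emitted
```

The return value exists only to make the effect of the guard observable in this example.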

# Case for incrementing the counter when there is no re-id vector
if sscape_object.reidVector is None and result is None:
if sscape_object.reid is None and result is None:
self.unique_id_count += 1
Contributor

@tdorauintc tdorauintc Feb 25, 2026

I don't understand the logic that increments this counter in two places: here and in the pruneInactiveTracks function. Please make sure this is the expected behavior.

[future] Incrementing a counter in a function named isNewTrackerID is not intuitive, and it may cause double counting if someone later decides to call it from another place in the code. I would rather expect the caller of this function to increment the counter.
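The separation suggested here could be sketched as follows. The `UniqueCounter` class is a deliberately simplified, hypothetical stand-in and does not reproduce the UUIDManager's ReID logic; it only shows a side-effect-free query with the counting done by the caller.

```python
class UniqueCounter:
    """Simplified illustration: the query has no side effects, the caller counts."""

    def __init__(self):
        self.known_ids = set()
        self.unique_id_count = 0

    def is_new_tracker_id(self, tracker_id):
        # Pure query: safe to call any number of times.
        return tracker_id not in self.known_ids

    def register(self, tracker_id):
        # The caller decides when a new ID is counted, exactly once per ID.
        if self.is_new_tracker_id(tracker_id):
            self.known_ids.add(tracker_id)
            self.unique_id_count += 1


counter = UniqueCounter()
for tracker_id in (1, 1, 2):
    counter.register(tracker_id)
```

Because the query method no longer mutates state, calling it from additional places cannot inflate `unique_id_count`.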

@tdorauintc
Contributor

One more comment: I did some tests and found that metadata attributes (e.g. age, gender) are not published in tracks, whereas ReID embeddings are output in messages on the data/scene and data/regulated topics. This is inconsistent; I would expect either none or both. Is this the expected behavior?

@Irakus Irakus deleted the branch release-2026.0 February 25, 2026 15:41
@Irakus Irakus closed this Feb 25, 2026

9 participants