From ddf91bc4e8242079cb4f137101a29bbb93006631 Mon Sep 17 00:00:00 2001
From: WANGXIAOMIN-HIK
Date: Thu, 26 Jun 2025 16:08:03 +0800
Subject: [PATCH 01/11] Natural Language and Image-Based Search Support for Recordings

To enhance ONVIF's search capabilities, the following operations have been added to support natural language and image-based search for video recordings:

FindImagebyNL
  Purpose: Starts a search session using natural language descriptions to locate relevant video recordings.
  Example Query: "Person wearing a red hat."
  Parameters:
    StartPoint: Start time for the search.
    EndPoint: End time for the search.
    RecordingToken: (Optional) Token for the recording to search.
    Text: Natural language description for the search.
    Likelihood: (Optional) Similarity threshold for the search (0~1).
    MaxMatches: (Optional) Maximum number of matches to return.
    KeepAliveTime: Time the search session will be kept alive.
  Response:
    SearchToken: A unique reference to the search session.

GetNLSearchResults
  Purpose: Retrieves results from a natural language search session initiated by FindImagebyNL.
  Parameters:
    SearchToken: Token identifying the search session.
    MinResults: (Optional) Minimum number of results to return.
    MaxResults: (Optional) Maximum number of results to return.
    WaitTime: (Optional) Maximum time to wait for results.
  Response:
    ResultList: List of matching results, including metadata such as TargetImageURI, Time, Likelihood, and RecordingToken.

FindImagebyImage
  Purpose: Starts a search session using a target image to locate relevant video recordings.
  Parameters:
    StartPoint: Start time for the search.
    EndPoint: End time for the search.
    RecordingToken: (Optional) Token for the recording to search.
    TargetImageURI: URI of the target image to be searched.
    MaxMatches: (Optional) Maximum number of matches to return.
    KeepAliveTime: Time the search session will be kept alive.
  Response:
    SearchToken: A unique reference to the search session.

GetImageSearchResults
  Purpose: Retrieves results from an image-based search session initiated by FindImagebyImage.
  Parameters:
    SearchToken: Token identifying the search session.
    MinResults: (Optional) Minimum number of results to return.
    MaxResults: (Optional) Maximum number of results to return.
    WaitTime: (Optional) Maximum time to wait for results.
  Response:
    ResultList: List of matching results, including metadata such as TargetImageURI, Time, Likelihood, and RecordingToken.

Schema Updates
  onvif.xsd: Added complex types for FindImageResult and FindImageResultList to support result structures for both natural language and image-based searches. Included fields like TargetImageURI, Time, Likelihood, and RecordingToken.
  search.wsdl: Defined operations FindImagebyNL, GetNLSearchResults, FindImagebyImage, and GetImageSearchResults. Added request and response elements for each operation.

Documentation Updates
  RecordingSearch.xml: Added detailed descriptions for FindImagebyNL and GetNLSearchResults operations, explaining their purpose, parameters, and responses.
---
 doc/RecordingSearch.xml | 195 ++++++++++++++++++++++++++++++
 wsdl/ver10/schema/onvif.xsd | 39 ++++++
 wsdl/ver10/search.wsdl | 231 +++++++++++++++++++++++++++++++++++-
 3 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml
index e6840df9a..ec0ae4ef2 100644
--- a/doc/RecordingSearch.xml
+++ b/doc/RecordingSearch.xml
@@ -1320,6 +1320,193 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
+
+ FindImagebyNL + FindImagebyNL starts a search session, looking for video records based on a natural language description of the content, such as "a person wearing a red hat" or "a blue car". Results from the search are acquired using the GetNLSearchResults request, specifying the search token returned from this request. + The device shall continue searching until one of the following occurs: + + + The entire time range from StartPoint to EndPoint has been searched through. + + + The total number of matches has been found, defined by the MaxMatches parameter. + + + The session has been ended by a client EndSearch request. + + + The session has been ended because KeepAliveTime since the last request related to this session has expired. + + + + + request + + StartPoint [xs:dateTime] + The point of time where the search will start. + EndPoint [xs:dateTime] + The point of time where the search will stop. + RecordingToken - optional [tt:RecordingReference] + Token for the recording to search. + Text [xs:string] + Natural language description for the search. + Likelihood - optional [xs:float] + Likelihood threshold for the search (0-1). + MaxMatches - optional [xs:int] + The search ends after MaxMatches. + KeepAliveTime [xs:duration] + The session timeout after each request concerning this session. + + + + response + + SearchToken [tt:JobToken] + Identifies the search session created by this request. + + + + faults + + env:Receiver - ter:Action - ter:ResourceProblem + Device is unable to create a new search session. + + + + access class + + READ_MEDIA + + + +
+ +
+ GetNLSearchResults + GetNLSearchResults acquires the results from a natural language search session previously initiated by a FindImagebyNL operation. The response shall not include results already returned in previous requests for the same session. + + + request + + SearchToken [tt:JobToken] + Specifies the search session. + MinResults - optional [xs:int] + Specifies the minimum number of results that should be returned. + MaxResults – optional [xs:int] + Specifies the maximum number of results to return. + WaitTime – optional [xs:duration] + Defines the maximum time to block, waiting for results. + + + + response + + ResultList [tt:FindImageResultList] + A structure containing the search results. + + + + faults + + env:Sender - ter:InvalidArgVal - ter:InvalidToken + The search token is invalid. + + + + access class + + READ_MEDIA + + + +
+ +
+ FindImagebyImage + FindImagebyImage starts a search session, looking for video records based on a provided image. Results from the search are acquired using the GetImageSearchResults request, specifying the search token returned from this request. + + + request + + StartPoint [xs:dateTime] + The point of time where the search will start. + EndPoint [xs:dateTime] + The point of time where the search will stop. + RecordingToken - optional [tt:RecordingReference] + Token for the recording to search. + TargetImageURI [xs:anyURI] + URI of the image used for the search. + MaxMatches - optional [xs:int] + The search ends after MaxMatches. + KeepAliveTime [xs:duration] + The session timeout after each request concerning this session. + + + + response + + SearchToken [tt:JobToken] + Identifies the search session created by this request. + + + + faults + + env:Receiver - ter:Action - ter:ResourceProblem + Device is unable to create a new search session. + + + + access class + + READ_MEDIA + + + +
+
+ GetImageSearchResults + GetImageSearchResults acquires the results from an image-based search session previously initiated by a FindImagebyImage operation. The response shall not include results already returned in previous requests for the same session. + + + request + + SearchToken [tt:JobToken] + Specifies the search session. + MinResults - optional [xs:int] + Specifies the minimum number of results that should be returned. + MaxResults – optional [xs:int] + Specifies the maximum number of results to return. + WaitTime – optional [xs:duration] + Defines the maximum time to block, waiting for results. + + + + response + + ResultList [tt:FindImageResultList] + A structure containing the search results. + + + + faults + + env:Sender - ter:InvalidArgVal - ter:InvalidToken + The search token is invalid. + + + + access class + + READ_MEDIA + + + +
+ + + +
EndSearch EndSearch stops an ongoing search session, causing any blocking result request to return and the SearchToken to become invalid. If the search was interrupted before completion, the point in time that the search had reached shall be returned. If the search had not yet begun, the StartPoint shall be returned. Note that an error message will occur if the search session has been already completed before this request. If the search was completed the original EndPoint supplied by the Find operation shall be returned. When issuing EndSearch on a FindRecordings request the EndPoint is undefined and shall not be used since the FindRecordings request doesn't have StartPoint/EndPoint. This operation is mandatory to support for a device implementing the recording search service. @@ -1387,6 +1574,14 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace GeneralStartEvents Indicates support for general virtual property events in the FindEvents method + + NLSearch + Indicates if the device supports natural language based search for recorded video + + + ImageSearch + Indicates if the device supports image based search for recorded video +
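A minimal sketch of how a client might drain a search session through repeated GetNLSearchResults calls, assuming the behaviour described in the documentation hunk above: each response omits results already returned, and the search stops once MaxMatches is reached. The `Match` and `FakeSearchSession` classes and the `drain` helper are hypothetical stand-ins for illustration, not part of this patch, and the state strings `"Searching"` and `"Completed"` are assumed to mirror the existing search state reported in result lists.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Match:
    """Hypothetical client-side view of one result; fields follow the patch text."""
    target_image_uri: str
    time: str
    likelihood: float
    recording_token: str


@dataclass
class FakeSearchSession:
    """Hypothetical in-memory stand-in for a device-side search session."""
    matches: List[Match]
    max_matches: int
    _cursor: int = field(default=0, init=False)

    def get_results(self, max_results: int) -> Tuple[str, List[Match]]:
        # The search ends after MaxMatches, so never expose more than that.
        limit = min(len(self.matches), self.max_matches)
        end = min(self._cursor + max_results, limit)
        # Hand out only results that were not returned by an earlier call.
        batch = self.matches[self._cursor:end]
        self._cursor = end
        state = "Completed" if self._cursor >= limit else "Searching"
        return state, batch


def drain(session: FakeSearchSession, max_results: int = 2) -> List[Match]:
    """Poll until the session reports completion, collecting each result once."""
    collected: List[Match] = []
    while True:
        state, batch = session.get_results(max_results)
        collected.extend(batch)
        if state == "Completed":
            return collected
```

A real client would also pass WaitTime and MinResults and would call EndSearch if it abandons the session early; those are left out here to keep the sketch short.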
diff --git a/wsdl/ver10/schema/onvif.xsd b/wsdl/ver10/schema/onvif.xsd index d639c40d8..f8ce2fe9c 100755 --- a/wsdl/ver10/schema/onvif.xsd +++ b/wsdl/ver10/schema/onvif.xsd @@ -3565,6 +3565,12 @@ decoding .A decoder shall decode every data it receives (according to its capabi + + Indicates support for natural language based search. + + + Indicates support for image based search. + @@ -7621,6 +7627,37 @@ and sample rate. + + + + + + The state of the search when the result is returned. Indicates if there can be more results, or if the search is completed. + + + + A FindImageResult structure for each found set of image matching the search. + + + + + + + + + The detected target image URI in LocalStorage format. + + + The time when the target was found. Users can use this as a reference point to search records that occurred before and after this specific time point. + + + Likelihood score of the target between 0~1. + + + The recording where this result was found. + + + @@ -9192,6 +9229,8 @@ and sample rate. + + diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index c995be557..564a26c71 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -37,6 +37,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO Indicates support for general virtual property events in the FindEvents method. + Indicates support for natural language based search. + Indicates support for image based search. @@ -444,7 +446,143 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - + + + + + Starts a search session and specifies the search parameters. + + + + + Start time for the search. + + + End time for the search. + + + Token for the recording to search. + + + Natural language description for the search (e.g., "person with red hat"). + + + Likelihood threshold for the search between 0~1. + + + Maximum number of matches to return in the response. 
+ + + The time the search session will be kept alive after responding to this and subsequent requests. + + + + + + + + + A unique reference to the search session created by this request. + + + + + + + + + Gets results from a particular search session. + + + + + The search session to get results from. + + + The minimum number of results to return in one response. + + + The maximum number of results to return in one response. + + + The maximum time before responding to the request, even if the MinResults parameter is not fulfilled. + + + + + + + + + + + + + + + + Starts a search session and specifies the search parameters. + + + + + Start time for the search. + + + End time for the search. + + + Token for the recording to search. + + + The target image to be searched in LocalStorage URI format. + + + Maximum number of matches to return in the response. + + + The time the search session will be kept alive after responding to this and subsequent requests. + + + + + + + + + A unique reference to the search session created by this request. + + + + + + + + + + + The search session to get results from. + + + The minimum number of results to return in one response. + + + The maximum number of results to return in one response. + + + The maximum time before responding to the request, even if the MinResults parameter is not fulfilled. + + + + + + + + + + + + @@ -532,6 +670,30 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO + + + + + + + + + + + + + + + + + + + + + + + + @@ -732,6 +894,31 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO + + + Starts a natural language search session and specifies the search parameters. + + + + + + Gets results from a natural language search session. + + + + + + Starts an image-based search session and specifies the search parameters. + + + + + + + Gets results from an image-based search session. 
+ + + @@ -878,5 +1065,47 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + From c272ef28d6efcb0a54518722fb4d2a6755e9691c Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Mon, 30 Jun 2025 15:20:13 +0800 Subject: [PATCH 02/11] Expand the recording search feature to support multi-recording token search Updated document and WSDL definitions to allow multiple recording tokens to be passed in in a search operation to query multiple recordings at the same time. --- doc/RecordingSearch.xml | 4 ++-- wsdl/ver10/search.wsdl | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index ec0ae4ef2..a7737944b 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1347,7 +1347,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace EndPoint [xs:dateTime] The point of time where the search will stop. RecordingToken - optional [tt:RecordingReference] - Token for the recording to search. + This element contains a list of recording tokens to search. Text [xs:string] Natural language description for the search. Likelihood - optional [xs:float] @@ -1433,7 +1433,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace EndPoint [xs:dateTime] The point of time where the search will stop. RecordingToken - optional [tt:RecordingReference] - Token for the recording to search. + This element contains a list of recording tokens to search. TargetImageURI [xs:anyURI] URI of the image used for the search. MaxMatches - optional [xs:int] diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index 564a26c71..7e4a4037c 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -460,8 +460,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO End time for the search. - - Token for the recording to search. 
+ + This element contains a list of recording tokens to search. Natural language description for the search (e.g., "person with red hat"). @@ -531,8 +531,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO End time for the search. - - Token for the recording to search. + + This element contains a list of recording tokens to search. The target image to be searched in LocalStorage URI format. From 5b48ba2242fcb5c9c44fe015775a15aa8edc7d69 Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Mon, 14 Jul 2025 17:19:09 +0800 Subject: [PATCH 03/11] #603 update 1. Rename FindImageByImage to SearchImageByImage and FindImageByNL to SearchImageByNL. 2. Note that TargetImageURI is the result returned from SearchImageByNL. --- doc/RecordingSearch.xml | 18 +++++------ wsdl/ver10/search.wsdl | 66 ++++++++++++++++++++--------------------- 2 files changed, 42 insertions(+), 42 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index a7737944b..f9e17b3ec 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1321,8 +1321,8 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
- FindImagebyNL - FindImagebyNL starts a search session, looking for video records based on a natural language description of the content, such as "a person wearing a red hat" or "a blue car". Results from the search are acquired using the GetNLSearchResults request, specifying the search token returned from this request. + SearchImageByNL + SearchImageByNL starts a search session, looking for video records based on a natural language description of the content, such as "a person wearing a red hat" or "a blue car". Results from the search are acquired using the GetNLSearchResults request, specifying the search token returned from this request. The device shall continue searching until one of the following occurs: @@ -1347,7 +1347,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace EndPoint [xs:dateTime] The point of time where the search will stop. RecordingToken - optional [tt:RecordingReference] - This element contains a list of recording tokens to search. + Token for the recording to search. Text [xs:string] Natural language description for the search. Likelihood - optional [xs:float] @@ -1383,7 +1383,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
GetNLSearchResults - GetNLSearchResults acquires the results from a natural language search session previously initiated by a FindImagebyNL operation. The response shall not include results already returned in previous requests for the same session. + GetNLSearchResults acquires the results from a natural language search session previously initiated by a SearchImageByNL operation. The response shall not include results already returned in previous requests for the same session. request @@ -1422,8 +1422,8 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
- FindImagebyImage - FindImagebyImage starts a search session, looking for video records based on a provided image. Results from the search are acquired using the GetImageSearchResults request, specifying the search token returned from this request. + SerachImagebyImage + SerachImagebyImage starts a search session, looking for video records based on a provided image. Results from the search are acquired using the GetImageSearchResults request, specifying the search token returned from this request. request @@ -1433,9 +1433,9 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace EndPoint [xs:dateTime] The point of time where the search will stop. RecordingToken - optional [tt:RecordingReference] - This element contains a list of recording tokens to search. + Token for the recording to search. TargetImageURI [xs:anyURI] - URI of the image used for the search. + the TargetImageURI is the result returned from SearchImageByNL. MaxMatches - optional [xs:int] The search ends after MaxMatches. KeepAliveTime [xs:duration] @@ -1466,7 +1466,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
GetImageSearchResults - GetImageSearchResults acquires the results from an image-based search session previously initiated by a FindImagebyImage operation. The response shall not include results already returned in previous requests for the same session. + GetImageSearchResults acquires the results from an image-based search session previously initiated by a SerachImagebyImage operation. The response shall not include results already returned in previous requests for the same session. request diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index 7e4a4037c..017ea1dd8 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -447,8 +447,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - + + Starts a search session and specifies the search parameters. @@ -460,8 +460,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO End time for the search. - - This element contains a list of recording tokens to search. + + Token for the recording to search. Natural language description for the search (e.g., "person with red hat"). @@ -478,7 +478,7 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - + @@ -518,8 +518,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - + + Starts a search session and specifies the search parameters. @@ -531,11 +531,11 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO End time for the search. - - This element contains a list of recording tokens to search. + + Token for the recording to search. - The target image to be searched in LocalStorage URI format. + This TargetImageURI is the result returned from SearchImageByNL. Maximum number of matches to return in the response. 
@@ -546,7 +546,7 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - + @@ -670,11 +670,11 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - + + - - + + @@ -682,11 +682,11 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - + + - - + + @@ -894,11 +894,11 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - + + Starts a natural language search session and specifies the search parameters. - - + + @@ -906,11 +906,11 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - + + Starts an image-based search session and specifies the search parameters. - - + + @@ -1066,9 +1066,9 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - - + + + @@ -1086,9 +1086,9 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - - - + + + From a1eec95b2613c1914a925ead924035b9731cd549 Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Mon, 3 Nov 2025 09:41:46 +0800 Subject: [PATCH 04/11] Correct the spelling mistake : SerachImagebyImage -> SearchImagebyImage --- doc/RecordingSearch.xml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index f9e17b3ec..6a0951d51 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1422,8 +1422,8 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
- SerachImagebyImage - SerachImagebyImage starts a search session, looking for video records based on a provided image. Results from the search are acquired using the GetImageSearchResults request, specifying the search token returned from this request. + SearchImagebyImage + SearchImagebyImage starts a search session, looking for video records based on a provided image. Results from the search are acquired using the GetImageSearchResults request, specifying the search token returned from this request. request @@ -1466,7 +1466,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
 GetImageSearchResults - GetImageSearchResults acquires the results from an image-based search session previously initiated by a SerachImagebyImage operation. The response shall not include results already returned in previous requests for the same session. + GetImageSearchResults acquires the results from an image-based search session previously initiated by a SearchImagebyImage operation. The response shall not include results already returned in previous requests for the same session. request From c0b206aa39d4b4da8602932ffe3e28a6c76c3ee1 Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Fri, 7 Nov 2025 16:22:08 +0800 Subject: [PATCH 05/11] Change SearchImagebyImage to SearchImageByImage --- doc/RecordingSearch.xml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index 6a0951d51..5e565f15a 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1422,8 +1422,8 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
- SearchImagebyImage - SearchImagebyImage starts a search session, looking for video records based on a provided image. Results from the search are acquired using the GetImageSearchResults request, specifying the search token returned from this request. + SearchImageByImage + SearchImageByImage starts a search session, looking for video records based on a provided image. Results from the search are acquired using the GetImageSearchResults request, specifying the search token returned from this request. request @@ -1435,7 +1435,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace RecordingToken - optional [tt:RecordingReference] Token for the recording to search. TargetImageURI [xs:anyURI] - the TargetImageURI is the result returned from SearchImageByNL. + The TargetImageURI is the result returned from SearchImageByNL. MaxMatches - optional [xs:int] The search ends after MaxMatches. KeepAliveTime [xs:duration] @@ -1466,7 +1466,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace
GetImageSearchResults - GetImageSearchResults acquires the results from an image-based search session previously initiated by a SearchImagebyImage operation. The response shall not include results already returned in previous requests for the same session. + GetImageSearchResults acquires the results from an image-based search session previously initiated by a SearchImageByImage operation. The response shall not include results already returned in previous requests for the same session. request From 4509b2c0419c080c4685b5e4e7bb7ff84e81fb4c Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Wed, 12 Nov 2025 16:28:49 +0800 Subject: [PATCH 06/11] Update the search functionality, add the TargetImageData parameter, and provide detailed explanation of the use of TargetImageURI --- doc/RecordingSearch.xml | 6 ++++-- wsdl/ver10/schema/onvif.xsd | 2 +- wsdl/ver10/search.wsdl | 7 +++++-- 3 files changed, 10 insertions(+), 5 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index 5e565f15a..99b5ce2f7 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1434,8 +1434,10 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace The point of time where the search will stop. RecordingToken - optional [tt:RecordingReference] Token for the recording to search. - TargetImageURI [xs:anyURI] - The TargetImageURI is the result returned from SearchImageByNL. + TargetImageURI - optional [xs:anyURI] + The URI of the detected target object image. This can be either: - a local image stored in the NPL Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. + TargetImageData - optional [xs:base64Binary] + Base64-encoded target object image used for visual search. MaxMatches - optional [xs:int] The search ends after MaxMatches. 
KeepAliveTime [xs:duration] diff --git a/wsdl/ver10/schema/onvif.xsd b/wsdl/ver10/schema/onvif.xsd index f8ce2fe9c..1da32f1d1 100755 --- a/wsdl/ver10/schema/onvif.xsd +++ b/wsdl/ver10/schema/onvif.xsd @@ -7645,7 +7645,7 @@ and sample rate. - The detected target image URI in LocalStorage format. + The URI of the detected target object image. This can be either: - a local image stored in the NPL Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. The time when the target was found. Users can use this as a reference point to search records that occurred before and after this specific time point. diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index 017ea1dd8..cfe5d7649 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -534,8 +534,11 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO Token for the recording to search. - - This TargetImageURI is the result returned from SearchImageByNL. + + The URI of the detected target object image. This can be either: - a local image stored in the NPL Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. + + + Base64-encoded target object image for visual search. Maximum number of matches to return in the response. From 80803e4c66742be3440233fdbe1cbe5865ae7b3e Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Fri, 14 Nov 2025 14:56:08 +0800 Subject: [PATCH 07/11] Update the TargetImageURI description, remove the NPL prefix to unify the terminology, as the original meaning refers to the images stored internally on the device. 
--- doc/RecordingSearch.xml | 2 +- wsdl/ver10/schema/onvif.xsd | 2 +- wsdl/ver10/search.wsdl | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index 99b5ce2f7..0ec7e0c16 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1435,7 +1435,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace RecordingToken - optional [tt:RecordingReference] Token for the recording to search. TargetImageURI - optional [xs:anyURI] - The URI of the detected target object image. This can be either: - a local image stored in the NPL Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. + The URI of the detected target object image. This can be either: - a local image stored in the Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. TargetImageData - optional [xs:base64Binary] Base64-encoded target object image used for visual search. MaxMatches - optional [xs:int] diff --git a/wsdl/ver10/schema/onvif.xsd b/wsdl/ver10/schema/onvif.xsd index 1da32f1d1..e21f10da7 100755 --- a/wsdl/ver10/schema/onvif.xsd +++ b/wsdl/ver10/schema/onvif.xsd @@ -7645,7 +7645,7 @@ and sample rate. - The URI of the detected target object image. This can be either: - a local image stored in the NPL Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. + The URI of the detected target object image. This can be either: - a local image stored in the Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. The time when the target was found. Users can use this as a reference point to search records that occurred before and after this specific time point. 
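A minimal sketch of how a client might assemble the body of an image search request now that both TargetImageURI and TargetImageData are optional. It assumes that exactly one of the two is supplied per request (the patch does not state this), that the request element is named `SearchImageByImage`, and that it lives in the search service namespace shown below; the function name `build_search_image_by_image` is hypothetical. All of these should be checked against search.wsdl.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed namespace of the ONVIF search service; verify against search.wsdl.
TSE = "http://www.onvif.org/ver10/search/wsdl"


def build_search_image_by_image(
    start_point: str,
    end_point: str,
    keep_alive_time: str,
    target_image_uri: Optional[str] = None,
    target_image_data: Optional[bytes] = None,
    max_matches: Optional[int] = None,
) -> ET.Element:
    """Build the request body with either a URI or inline image bytes."""
    # Assumption: the caller provides exactly one image source.
    if (target_image_uri is None) == (target_image_data is None):
        raise ValueError("provide exactly one of target_image_uri or target_image_data")

    request = ET.Element(f"{{{TSE}}}SearchImageByImage")

    def add(name: str, text: str) -> None:
        ET.SubElement(request, f"{{{TSE}}}{name}").text = text

    # Child order follows the parameter order in the documentation table.
    add("StartPoint", start_point)
    add("EndPoint", end_point)
    if target_image_uri is not None:
        add("TargetImageURI", target_image_uri)
    else:
        # xs:base64Binary carries the image as Base64 text.
        add("TargetImageData", base64.b64encode(target_image_data).decode("ascii"))
    if max_matches is not None:
        add("MaxMatches", str(max_matches))
    add("KeepAliveTime", keep_alive_time)
    return request
```

Rejecting requests that carry both or neither image source is a design choice of this sketch; a device may define different rules, and the specification text would need to say which source wins when both are present.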
 diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index cfe5d7649..1a35aa971 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -535,7 +535,7 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO Token for the recording to search. - The URI of the detected target object image. This can be either: - a local image stored in the NPL Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. + The URI of the detected target object image. This can be either: - a local image stored in the Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. Base64-encoded target object image for visual search. From a60627939d9e1bd41abe5b58c85cd36427be6e0d Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Tue, 18 Nov 2025 10:50:17 +0800 Subject: [PATCH 08/11] Change Likelihood score to Likelihood threshold, and change FindImageResult to FindObjectImageResult. --- doc/RecordingSearch.xml | 4 ++-- wsdl/ver10/schema/onvif.xsd | 21 ++++++++++++--------- wsdl/ver10/search.wsdl | 4 ++-- 3 files changed, 16 insertions(+), 13 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index 0ec7e0c16..f1b205a5c 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1401,7 +1401,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace response - ResultList [tt:FindImageResultList] + ResultList [tt:FindObjectImageResultList] A structure containing the search results. @@ -1486,7 +1486,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace response - ResultList [tt:FindImageResultList] + ResultList [tt:FindObjectImageResultList] A structure containing the search results. 
diff --git a/wsdl/ver10/schema/onvif.xsd b/wsdl/ver10/schema/onvif.xsd index e21f10da7..8ecf772ba 100755 --- a/wsdl/ver10/schema/onvif.xsd +++ b/wsdl/ver10/schema/onvif.xsd @@ -7627,23 +7627,26 @@ and sample rate. - - + + The state of the search when the result is returned. Indicates if there can be more results, or if the search is completed. - - A FindImageResult structure for each found set of image matching the search. + + A FindObjectImageResult structure for each found set of image matching the search. - - + + + + Object unique identifier. + The URI of the detected target object image. This can be either: - a local image stored in the Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. @@ -7651,7 +7654,7 @@ and sample rate. The time when the target was found. Users can use this as a reference point to search records that occurred before and after this specific time point. - Likelihood score of the target between 0~1. + Likelihood threshold of the target between 0~1. The recording where this result was found. @@ -9229,8 +9232,8 @@ and sample rate. - - + + diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index 1a35aa971..658bc171f 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -513,7 +513,7 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - + @@ -582,7 +582,7 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - + From ec431ae72a2226159873b4ed9a873a989d1610a4 Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Tue, 16 Dec 2025 09:15:06 +0800 Subject: [PATCH 09/11] Change the "Likelihood" field to "CosineSimilarity". The description: It represents the cosine similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. 
--- doc/RecordingSearch.xml | 4 ++-- wsdl/ver10/schema/onvif.xsd | 4 ++-- wsdl/ver10/search.wsdl | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index f1b205a5c..581defe38 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1350,8 +1350,8 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace Token for the recording to search. Text [xs:string] Natural language description for the search. - Likelihood - optional [xs:float] - Likelihood threshold for the search (0-1). + CosineSimilarity - optional [xs:float] + It represents the cosine similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. MaxMatches - optional [xs:int] The search ends after MaxMatches. KeepAliveTime [xs:duration] diff --git a/wsdl/ver10/schema/onvif.xsd b/wsdl/ver10/schema/onvif.xsd index 8ecf772ba..0db9bab5b 100755 --- a/wsdl/ver10/schema/onvif.xsd +++ b/wsdl/ver10/schema/onvif.xsd @@ -7653,8 +7653,8 @@ and sample rate. The time when the target was found. Users can use this as a reference point to search records that occurred before and after this specific time point. - - Likelihood threshold of the target between 0~1. + + It represents the cosine similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. The recording where this result was found. diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index 658bc171f..e36ddfbb5 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -466,8 +466,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO Natural language description for the search (e.g., "person with red hat"). 
- - Likelihood threshold for the search between 0~1. + + It represents the cosine similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. Maximum number of matches to return in the response. From 0fd61ed921bbce199e6905f4a74adeb368e686f5 Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Mon, 26 Jan 2026 14:23:05 +0800 Subject: [PATCH 10/11] Change the "CosineSimilarity" field to "Similarity" --- doc/RecordingSearch.xml | 4 ++-- wsdl/ver10/schema/onvif.xsd | 4 ++-- wsdl/ver10/search.wsdl | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index 581defe38..acb6b5634 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1350,8 +1350,8 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace Token for the recording to search. Text [xs:string] Natural language description for the search. - CosineSimilarity - optional [xs:float] - It represents the cosine similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. + Similarity - optional [xs:float] + It represents the similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. MaxMatches - optional [xs:int] The search ends after MaxMatches. KeepAliveTime [xs:duration] diff --git a/wsdl/ver10/schema/onvif.xsd b/wsdl/ver10/schema/onvif.xsd index 0db9bab5b..0bc2d5d46 100755 --- a/wsdl/ver10/schema/onvif.xsd +++ b/wsdl/ver10/schema/onvif.xsd @@ -7653,8 +7653,8 @@ and sample rate. The time when the target was found. 
Users can use this as a reference point to search records that occurred before and after this specific time point. - - It represents the cosine similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. + + It represents the similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. The recording where this result was found. diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index e36ddfbb5..2be429082 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -466,8 +466,8 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO Natural language description for the search (e.g., "person with red hat"). - - It represents the cosine similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. + + It represents the similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. Maximum number of matches to return in the response. From 2a39b4a8a9375abd7fabaeb57ca095925d350b03 Mon Sep 17 00:00:00 2001 From: WANGXIAOMIN-HIK Date: Mon, 9 Feb 2026 15:16:18 +0800 Subject: [PATCH 11/11] We have made the following changes considering the future scalability of the interface. We have added FindNLSearchResultList and FindNLSearchResult to distinguish the return values of the GetNLSearchResults and GetImageSearchResults interfaces. 
GetNLSearchResults -> FindNLSearchResultList, FindNLSearchResult GetImageSearchResults -> FindObjectImageResultList, FindObjectImageResult --- doc/RecordingSearch.xml | 2 +- wsdl/ver10/schema/onvif.xsd | 41 ++++++++++++++++++++++++++++++++++++- wsdl/ver10/search.wsdl | 2 +- 3 files changed, 42 insertions(+), 3 deletions(-) diff --git a/doc/RecordingSearch.xml b/doc/RecordingSearch.xml index acb6b5634..0a778324b 100644 --- a/doc/RecordingSearch.xml +++ b/doc/RecordingSearch.xml @@ -1401,7 +1401,7 @@ http://www.onvif.org/ver10/tptz/ZoomSpaces/PositionGenericSpace response - ResultList [tt:FindObjectImageResultList] + ResultList [tt:FindNLSearchResultList] A structure containing the search results. diff --git a/wsdl/ver10/schema/onvif.xsd b/wsdl/ver10/schema/onvif.xsd index 0bc2d5d46..1f73cd4db 100755 --- a/wsdl/ver10/schema/onvif.xsd +++ b/wsdl/ver10/schema/onvif.xsd @@ -7661,6 +7661,42 @@ and sample rate. + + + + + + + The state of the search when the result is returned. Indicates if there can be more results, or if the search is completed. + + + + A FindNLSearchResult structure for each found set of image matching the search. + + + + + + + + + Object unique identifier. + + + The URI of the detected target object image. This can be either: - a local image stored in the Target Image repository (LocalStorage format), or - an external image provided by the client for image search or feature matching. + + + The time when the target was found. Users can use this as a reference point to search records that occurred before and after this specific time point. + + + It represents the similarity between two vectors, which is used to measure the similarity of the directions of the vectors. The closer the value is to 1, the higher the similarity; the closer the value is to 0, the lower the similarity. + + + The recording where this result was found. + + + + @@ -9234,7 +9270,10 @@ and sample rate. 
- + + + + diff --git a/wsdl/ver10/search.wsdl b/wsdl/ver10/search.wsdl index 2be429082..d8173b949 100644 --- a/wsdl/ver10/search.wsdl +++ b/wsdl/ver10/search.wsdl @@ -513,7 +513,7 @@ IN NO EVENT WILL THE CORPORATION OR ITS MEMBERS OR THEIR AFFILIATES BE LIABLE FO - +