Implement full Query By Example matching. #87

bparth24 · 2025-09-05T16:43:17Z

Creates new queryByExample module with comprehensive field matching logic
Converts VPR examples to JSON pointers for precise credential filtering
Updates presentations.js filter chain to use new matching capabilities
Adds comprehensive unit tests covering all matching scenarios
Replaces basic context-only filtering with full example-based matching

Previous implementation only matched @context and specific Open Badge fields. New implementation matches ALL fields specified in Query By Example requests using JSON pointer traversal and flexible value comparison logic.

@context

- Creates new queryByExample module with comprehensive field matching logic - Converts VPR examples to JSON pointers for precise credential filtering - Updates presentations.js filter chain to use new matching capabilities - Adds comprehensive unit tests covering all matching scenarios - Replaces basic context-only filtering with full example-based matching Previous implementation only matched @context and specific Open Badge fields. New implementation matches ALL fields specified in Query By Example requests using JSON pointer traversal and flexible value comparison logic.

BigBlueHat · 2025-09-05T17:41:49Z

@bparth24 can you test this new code with the Academic Course credential's queries both on main and the branch at credential-handler/vc-examples#91? My hope is that this patch fixes that issue also.

lib/queryByExample.js

dlongley

Didn't do a full review yet, but left some comments so things could get kicked off in parallel.

dlongley · 2025-09-05T18:39:32Z

lib/presentations.js

+ *  the example to match against.
+ *
+ * @returns {Function} A filter function that returns true if the credential
+ *  matches the Query By Example specification.


Suggested change

* the example to match against.

*

* @returns {Function} A filter function that returns true if the credential

* matches the Query By Example specification.

* the example to match against.

*

* @returns {Function} A filter function that returns true if the credential

* matches the Query By Example specification.

That function got removed as not needed in this commit - 4d42d81

dlongley · 2025-09-05T18:41:49Z

lib/presentations.js

+  return ({record: {content}}) => {
+    // Use reusable module to check if this single credential matches
+    const matches = matchCredentials([content], credentialQuery);


It looks like matchCredentials should instead just be matchCredential and match against a single VC. Callers could then run it over an array if they desired or not -- and clearly this code doesn't need to be using an array (it's just wasted cycles).

Hmm, but I see below that there's savings in only having to process credentialQuery once within the matchCredentials call, but we're losing all of that benefit here. So, instead, we need to reorganize the code such that we don't waste this effort. There a few ways to do that that come to mind real quick:

Create an array of each record's content (VCs), run matchCredentials() on it, and then run over the records against the results doing an object comparison (shallow or "reference" comparison, not deep) to see which records match. A Map could also be built from content object => record and that could be used in post processing, but I doubt this saves many cycles and is maybe even less efficient that not using one.

Modifying the matchCredentials output to include which indexes matched and do an index match instead of an object comparison.

Modifying the matchCredentials input to accept an object with the credential property and return the whole given object in the match results, allowing the object to carry any other arbitrary data for post processing. Post process the results to convert the objects back to records. This seems to leak details about the structure being worked on into the matchCredentials API which is suboptimal / probably not a good idea.

Do something funky with Symbols and Sets or something (probably not unless it's not messy / hard to understand or follow as a maintainer).

Open to other better ideas too (of course), but idea 1 in the list above (vs. any of the others in that list) keeps the interfaces and separation of concerns the cleanest, I think.

It looks like matchCredentials should instead just be matchCredential and match against a single VC. Callers could then run it over an array if they desired or not -- and clearly this code doesn't need to be using an array (it's just wasted cycles).

Hmm, but I see below that there's savings in only having to process credentialQuery once within the matchCredentials call, but we're losing all of that benefit here. So, instead, we need to reorganize the code such that we don't waste this effort. There a few ways to do that that come to mind real quick:

Create an array of each record's content (VCs), run matchCredentials() on it, and then run over the records against the results doing an object comparison (shallow or "reference" comparison, not deep) to see which records match. A Map could also be built from content object => record and that could be used in post processing, but I doubt this saves many cycles and is maybe even less efficient that not using one.

Modifying the matchCredentials output to include which indexes matched and do an index match instead of an object comparison.

Modifying the matchCredentials input to accept an object with the credential property and return the whole given object in the match results, allowing the object to carry any other arbitrary data for post processing. Post process the results to convert the objects back to records. This seems to leak details about the structure being worked on into the matchCredentials API which is suboptimal / probably not a good idea.

Do something funky with Symbols and Sets or something (probably not unless it's not messy / hard to understand or follow as a maintainer).

Open to other better ideas too (of course), but idea 1 in the list above (vs. any of the others in that list) keeps the interfaces and separation of concerns the cleanest, I think.

@dlongley

Attempt to address your concern and going with option 1 in this commit - 4d42d81

dlongley · 2025-09-05T18:53:27Z

lib/queryByExample.js

+ * @returns {Array} Array of credentials that match the Query By Example
+ *  specification.
+ */
+export function matchCredentials(credentials, queryByExample) {


Use named params in exported APIs:

Suggested change

export function matchCredentials(credentials, queryByExample) {

export function matchCredentials({credentials, queryByExample} = {}) {

Use named params in exported APIs:

@dlongley

Attempt to address your concern in this commit - c8fad33

dlongley · 2025-09-05T18:55:11Z

lib/queryByExample.js

+  // Convert example to JSON pointers, excluding @context as it's handled
+  // separately
+  const expectedPointers = _convertExampleToPointers(example);


But @context shouldn't be handled externally (well, this wouldn't be known by this module when we're ready to separate it), so let's make sure it handles @context too. We should treat this function as if it could be separated and put into an external package for reuse by other applications.

But @context shouldn't be handled externally (well, this wouldn't be known by this module when we're ready to separate it), so let's make sure it handles @context too. We should treat this function as if it could be separated and put into an external package for reuse by other applications.

@dlongley

Attempt to address your concern in this commit - c8fad33

dlongley · 2025-09-05T18:58:12Z

lib/queryByExample.js

+  // Convert to JSON pointer dictionary and extract pointer/value pairs
+  try {
+    const dict = JsonPointer.dict(exampleWithoutContext);
+    for(const [pointer, value] of Object.entries(dict)) {
+      // Skip empty objects, arrays, or null/undefined values
+      if(_isMatchableValue(value)) {
+        pointers.push({
+          pointer,
+          expectedValue: value
+        });
+      }
+    }


We have this code over here:

bedrock-web-wallet/lib/presentations.js

Line 238 in 212ac42

const pointers = _adjustPointers(Object.keys(jsonpointer.dict(object)));

Can we find a way to expose it as an exported API in this module that this module can also use internally? There might need to be a few different options when converting an object to pointers, but it seems like we could get more reuse here.

There might be something common we can do with this as well:

https://github.com/digitalbazaar/di-sd-primitives/blob/v3.1.0/lib/select.js#L9

We have this code over here:

bedrock-web-wallet/lib/presentations.js

Line 238 in 212ac42

const pointers = _adjustPointers(Object.keys(jsonpointer.dict(object)));

Can we find a way to expose it as an exported API in this module that this module can also use internally? There might need to be a few different options when converting an object to pointers, but it seems like we could get more reuse here.

There might be something common we can do with this as well:

https://github.com/digitalbazaar/di-sd-primitives/blob/v3.1.0/lib/select.js#L9

@dlongley

Attempt to address your concern in this commit - c8fad33

dlongley · 2025-09-05T19:07:29Z

lib/queryByExample.js

+  // Skip null, undefined
+  if(value == null) {
+    return false;
+  }


Hmm, I think we need to be careful here -- perhaps a null value means "value must not be present" -- which is different from "ignore this". It's true that undefined should be treated as if the associated property wasn't even there to begin with (and generally shouldn't ever happen, but good to cover the case).

Hmm, I think we need to be careful here -- perhaps a null value means "value must not be present" -- which is different from "ignore this". It's true that undefined should be treated as if the associated property wasn't even there to begin with (and generally shouldn't ever happen, but good to cover the case).

@dlongley

Attempt to address your concern in this commit - c8fad33

dlongley · 2025-09-05T19:07:47Z

lib/queryByExample.js

+  // Skip empty arrays
+  if(Array.isArray(value) && value.length === 0) {
+    return false;
+  }


Hmm, I think empty arrays are wildcards that mean "any array will match". Clearly under-specified in the spec, but I think this was the aim.

Hmm, I think empty arrays are wildcards that mean "any array will match". Clearly under-specified in the spec, but I think this was the aim.

@dlongley

Attempt to address your concern in this commit - c8fad33

dlongley · 2025-09-05T19:08:12Z

lib/queryByExample.js

+  if(typeof value === 'object' && !Array.isArray(value) &&
+      Object.keys(value).length === 0) {
+    return false;
+  }


I think empty objects are considered "wildcards" -- meaning, any value will match. Clearly under-specified in the spec, but I think this was the aim.

I think empty objects are considered "wildcards" -- meaning, any value will match. Clearly under-specified in the spec, but I think this was the aim.

@dlongley

Attempt to address your concern in this commit - c8fad33

dlongley · 2025-09-05T19:12:47Z

lib/queryByExample.js

+  // Check if they have the same number of keys
+  if(keys1.length !== keys2.length) {
+    return false;
+  }
+
+  // Check if all keys and values match
+  return keys1.every(key =>
+    keys2.includes(key) && _valuesMatch(obj1[key], obj2[key])
+  );


I don't think the object matching is "must match this object exactly", but rather, "must include these fields and values". So I suspect this function (or something else) needs an adjustment for this.

In general, if you think about this query mechanism visually, the "example" is to be placed as an overlay on a possible candidate for a match -- and as long as everything in the example "overlaps" with something in the candidate, then the candidate is considered a match.

Would it be clearer if we provided wildcard values? Because the value thing being an actual value...which is then not used in the match...is very confusing.

I don't think the object matching is "must match this object exactly", but rather, "must include these fields and values". So I suspect this function (or something else) needs an adjustment for this.

In general, if you think about this query mechanism visually, the "example" is to be placed as an overlay on a possible candidate for a match -- and as long as everything in the example "overlaps" with something in the candidate, then the candidate is considered a match.

@dlongley

Attempt to address your concern in this commit - c8fad33

dlongley · 2025-09-05T19:22:10Z

lib/queryByExample.js

+        const actualValue = JsonPointer.get(credential, pointer);
+        const result = _valuesMatch(actualValue, expectedValue);


I think we might need to be more careful here with arrays? We should make sure there are tests with arrays that have objects nested within them that are also arrays still work.

This code might also benefit from using that other code I linked to where just the "deepest" pointers are considered, I'm not sure.

In other words, it might be best to:

Generating the deepest pointers for both the example.

Get the leaf "expected values" (similar to what the code is already doing).

Do some kind of adjustment for "arrays" (pointer values with numbers) -- perhaps multiplex these or save them for special multiphase processing? For example, each pointer that includes an array is really just another "example" that starts at that array and will need to try to match against any element found in the associated array in any candidate match.

For each VC, check just these deepest pointers with the appropriate "match" algorithm(s).

For each array case, see if there are any "candidate matches" to consider... recursively? Anyway, something to consider.

I think we might need to be more careful here with arrays? We should make sure there are tests with arrays that have objects nested within them that are also arrays still work.

This code might also benefit from using that other code I linked to where just the "deepest" pointers are considered, I'm not sure.

In other words, it might be best to:

Generating the deepest pointers for both the example.

Get the leaf "expected values" (similar to what the code is already doing).

Do some kind of adjustment for "arrays" (pointer values with numbers) -- perhaps multiplex these or save them for special multiphase processing? For example, each pointer that includes an array is really just another "example" that starts at that array and will need to try to match against any element found in the associated array in any candidate match.

For each VC, check just these deepest pointers with the appropriate "match" algorithm(s).

For each array case, see if there are any "candidate matches" to consider... recursively? Anyway, something to consider.

@dlongley

Attempt to address your concern in this commit - c8fad33

Replaces individual credential filtering with efficient batch processing to address code review feedback on performance. Processes all credentials at once instead of calling matchCredentials() separately for each item. - Updates presentations.js to use batch processing approach - Removes inefficient _matchQueryByExampleFilter function - Extracts all credential contents, processes in single call - Maps results back using reference comparison for O(1) lookup - Adds integration test to verify batch processing functionality - Removes unused profile setup from integration test

- Implements comprehensive QBE matching system with matchCredentials() function supporting semantic match types (mustBeNull, anyArray, anyValue, exactMatch) - Adds selectiveDisclosureUtil.js with JSON-LD selection and pointer utilities including selectJsonLd(), adjustPointers(), and parsePointer() functions - Integrates array containment matching vs positional matching for flexible credential queries supporting overlay matching where credentials can have extra fields beyond the example specification - Handles null value vs missing field distinction with strict semantic rules - Includes string normalization and type coercion for robust value comparison - Integrates with existing presentations.match() system while maintaining backward compatibility with existing query types - Adds debug.js utility to conditionally enable/disable console logs during development with support for environment variables and browser flags - Provides comprehensive test suite with 45+ tests covering semantic features, edge cases, array matching, nested objects, and real-world scenarios

bparth24 · 2025-09-08T01:57:12Z

@dlongley cc: @BigBlueHat

Outputs for newly added tests:

presentations.match()
    ✔ should match credentials using Query By Example with batch processing
  queryByExample
    matchCredentials()
      API and Basic Functionality
        ✔ should use named parameters API
        ✔ should return all credentials when no example provided
        ✔ should return all credentials when example is null
        ✔ should handle empty credentials array
      Semantic Features Tests
        ✔ should match credentials with populated arrays
      Empty Object Wildcar (anyValue)
        ✔ should match any value when example has empty object
        ✔ should match populated objects with empty object wildcard
      Null Semantic (mustBeNull)
        ✔ should match only when field is null
        ✔ should match multiple null fields
        ✔ should match individual null fields correctly
        ✔ should match when field is missing
      Overlay Matching
        ✔ should match when credential has extra fields
        ✔ should match nested objects with extra properties
      Array Matching
        ✔ should match single value against array
        ✔ should match array element
        ✔ should match arrays with common elements
        ✔ should match array elements in complex structures
      Complex Nested Structures
        ✔ should handle deep nesting with multiple levels
        ✔ should handle multiple field matching (AND logic)
        ✔ should handle complex nested object matching
      Error Handling and Edge Cases
        ✔ should handle structure mismatch gracefully
        ✔ should handle invalid credentials gracefully
        ✔ should handle complex pointer scenarios
      String Normalization and Type Coercion
        ✔ should handle string trimming
        ✔ should handle string/number coercion
        ✔ should handle reverse number/string coercion
      Real-world Scenarios
        ✔ should handle medical record queries
        ✔ should handle professional license queries
    convertExamplesToPointers()
      Basic Functionality
        ✔ should convert simple example to pointers
        ✔ should include match types for each pointer
      Context Handling
        ✔ should include @context by default
        ✔ should exclude @context when includeContext=false
      Edge Cases
        ✔ should handle empty example
        ✔ should handle null example
        ✔ should handle complex nested structures
      Integration with presentations.match()
        ✔ should work with credential store structure
        ✔ should work with original mockCredential

Finished in 1.151 secs / 0.503 secs @ 18:53:45 GMT-0700 (Pacific Daylight Time)

SUMMARY:
✔ 47 tests completed
All tests passed.

dlongley

Only a partial review again -- making great progress here!

dlongley · 2025-09-08T15:11:04Z

lib/index.js

 export {
  ageCredentialHelpers, capabilities, cryptoSuites, exchanges,
-  helpers, inbox, presentations, users, validator, zcap
+  helpers, inbox, presentations, queryByExample, users, validator, zcap


Something minor to consider here about this export:

Our long term goal is to move the queryByExample code to a separate module where anyone could import that API (from there) if desired. However, once we do that, we'd have this API surface here (if we continue to export this), that we will have to do backwards compatible support for until we do a major release to remove it. I expect that we're exporting this now for testing purposes.

I think we'd probably export it with an underscore _queryByExample to signal this (and leave a comment that it is exposed for testing purposes only) ... and we wouldn't mention exposing this in API in the changelog. We can make that change now if you'd like, or we can just support it until we do a major change in the future -- it's not that big of a deal, but worth keeping this sort of future planning in mind when exporting new API, so thought I'd mention it.

dlongley · 2025-09-08T15:11:56Z

lib/presentations.js

+    // Full query by example matching implemented via queryByExample module
+    // Process all credentials at once for efficiency
+    if(credentialQuery?.example) {
+


Suggested change

dlongley · 2025-09-08T15:12:19Z

lib/presentations.js

+      result.matches = result.matches.filter(match =>
+        matchingContents.includes(match.record.content)
+      );


Style fix and could be more efficient:

Suggested change

result.matches = result.matches.filter(match =>

matchingContents.includes(match.record.content)

);

const contentSet = new Set(matchingContents);

result.matches = result.matches.filter(

match => contentSet.has(match.record.content));

dlongley · 2025-09-08T15:12:31Z

lib/presentations.js

+    }
+
+    result.matches =
+        result.matches.filter(_openBadgeFilter({credentialQuery}));


Suggested change

result.matches.filter(_openBadgeFilter({credentialQuery}));

result.matches.filter(_openBadgeFilter({credentialQuery}));

dlongley · 2025-09-08T15:17:52Z

lib/queryByExample.js

+  if(!Array.isArray(credentials)) {
+    debugLog('Credentials is not an array:', credentials); // DEBUG
+    return [];
+  }


Seems to me like we'd want this to be a TypeError:

Suggested change

if(!Array.isArray(credentials)) {

debugLog('Credentials is not an array:', credentials); // DEBUG

return [];

}

if(!Array.isArray(credentials)) {

throw new TypeError('"credentials" must be an array.');

}

dlongley · 2025-09-08T15:18:28Z

lib/queryByExample.js

+  debugLog('Input credentials count:', credentials.length); // DEBUG
+  debugLog('Input credentials:', credentials.map(c => ({ // DEBUG


Style preference for comments above their related lines:

Suggested change

debugLog('Input credentials count:', credentials.length); // DEBUG

debugLog('Input credentials:', credentials.map(c => ({ // DEBUG

// DEBUG

debugLog('Input credentials count:', credentials.length);

debugLog('Input credentials:', credentials.map(c => ({

dlongley · 2025-09-08T15:19:56Z

lib/queryByExample.js

+           typeof credential === 'object' &&
+           !Array.isArray(credential) &&
+           credential.credentialSubject &&
+           typeof credential.credentialSubject === 'object';


Suggested change

typeof credential === 'object' &&

!Array.isArray(credential) &&

credential.credentialSubject &&

typeof credential.credentialSubject === 'object';

typeof credential === 'object' &&

!Array.isArray(credential) &&

credential.credentialSubject &&

typeof credential.credentialSubject === 'object';

dlongley · 2025-09-08T15:27:39Z

lib/queryByExample.js

+    const isValid = credential &&
+           typeof credential === 'object' &&
+           !Array.isArray(credential) &&
+           credential.credentialSubject &&
+           typeof credential.credentialSubject === 'object';


The indent here is off, but I also think we can collapse this to:

Suggested change

const isValid = credential &&

typeof credential === 'object' &&

!Array.isArray(credential) &&

credential.credentialSubject &&

typeof credential.credentialSubject === 'object';

const isValid = credential?.credentialSubject &&

typeof credential?.credentialSubject === 'object';

It's true that we wouldn't catch a weird case of credential being an array with a property of credentialSubject that is a non-null object, but I think that's ok (that's a problem the caller should have already caught).

Though, I'm also not entirely sure why we're filtering out some invalid credentials here (even the original code wouldn't catch every possible problem) at all instead of allowing errors to occur if the caller failed to provide good inputs. We should probably allow the errors to happen in this API -- and let the caller ensure they are passing valid credentials. Thoughts?

dlongley · 2025-09-08T15:30:37Z

lib/queryByExample.js

+ *  {pointer, expectedValue, matchType}
+ *  where pointer is a JSON pointer string, expectedValues is the
+ *  expected value, and matchType describes how to match
+ *  ('exactMatch', 'anyArray', etc.).


Indent should match other indent level (2), starting from prefix of * (the space after the docs asterisk isn't part of the "indent" on any line for consistency):

Suggested change

* {pointer, expectedValue, matchType}

* where pointer is a JSON pointer string, expectedValues is the

* expected value, and matchType describes how to match

* ('exactMatch', 'anyArray', etc.).

* {pointer, expectedValue, matchType}

* where pointer is a JSON pointer string, expectedValues is the

* expected value, and matchType describes how to match

* ('exactMatch', 'anyArray', etc.).

dlongley · 2025-09-08T15:33:24Z

lib/queryByExample.js

+      // Empty array - add it
+      pointers.push({pointer: currentPath, value});
+    } else if(typeof value === 'object' && value !== null &&
+              Object.keys(value).length === 0) {


Suggested change

Object.keys(value).length === 0) {

Object.keys(value).length === 0) {

dlongley · 2025-10-21T19:56:08Z

Might be able to address #91 here as well.

bparth24 requested review from BigBlueHat, dlongley, msporny and tminard September 5, 2025 16:43

BigBlueHat reviewed Sep 5, 2025

View reviewed changes

lib/queryByExample.js Outdated Show resolved Hide resolved

Remove DEBUG logs

07e243a

dlongley requested changes Sep 5, 2025

View reviewed changes

bparth24 requested a review from BigBlueHat September 5, 2025 18:56

dlongley requested changes Sep 5, 2025

View reviewed changes

bparth24 added 2 commits September 5, 2025 15:05

bparth24 requested a review from dlongley September 8, 2025 01:52

dlongley reviewed Sep 8, 2025

View reviewed changes

	export function matchCredentials(credentials, queryByExample) {
	export function matchCredentials({credentials, queryByExample} = {}) {

		const actualValue = JsonPointer.get(credential, pointer);
		const result = _valuesMatch(actualValue, expectedValue);

	result.matches.filter(_openBadgeFilter({credentialQuery}));
	result.matches.filter(_openBadgeFilter({credentialQuery}));

		debugLog('Input credentials count:', credentials.length); // DEBUG
		debugLog('Input credentials:', credentials.map(c => ({ // DEBUG

	Object.keys(value).length === 0) {
	Object.keys(value).length === 0) {

Implement full Query By Example matching. #87

Are you sure you want to change the base?

Implement full Query By Example matching. #87

Uh oh!

Conversation

bparth24 commented Sep 5, 2025

Uh oh!

BigBlueHat commented Sep 5, 2025

Uh oh!

Uh oh!

dlongley left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bparth24 commented Sep 8, 2025

Uh oh!

dlongley left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!