Skip to content

Conversation

@bparth24
Copy link

@bparth24 bparth24 commented Sep 5, 2025

  • Creates new queryByExample module with comprehensive field matching logic
  • Converts VPR examples to JSON pointers for precise credential filtering
  • Updates presentations.js filter chain to use new matching capabilities
  • Adds comprehensive unit tests covering all matching scenarios
  • Replaces basic context-only filtering with full example-based matching

Previous implementation only matched @context and specific Open Badge fields. New implementation matches ALL fields specified in Query By Example requests using JSON pointer traversal and flexible value comparison logic.

- Creates new queryByExample module with comprehensive field matching logic
- Converts VPR examples to JSON pointers for precise credential filtering
- Updates presentations.js filter chain to use new matching capabilities
- Adds comprehensive unit tests covering all matching scenarios
- Replaces basic context-only filtering with full example-based matching

Previous implementation only matched @context and specific Open Badge fields.
New implementation matches ALL fields specified in Query By Example requests
using JSON pointer traversal and flexible value comparison logic.
@BigBlueHat
Copy link

@bparth24 can you test this new code with the Academic Course credential's queries both on main and the branch at credential-handler/vc-examples#91? My hope is that this patch fixes that issue also.

Copy link
Member

@dlongley dlongley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't do a full review yet, but left some comments so things could get kicked off in parallel.

Comment on lines 431 to 434
* the example to match against.
*
* @returns {Function} A filter function that returns true if the credential
* matches the Query By Example specification.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* the example to match against.
*
* @returns {Function} A filter function that returns true if the credential
* matches the Query By Example specification.
* the example to match against.
*
* @returns {Function} A filter function that returns true if the credential
* matches the Query By Example specification.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That function got removed as not needed in this commit - 4d42d81

Comment on lines 443 to 445
return ({record: {content}}) => {
// Use reusable module to check if this single credential matches
const matches = matchCredentials([content], credentialQuery);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like matchCredentials should instead just be matchCredential and match against a single VC. Callers could then run it over an array if they desired or not -- and clearly this code doesn't need to be using an array (it's just wasted cycles).

Hmm, but I see below that there's savings in only having to process credentialQuery once within the matchCredentials call, but we're losing all of that benefit here. So, instead, we need to reorganize the code such that we don't waste this effort. There a few ways to do that that come to mind real quick:

  1. Create an array of each record's content (VCs), run matchCredentials() on it, and then run over the records against the results doing an object comparison (shallow or "reference" comparison, not deep) to see which records match. A Map could also be built from content object => record and that could be used in post processing, but I doubt this saves many cycles and is maybe even less efficient that not using one.
  2. Modifying the matchCredentials output to include which indexes matched and do an index match instead of an object comparison.
  3. Modifying the matchCredentials input to accept an object with the credential property and return the whole given object in the match results, allowing the object to carry any other arbitrary data for post processing. Post process the results to convert the objects back to records. This seems to leak details about the structure being worked on into the matchCredentials API which is suboptimal / probably not a good idea.
  4. Do something funky with Symbols and Sets or something (probably not unless it's not messy / hard to understand or follow as a maintainer).

Open to other better ideas too (of course), but idea 1 in the list above (vs. any of the others in that list) keeps the interfaces and separation of concerns the cleanest, I think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like matchCredentials should instead just be matchCredential and match against a single VC. Callers could then run it over an array if they desired or not -- and clearly this code doesn't need to be using an array (it's just wasted cycles).

Hmm, but I see below that there's savings in only having to process credentialQuery once within the matchCredentials call, but we're losing all of that benefit here. So, instead, we need to reorganize the code such that we don't waste this effort. There a few ways to do that that come to mind real quick:

  1. Create an array of each record's content (VCs), run matchCredentials() on it, and then run over the records against the results doing an object comparison (shallow or "reference" comparison, not deep) to see which records match. A Map could also be built from content object => record and that could be used in post processing, but I doubt this saves many cycles and is maybe even less efficient that not using one.
  2. Modifying the matchCredentials output to include which indexes matched and do an index match instead of an object comparison.
  3. Modifying the matchCredentials input to accept an object with the credential property and return the whole given object in the match results, allowing the object to carry any other arbitrary data for post processing. Post process the results to convert the objects back to records. This seems to leak details about the structure being worked on into the matchCredentials API which is suboptimal / probably not a good idea.
  4. Do something funky with Symbols and Sets or something (probably not unless it's not messy / hard to understand or follow as a maintainer).

Open to other better ideas too (of course), but idea 1 in the list above (vs. any of the others in that list) keeps the interfaces and separation of concerns the cleanest, I think.

@dlongley

Attempt to address your concern and going with option 1 in this commit - 4d42d81

@bparth24 bparth24 requested a review from BigBlueHat September 5, 2025 18:56
* @returns {Array} Array of credentials that match the Query By Example
* specification.
*/
export function matchCredentials(credentials, queryByExample) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use named params in exported APIs:

Suggested change
export function matchCredentials(credentials, queryByExample) {
export function matchCredentials({credentials, queryByExample} = {}) {

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use named params in exported APIs:

@dlongley

Attempt to address your concern in this commit - c8fad33

Comment on lines +26 to +28
// Convert example to JSON pointers, excluding @context as it's handled
// separately
const expectedPointers = _convertExampleToPointers(example);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But @context shouldn't be handled externally (well, this wouldn't be known by this module when we're ready to separate it), so let's make sure it handles @context too. We should treat this function as if it could be separated and put into an external package for reuse by other applications.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But @context shouldn't be handled externally (well, this wouldn't be known by this module when we're ready to separate it), so let's make sure it handles @context too. We should treat this function as if it could be separated and put into an external package for reuse by other applications.

@dlongley

Attempt to address your concern in this commit - c8fad33

Comment on lines +71 to +82
// Convert to JSON pointer dictionary and extract pointer/value pairs
try {
const dict = JsonPointer.dict(exampleWithoutContext);
for(const [pointer, value] of Object.entries(dict)) {
// Skip empty objects, arrays, or null/undefined values
if(_isMatchableValue(value)) {
pointers.push({
pointer,
expectedValue: value
});
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this code over here:

const pointers = _adjustPointers(Object.keys(jsonpointer.dict(object)));

Can we find a way to expose it as an exported API in this module that this module can also use internally? There might need to be a few different options when converting an object to pointers, but it seems like we could get more reuse here.

There might be something common we can do with this as well:

https://github.com/digitalbazaar/di-sd-primitives/blob/v3.1.0/lib/select.js#L9

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this code over here:

const pointers = _adjustPointers(Object.keys(jsonpointer.dict(object)));

Can we find a way to expose it as an exported API in this module that this module can also use internally? There might need to be a few different options when converting an object to pointers, but it seems like we could get more reuse here.

There might be something common we can do with this as well:

https://github.com/digitalbazaar/di-sd-primitives/blob/v3.1.0/lib/select.js#L9

@dlongley

Attempt to address your concern in this commit - c8fad33

Comment on lines +100 to +103
// Skip null, undefined
if(value == null) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think we need to be careful here -- perhaps a null value means "value must not be present" -- which is different from "ignore this". It's true that undefined should be treated as if the associated property wasn't even there to begin with (and generally shouldn't ever happen, but good to cover the case).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think we need to be careful here -- perhaps a null value means "value must not be present" -- which is different from "ignore this". It's true that undefined should be treated as if the associated property wasn't even there to begin with (and generally shouldn't ever happen, but good to cover the case).

@dlongley

Attempt to address your concern in this commit - c8fad33

Comment on lines +105 to +108
// Skip empty arrays
if(Array.isArray(value) && value.length === 0) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think empty arrays are wildcards that mean "any array will match". Clearly under-specified in the spec, but I think this was the aim.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think empty arrays are wildcards that mean "any array will match". Clearly under-specified in the spec, but I think this was the aim.

@dlongley

Attempt to address your concern in this commit - c8fad33

Comment on lines +111 to +114
if(typeof value === 'object' && !Array.isArray(value) &&
Object.keys(value).length === 0) {
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think empty objects are considered "wildcards" -- meaning, any value will match. Clearly under-specified in the spec, but I think this was the aim.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think empty objects are considered "wildcards" -- meaning, any value will match. Clearly under-specified in the spec, but I think this was the aim.

@dlongley

Attempt to address your concern in this commit - c8fad33

Comment on lines 192 to 200
// Check if they have the same number of keys
if(keys1.length !== keys2.length) {
return false;
}

// Check if all keys and values match
return keys1.every(key =>
keys2.includes(key) && _valuesMatch(obj1[key], obj2[key])
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the object matching is "must match this object exactly", but rather, "must include these fields and values". So I suspect this function (or something else) needs an adjustment for this.

In general, if you think about this query mechanism visually, the "example" is to be placed as an overlay on a possible candidate for a match -- and as long as everything in the example "overlaps" with something in the candidate, then the candidate is considered a match.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be clearer if we provided wildcard values? Because the value thing being an actual value...which is then not used in the match...is very confusing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the object matching is "must match this object exactly", but rather, "must include these fields and values". So I suspect this function (or something else) needs an adjustment for this.

In general, if you think about this query mechanism visually, the "example" is to be placed as an overlay on a possible candidate for a match -- and as long as everything in the example "overlaps" with something in the candidate, then the candidate is considered a match.

@dlongley

Attempt to address your concern in this commit - c8fad33

Comment on lines +39 to +40
const actualValue = JsonPointer.get(credential, pointer);
const result = _valuesMatch(actualValue, expectedValue);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might need to be more careful here with arrays? We should make sure there are tests with arrays that have objects nested within them that are also arrays still work.

This code might also benefit from using that other code I linked to where just the "deepest" pointers are considered, I'm not sure.

In other words, it might be best to:

  1. Generating the deepest pointers for both the example.
  2. Get the leaf "expected values" (similar to what the code is already doing).
  3. Do some kind of adjustment for "arrays" (pointer values with numbers) -- perhaps multiplex these or save them for special multiphase processing? For example, each pointer that includes an array is really just another "example" that starts at that array and will need to try to match against any element found in the associated array in any candidate match.
  4. For each VC, check just these deepest pointers with the appropriate "match" algorithm(s).
  5. For each array case, see if there are any "candidate matches" to consider... recursively? Anyway, something to consider.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might need to be more careful here with arrays? We should make sure there are tests with arrays that have objects nested within them that are also arrays still work.

This code might also benefit from using that other code I linked to where just the "deepest" pointers are considered, I'm not sure.

In other words, it might be best to:

  1. Generating the deepest pointers for both the example.
  2. Get the leaf "expected values" (similar to what the code is already doing).
  3. Do some kind of adjustment for "arrays" (pointer values with numbers) -- perhaps multiplex these or save them for special multiphase processing? For example, each pointer that includes an array is really just another "example" that starts at that array and will need to try to match against any element found in the associated array in any candidate match.
  4. For each VC, check just these deepest pointers with the appropriate "match" algorithm(s).
  5. For each array case, see if there are any "candidate matches" to consider... recursively? Anyway, something to consider.

@dlongley

Attempt to address your concern in this commit - c8fad33

Replaces individual credential filtering with efficient batch processing
to address code review feedback on performance. Processes all credentials
at once instead of calling matchCredentials() separately for each item.

- Updates presentations.js to use batch processing approach
- Removes inefficient _matchQueryByExampleFilter function
- Extracts all credential contents, processes in single call
- Maps results back using reference comparison for O(1) lookup
- Adds integration test to verify batch processing functionality
- Removes unused profile setup from integration test
- Implements comprehensive QBE matching system with matchCredentials() function
  supporting semantic match types (mustBeNull, anyArray, anyValue, exactMatch)
- Adds selectiveDisclosureUtil.js with JSON-LD selection and pointer utilities
  including selectJsonLd(), adjustPointers(), and parsePointer() functions
- Integrates array containment matching vs positional matching for flexible
  credential queries supporting overlay matching where credentials can have
  extra fields beyond the example specification
- Handles null value vs missing field distinction with strict semantic rules
- Includes string normalization and type coercion for robust value comparison
- Integrates with existing presentations.match() system while maintaining
  backward compatibility with existing query types
- Adds debug.js utility to conditionally enable/disable console logs during
  development with support for environment variables and browser flags
- Provides comprehensive test suite with 45+ tests covering semantic features,
  edge cases, array matching, nested objects, and real-world scenarios
@bparth24 bparth24 requested a review from dlongley September 8, 2025 01:52
@bparth24
Copy link
Author

bparth24 commented Sep 8, 2025

@dlongley cc: @BigBlueHat

Outputs for newly added tests:

presentations.match()
    ✔ should match credentials using Query By Example with batch processing
  queryByExample
    matchCredentials()
      API and Basic Functionality
        ✔ should use named parameters API
        ✔ should return all credentials when no example provided
        ✔ should return all credentials when example is null
        ✔ should handle empty credentials array
      Semantic Features Tests
        ✔ should match credentials with populated arrays
      Empty Object Wildcar (anyValue)
        ✔ should match any value when example has empty object
        ✔ should match populated objects with empty object wildcard
      Null Semantic (mustBeNull)
        ✔ should match only when field is null
        ✔ should match multiple null fields
        ✔ should match individual null fields correctly
        ✔ should match when field is missing
      Overlay Matching
        ✔ should match when credential has extra fields
        ✔ should match nested objects with extra properties
      Array Matching
        ✔ should match single value against array
        ✔ should match array element
        ✔ should match arrays with common elements
        ✔ should match array elements in complex structures
      Complex Nested Structures
        ✔ should handle deep nesting with multiple levels
        ✔ should handle multiple field matching (AND logic)
        ✔ should handle complex nested object matching
      Error Handling and Edge Cases
        ✔ should handle structure mismatch gracefully
        ✔ should handle invalid credentials gracefully
        ✔ should handle complex pointer scenarios
      String Normalization and Type Coercion
        ✔ should handle string trimming
        ✔ should handle string/number coercion
        ✔ should handle reverse number/string coercion
      Real-world Scenarios
        ✔ should handle medical record queries
        ✔ should handle professional license queries
    convertExamplesToPointers()
      Basic Functionality
        ✔ should convert simple example to pointers
        ✔ should include match types for each pointer
      Context Handling
        ✔ should include @context by default
        ✔ should exclude @context when includeContext=false
      Edge Cases
        ✔ should handle empty example
        ✔ should handle null example
        ✔ should handle complex nested structures
      Integration with presentations.match()
        ✔ should work with credential store structure
        ✔ should work with original mockCredential

Finished in 1.151 secs / 0.503 secs @ 18:53:45 GMT-0700 (Pacific Daylight Time)

SUMMARY:
✔ 47 tests completed
All tests passed.

Copy link
Member

@dlongley dlongley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a partial review again -- making great progress here!

export {
ageCredentialHelpers, capabilities, cryptoSuites, exchanges,
helpers, inbox, presentations, users, validator, zcap
helpers, inbox, presentations, queryByExample, users, validator, zcap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something minor to consider here about this export:

Our long term goal is to move the queryByExample code to a separate module where anyone could import that API (from there) if desired. However, once we do that, we'd have this API surface here (if we continue to export this), that we will have to do backwards compatible support for until we do a major release to remove it. I expect that we're exporting this now for testing purposes.

I think we'd probably export it with an underscore _queryByExample to signal this (and leave a comment that it is exposed for testing purposes only) ... and we wouldn't mention exposing this in API in the changelog. We can make that change now if you'd like, or we can just support it until we do a major change in the future -- it's not that big of a deal, but worth keeping this sort of future planning in mind when exporting new API, so thought I'd mention it.

// Full query by example matching implemented via queryByExample module
// Process all credentials at once for efficiency
if(credentialQuery?.example) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Comment on lines +406 to +408
result.matches = result.matches.filter(match =>
matchingContents.includes(match.record.content)
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style fix and could be more efficient:

Suggested change
result.matches = result.matches.filter(match =>
matchingContents.includes(match.record.content)
);
const contentSet = new Set(matchingContents);
result.matches = result.matches.filter(
match => contentSet.has(match.record.content));

}

result.matches =
result.matches.filter(_openBadgeFilter({credentialQuery}));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
result.matches.filter(_openBadgeFilter({credentialQuery}));
result.matches.filter(_openBadgeFilter({credentialQuery}));

Comment on lines +32 to +35
if(!Array.isArray(credentials)) {
debugLog('Credentials is not an array:', credentials); // DEBUG
return [];
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me like we'd want this to be a TypeError:

Suggested change
if(!Array.isArray(credentials)) {
debugLog('Credentials is not an array:', credentials); // DEBUG
return [];
}
if(!Array.isArray(credentials)) {
throw new TypeError('"credentials" must be an array.');
}

Comment on lines +37 to +38
debugLog('Input credentials count:', credentials.length); // DEBUG
debugLog('Input credentials:', credentials.map(c => ({ // DEBUG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style preference for comments above their related lines:

Suggested change
debugLog('Input credentials count:', credentials.length); // DEBUG
debugLog('Input credentials:', credentials.map(c => ({ // DEBUG
// DEBUG
debugLog('Input credentials count:', credentials.length);
debugLog('Input credentials:', credentials.map(c => ({

Comment on lines +47 to +50
typeof credential === 'object' &&
!Array.isArray(credential) &&
credential.credentialSubject &&
typeof credential.credentialSubject === 'object';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
typeof credential === 'object' &&
!Array.isArray(credential) &&
credential.credentialSubject &&
typeof credential.credentialSubject === 'object';
typeof credential === 'object' &&
!Array.isArray(credential) &&
credential.credentialSubject &&
typeof credential.credentialSubject === 'object';

Comment on lines +46 to +50
const isValid = credential &&
typeof credential === 'object' &&
!Array.isArray(credential) &&
credential.credentialSubject &&
typeof credential.credentialSubject === 'object';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indent here is off, but I also think we can collapse this to:

Suggested change
const isValid = credential &&
typeof credential === 'object' &&
!Array.isArray(credential) &&
credential.credentialSubject &&
typeof credential.credentialSubject === 'object';
const isValid = credential?.credentialSubject &&
typeof credential?.credentialSubject === 'object';

It's true that we wouldn't catch a weird case of credential being an array with a property of credentialSubject that is a non-null object, but I think that's ok (that's a problem the caller should have already caught).

Though, I'm also not entirely sure why we're filtering out some invalid credentials here (even the original code wouldn't catch every possible problem) at all instead of allowing errors to occur if the caller failed to provide good inputs. We should probably allow the errors to happen in this API -- and let the caller ensure they are passing valid credentials. Thoughts?

Comment on lines +143 to +146
* {pointer, expectedValue, matchType}
* where pointer is a JSON pointer string, expectedValues is the
* expected value, and matchType describes how to match
* ('exactMatch', 'anyArray', etc.).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent should match other indent level (2), starting from prefix of * (the space after the docs asterisk isn't part of the "indent" on any line for consistency):

Suggested change
* {pointer, expectedValue, matchType}
* where pointer is a JSON pointer string, expectedValues is the
* expected value, and matchType describes how to match
* ('exactMatch', 'anyArray', etc.).
* {pointer, expectedValue, matchType}
* where pointer is a JSON pointer string, expectedValues is the
* expected value, and matchType describes how to match
* ('exactMatch', 'anyArray', etc.).

// Empty array - add it
pointers.push({pointer: currentPath, value});
} else if(typeof value === 'object' && value !== null &&
Object.keys(value).length === 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Object.keys(value).length === 0) {
Object.keys(value).length === 0) {

@dlongley
Copy link
Member

Might be able to address #91 here as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants