The URLCollector class runs after the search step and de-duplicates products based on cleaned URLs (tracking parameters removed). The subsequent ZyteAPI step can add a resolved URL (following redirects, etc.) that differs from the URL received from the search.
As a result, duplicate products can slip through unfiltered.
One solution would be to de-duplicate once more after the ZyteAPI step, this time based on url_resolved:
- check if url_resolved is in the collection
- if yes: filter the product
- else: add url_resolved to the collection
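The steps above could be sketched roughly as follows. This is only an illustration, not the existing implementation: the Product dataclass, the ResolvedURLCollector class, and its method names are hypothetical, and falling back to the search URL when url_resolved is missing is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Product:
    # Hypothetical product record; only the fields needed for this sketch.
    url: str
    url_resolved: Optional[str] = None


class ResolvedURLCollector:
    """Hypothetical second de-duplication pass keyed on url_resolved."""

    def __init__(self) -> None:
        # Resolved URLs seen so far.
        self.collected: Set[str] = set()

    def is_duplicate(self, product: Product) -> bool:
        # Assumption: fall back to the search URL if no resolved URL was set.
        key = product.url_resolved or product.url
        if key in self.collected:
            return True  # already in the collection: filter the product
        self.collected.add(key)  # otherwise remember it and keep the product
        return False

    def filter(self, products: Iterable[Product]) -> List[Product]:
        # Keep the first product seen for each resolved URL, in input order.
        return [p for p in products if not self.is_duplicate(p)]


# Two different search URLs that redirect to the same product page.
products = [
    Product(url="https://shop.example/a?ref=1", url_resolved="https://shop.example/item/42"),
    Product(url="https://shop.example/b", url_resolved="https://shop.example/item/42"),
    Product(url="https://shop.example/c", url_resolved="https://shop.example/item/7"),
]
unique = ResolvedURLCollector().filter(products)
```

In this sketch the first product seen for a given resolved URL is kept and later ones are filtered. Whether url_resolved should also have its tracking parameters removed before the comparison is left open here.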