Conversation
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
Standard8
left a comment
There was a problem hiding this comment.
Thank you for working on this. It is great to see this moving along. Sorry for the delay in responding, there was lots of thinking and checking to do.
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
| num_ads_notshowing, | ||
| ad_blocker_inferred, | ||
| ad_components.component as ad_click_target | ||
| FROM `mozdata.firefox_desktop.serp_events` e |
There was a problem hiding this comment.
I realise the current serp_events doesn't have it, but we could really do with partner_code here as well (can we get it added there?
This would allow for better breakdowns by partner code, something that we can't analyse today with the existing tables, yet we should really be taking it into account.
There was a problem hiding this comment.
We should also have something like search_engine as provider_id. Similar to the sap_source field, the user may be searching different providers which aren't triggered via the normal search access points. Hence we need to include those as well.
Though could we also change "duckduckgo" to "ddg" within the field so that it matches the provider_id field for SAP and the search_engine_default_provider_id for the default search engine?
There was a problem hiding this comment.
We might want to also add the normalized_engine field in here as well - but only filled in from search_engine (there's no name field for SERPs, since there's a limited amount.
This would make it easier to search for both tagged and organic SERPs using the normalized_engine field, which you might want to do if you're also including SAP data.
There was a problem hiding this comment.
@Standard8: with your first comment, are we understanding correctly that you want partner_code added to the serp_events table itself (not just us pulling it in through another CTE)?
There was a problem hiding this comment.
Sorry, I realise I wasn't too clear here. Yes, the provider and partner_code data need pulling in from the serp.impression event (from serp_events). They're dedicated to that event, you can't pull them from the sap.counts event, since they're different events with potentially different totals per provider/partner code.
So, in summary, I think this serp_events cte needs the following adding:
CASE
WHEN search_engine = "duckduckgo" then "ddg"
ELSE search_engine
END as provider_id,
partner_code,
(This will need partner_code also adding to the mozdata.firefox_desktop.serp_events table...).
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
| normalized_engine, | ||
| provider_name, | ||
| provider_id, | ||
| partner_code, | ||
| overridden_by_third_party, | ||
| search_access_point, |
There was a problem hiding this comment.
As noted above, provider_id, partner_code, search_access_point should be from both sap & serp.
There was a problem hiding this comment.
@Standard8: we don't see the provider_id and partner_code fields in the serp_events table. Do you want us to pull them in from serp_impressions ping (it has provider and partner_code)? Or are you saying they should be added to the serp_events table?
There was a problem hiding this comment.
I believe provider_id (aka provider) is in the serp_events table as search_engine.
I think the partner_code should also be added to serp_events table if possible - That's an important aspect that we're currently missing.
There was a problem hiding this comment.
@Standard8 DE can add partner_code to the serp_events table. That is not a big lift. Does it need to be backfilled? I understand that will take a much longer time if needed but that they could add it to their backlog. @kbammarito or @alekhyamoz can correct me as needed.
There was a problem hiding this comment.
Presumably we can request backfill at a later time? If so, I'd suggest leaving it for now, and if we need a bit more history when we start using the new table, then we can request it.
There was a problem hiding this comment.
I believe the plan for "backfill" was to create another versioned table. We'd do a small backfill to ensure metric trends haven't changed, but the final table will concatenate the two tables: v1 if < date and v2 if > date. Stakeholders will keep using legacy data for past results, but future results will be transitioned to the new metrics.
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
Standard8
left a comment
There was a problem hiding this comment.
I've added a few responses. I'm assuming the "done" comments are ones that have been done locally, and not yet pushed to this PR.
| num_ads_notshowing, | ||
| ad_blocker_inferred, | ||
| ad_components.component as ad_click_target | ||
| FROM `mozdata.firefox_desktop.serp_events` e |
There was a problem hiding this comment.
Sorry, I realise I wasn't too clear here. Yes, the provider and partner_code data need pulling in from the serp.impression event (from serp_events). They're dedicated to that event, you can't pull them from the sap.counts event, since they're different events with potentially different totals per provider/partner code.
So, in summary, I think this serp_events cte needs the following adding:
CASE
WHEN search_engine = "duckduckgo" then "ddg"
ELSE search_engine
END as provider_id,
partner_code,
(This will need partner_code also adding to the mozdata.firefox_desktop.serp_events table...).
| normalized_engine, | ||
| provider_name, | ||
| provider_id, | ||
| partner_code, | ||
| overridden_by_third_party, | ||
| search_access_point, |
There was a problem hiding this comment.
I believe provider_id (aka provider) is in the serp_events table as search_engine.
I think the partner_code should also be added to serp_events table if possible - That's an important aspect that we're currently missing.
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v9/query.sql
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
…build-new-glean-based-equivalent-table-to-replace-search_clients_daily_v8
…able-to-replace-search_clients_daily_v8
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…build-new-glean-based-equivalent-table-to-replace-search_clients_daily_v8
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…able-to-replace-search_clients_daily_v8
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…able-to-replace-search_clients_daily_v8
This comment has been minimized.
This comment has been minimized.
…lace-search_clients_daily_v8' of github.com:mozilla/bigquery-etl into DENG-8178-build-new-glean-based-equivalent-table-to-replace-search_clients_daily_v8
Integration report for "Merge branch 'DENG-8178-build-new-glean-based-equivalent-table-to-replace-search_clients_daily_v8' of github.com:mozilla/bigquery-etl into DENG-8178-build-new-glean-based-equivalent-table-to-replace-search_clients_daily_v8"
|
Description
This PR builds
search_clients_daily_v9from Glean (migrating from legacy telemetry).Related Tickets & Documents
Reviewer, please follow this checklist