Releases: mozilla/mozanalysis

Use branchSlug for Nimbus exposures

15 Feb 17:50
4d11b68

Merge pull request #168 from mozilla/danielkberry-patch-1

Change branch key for normandy exposures

2021.6.2: Rename search_clients_daily (#134)

16 Jun 19:15
f142c21

Handle search_clients_daily rename

2021.6.1

01 Jun 22:55
b9ea2d0

Allow no-op filter_outliers calls (#133)

2021.4.2: Use concrete table for main data source (#127)

09 Apr 19:25
b5d3033

Change the main desktop metrics data source to reference telemetry_stable.main_v4 instead of the telemetry.main view, which reduces query complexity.

2021.1.2: Add metrics for disabling Pocket in New Tab. (#122)

28 Jan 23:23
40dc0ef

  • Add metrics for disabling Pocket in New Tab.

Welcome to 2021

04 Jan 19:03
3614a8d

Not everything is broken now

Christmas miracle

21 Dec 17:58
e6a2092

  • Add a new new_unique_profiles segment based on clients_first_seen
  • Perform an explicit enrollment query and cache the result

Backwards-incompatible changes:

  • BigQueryClient.run_script_or_fetch is removed

2020.12.3: Make metric datasets configurable at runtime (#114)

10 Dec 21:20
7664a9b

Re-release with correct version number.

Backwards-incompatible changes:

  • Hides DataSource.from_expr and SegmentDataSource.from_expr. Consumers should call DataSource.from_expr_for(dataset=None) instead.

--

  • Make metric datasets configurable at runtime

Different release channels of Glean apps have their data stored in different BigQuery datasets. This departs from the legacy telemetry practice on desktop, where data from all release channels end up in a single table.
This means that, although it's reasonable to talk about a set of metrics that are valid across the different Fenix release channels, the underlying DataSources need to be customized for each release channel.
I had once expected that this would be easy to do just by defining an alternate set of DataSources and rebasing the Metrics onto the new DataSources -- but the events data source is a good example of why this isn't as easy as I'd like. DataSources can be complex and it would be annoying to redefine them.

Instead, allow DataSources to be templated over dataset, and have the Experiment class forward some information from the client about which dataset the metrics should be based on.

  • Default to new-hotness Fenix release app_id

  • Fixes a bug in the Fenix ping count metrics.
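
The "templated over dataset" idea from this release can be sketched as follows. This is a minimal illustration, not mozanalysis's actual implementation: the TemplatedDataSource class, its fields, and the project, dataset, and table names are all hypothetical. Only the from_expr_for(dataset=None) method name comes from the notes above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TemplatedDataSource:
    """Hypothetical stand-in for a data source templated over dataset."""

    name: str
    from_expr: str  # may contain a "{dataset}" placeholder
    default_dataset: Optional[str] = None

    def from_expr_for(self, dataset: Optional[str] = None) -> str:
        """Return the FROM expression with the dataset filled in.

        Falls back to the default dataset when the caller passes None.
        """
        effective = dataset or self.default_dataset
        if effective is None:
            if "{dataset}" in self.from_expr:
                raise ValueError(f"{self.name}: a dataset is required")
            return self.from_expr
        return self.from_expr.format(dataset=effective)


# Illustrative names only; real project and dataset names will differ.
baseline = TemplatedDataSource(
    name="baseline",
    from_expr="`example_project.{dataset}.baseline`",
    default_dataset="release_dataset",
)

default_expr = baseline.from_expr_for()
beta_expr = baseline.from_expr_for("beta_dataset")
print(default_expr)
print(beta_expr)
```

The point of the design is that one definition serves every release channel: the caller (here, whatever plays the role of the Experiment class) supplies the dataset at query-build time instead of each channel needing its own copy of a complex data source.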

20.12.3: Make metric datasets configurable at runtime (#114)

08 Dec 18:28
7664a9b

Backwards-incompatible changes:

  • Hides DataSource.from_expr and SegmentDataSource.from_expr. Consumers should call DataSource.from_expr_for(dataset=None) instead.

--

  • Make metric datasets configurable at runtime

Different release channels of Glean apps have their data stored in different BigQuery datasets. This departs from the legacy telemetry practice on desktop, where data from all release channels end up in a single table.
This means that, although it's reasonable to talk about a set of metrics that are valid across the different Fenix release channels, the underlying DataSources need to be customized for each release channel.
I had once expected that this would be easy to do just by defining an alternate set of DataSources and rebasing the Metrics onto the new DataSources -- but the events data source is a good example of why this isn't as easy as I'd like. DataSources can be complex and it would be annoying to redefine them.

Instead, allow DataSources to be templated over dataset, and have the Experiment class forward some information from the client about which dataset the metrics should be based on.

  • Default to new-hotness Fenix release app_id

  • Fixes a bug in the Fenix ping count metrics.

2020.12.2: Fix enrollments query (#112)

01 Dec 23:50
d00d0e2

Fast follow from 2020.12.1, which contained a silly SQL error.

Release notes for 2020.12.1 again:

This release makes two backwards-incompatible changes:

  • BigQueryContext.get_query is removed and replaced by .run_script_or_fetch()
  • Experiment.build_query is replaced by .build_query_template()

There are new dependencies on pyarrow and google-cloud-bigquery-storage to enable fast DataFrame downloads by default.

Changes include:

  • Pocket metrics are revised

  • Materialize enrollments table

Repeated references to the enrollments CTE seem to stress BigQuery out, giving rise to "Resources exceeded during query execution: Not enough resources for query planning - too many subqueries or query is too complex." errors.

If we use CREATE TEMPORARY TABLE to materialize the enrollments table before joining it everywhere, we get a scripting query with similar semantics and a calmer query planner.
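
As a rough sketch of that rewrite, the helper below assembles a two-statement script: one statement materializes the enrollments table, and the next refers to it by name. The function name build_enrollments_script and the example queries and table names are hypothetical; mozanalysis generates its SQL differently.

```python
import textwrap


def build_enrollments_script(enrollments_query: str, analysis_query: str) -> str:
    """Return a multi-statement script that materializes enrollments first.

    `analysis_query` is expected to refer to the temporary table by the
    name `enrollments` wherever a CTE would otherwise be repeated.
    """
    return (
        "CREATE TEMPORARY TABLE enrollments AS (\n"
        f"{textwrap.indent(enrollments_query.strip(), '    ')}\n"
        ");\n\n"
        f"{analysis_query.strip()};\n"
    )


# Toy queries with made-up table names, for illustration only.
script = build_enrollments_script(
    "SELECT client_id, branch FROM raw_enrollments",
    "SELECT branch, COUNT(*) AS n FROM enrollments GROUP BY branch",
)
print(script)
```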

  • Write destination table in the script

It's illegal to specify a destination table name (or a create/write disposition) if a query is a script, so we need to direct the output to a named table without passing those parameters into the API.

This breaks the public API for build_query by adding a new mandatory parameter. Requiring all arguments to be passed by keyword makes this breakage safe: an existing positional call fails loudly instead of silently binding one of the existing optional arguments to the destination_table argument.

Finally, we can't rely on taking a hash of the final query to use in the name of the destination table anymore, because the query contains the name of the destination table! Get around this by emitting a template instead.
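
The last two points can be illustrated together. The sketch below is hypothetical and does not reproduce the real Experiment.build_query_template signature: the parameter names, the "analysis" prefix, and the helper functions are assumptions. It shows keyword-only arguments rejecting positional calls, and a table name derived from a hash of the template rather than of the final query.

```python
import hashlib


def build_query_template(*, select_sql: str, where_sql: str = "TRUE") -> str:
    """Hypothetical builder: returns SQL with a {destination_table} placeholder.

    The bare `*` makes every argument keyword-only, so a positional call
    raises TypeError instead of binding to the wrong parameter.
    """
    return (
        "CREATE OR REPLACE TABLE {destination_table} AS (\n"
        f"    {select_sql}\n"
        f"    WHERE {where_sql}\n"
        ");"
    )


def destination_table_for(template: str, prefix: str = "analysis") -> str:
    """Name the table from a hash of the template, which has no table name yet."""
    digest = hashlib.sha256(template.encode("utf-8")).hexdigest()[:8]
    return f"{prefix}_{digest}"


def render(*, template: str, destination_table: str) -> str:
    """Fill the placeholder to get the script that is actually submitted."""
    return template.format(destination_table=destination_table)


template = build_query_template(select_sql="SELECT * FROM enrollments")
table = destination_table_for(template)
sql = render(template=template, destination_table=table)
print(sql)
```

Hashing the template sidesteps the circularity described above: the name depends only on the query's content, and the content only gains the name afterwards. One caveat of this sketch is that str.format would misread any other literal braces in the SQL, so a real implementation would need to escape them or use a different placeholder scheme.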