Add support for time partitioning and expiry #47
…on for WAFRuleIDs (string) and EdgeResponseCompressionRatio (float)
added explicit schema
added logic to read schema from remote JSON file
update schema with additional fields
merge remote-schema
Updated cloud functions storage bucket variable
small typo fix
update README.md
According to the [docs](https://cloud.google.com/functions/docs/migrating/nodejs-runtimes) and warning emails: > The Node.js 8 runtime will be deprecated on 2020-06-05. I am currently connecting our Cloudflare account to GCP and decided to set the Node.js runtime to 10. I can confirm that everything is working.
upgrade to node 10
add gcp project id module
* Added Spectrum schema and bash $SCHEMA variable.
* type fix for Spectrum schema
* add $SCHEMA to automatic install
* fix README.md
* fix capitalization
* Update cloudshell.md

Co-authored-by: Frank Taylor <7483580+shagamemnon@users.noreply.github.com>
Hey @igorwwwwwwwwwwwwwwwwwwww, thanks for taking a stab at this. Creating ingestion-time partitioning for logs inserted via a load job has been non-trivial. The approach in your PR did not work when it was tried before (see https://github.com/cloudflare/cloudflare-gcp/blob/add-partition-v2/logpush-to-bigquery/index.js), so we will need a test before a merge can be considered. FWIW: to my knowledge, this can only be accomplished in BigQuery using "partition decorators", which are described here: https://cloud.google.com/bigquery/docs/creating-partitioned-tables#creating_an_ingestion-time_partitioned_table_when_loading_data
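For anyone following along, the partition-decorator idea mentioned above can be sketched roughly like this. It is only an illustration, not the code in this PR: `partitionDecorator` is a hypothetical helper, the `cloudflare_logs` table name is made up, and the sketch assumes daily partitions keyed on the UTC date. It only builds the `table$YYYYMMDD` identifier that a load job would use as its destination.

```javascript
// Hypothetical helper (not part of this repository): builds a BigQuery
// partition-decorator table ID of the form `table$YYYYMMDD`.
function partitionDecorator (tableId, date) {
  const pad = (n) => String(n).padStart(2, '0')
  const yyyymmdd = `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad(date.getUTCDate())}`
  return `${tableId}$${yyyymmdd}`
}

// Example: the destination for logs belonging to 5 June 2020 (UTC).
// `cloudflare_logs` is an assumed table name used for illustration only.
const target = partitionDecorator('cloudflare_logs', new Date(Date.UTC(2020, 5, 5)))
console.log(target) // prints "cloudflare_logs$20200605"
```

Whether the load job in `index.js` accepts such an ID unchanged is exactly what would need testing.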
The current behaviour is to load everything into a non-partitioned table. That means that queries will scan the entire table every time.
To make queries cheaper, we can use time partitioning, so that queries filtered by date only scan the matching partitions.
One nice side effect of this is that we also get the ability to configure partition expiration.
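As a rough sketch of what this means in terms of table configuration (not necessarily how the patch wires it up), the BigQuery table resource takes a `timePartitioning` object with a `type` and an optional `expirationMs`. The helper name `partitionedTableMetadata`, the 90-day value, and the two-field schema string below are assumptions for illustration only.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Hypothetical helper: builds table metadata with daily time partitioning
// and, optionally, a partition expiry given in days.
function partitionedTableMetadata (schema, expirationDays) {
  const timePartitioning = { type: 'DAY' }
  if (expirationDays !== undefined) {
    // The BigQuery REST API represents expirationMs as an int64 string.
    timePartitioning.expirationMs = String(expirationDays * MS_PER_DAY)
  }
  return { schema, timePartitioning }
}

// Example with an assumed 90-day expiry and a shortened, illustrative schema.
const metadata = partitionedTableMetadata(
  'WAFRuleIDs:STRING,EdgeResponseCompressionRatio:FLOAT',
  90
)
console.log(JSON.stringify(metadata.timePartitioning))
// prints {"type":"DAY","expirationMs":"7776000000"}
```

An object of this shape could then be passed as the table options when creating the table with the `@google-cloud/bigquery` client, assuming the table is created up front rather than implicitly by the load job.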
Note: I could use some help testing out this patch.