Monster Deploy

Infrastructure-as-code for deploying the Monster team's environments.

Technologies used

We use Terraform and Helm to manage our deployments.

Defining an environment

See the README.

Deploying an environment

See the README.

Setting up Terra Resources

We can automate nearly all of our infrastructure setup, but not the creation of resources in Terra (at least not yet). These actions require manual work:

Registering a Google service account with Terra
Creating a TDR resource profile
Creating a TDR dataset

NOTE: The instructions below use many APIs that are planned to change pretty drastically as part of Terra's new architecture. Your mileage may vary.

Registering Accounts

Terraform can create Google SAs, but it can't register them in the Terra system. We need to register the SAs that run our ingest pipelines in order to grant them read/write permissions to the dataset(s) they target.

To register an account:

Apply the Terraform module that creates the account; it should also write the account's secret key to Vault
Read the secret key from Vault into a JSON file on your local machine
Run the registration script, passing the path to the key-file and the name of the targeted Terra environment
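
Before running the registration script, it can help to confirm that the file read from Vault looks like a Google service account key. The following is a minimal sketch in Python, assuming the standard key layout with "type", "client_email" and "private_key" fields. The helper name is hypothetical and is not part of this repository.

```python
import json

# Fields that a Google service account key file is expected to contain.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def check_sa_key(raw_json: str) -> str:
    """Return the service account email if the key JSON looks well-formed.

    Hypothetical helper: it only checks the shape of the file. It does not
    contact Google, Vault, or Terra.
    """
    key = json.loads(raw_json)
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    return key["client_email"]
```

The returned email is the address you would later add to the Stewards group.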

After registering the account, you'll still need to grant it permissions. The easiest way to do that right now is to make it a TDR steward:

Go to the Terra UI for the targeted environment
- Dev: https://bvdp-saturn-dev.appspot.com/
- Prod: https://app.terra.bio/
Click the top-left hamburger menu, then the dropdown with your name, then "Groups"
Find the Stewards group in the list of your groups
- Dev: "JadeStewards-dev"
- Prod: "Stewards"
Add the SA to the group using its email address
Grant access to relevant datasets by calling the Jade addDatasetPolicyMember endpoint with policyName = steward for the SA, in either Dev or Prod (see the Data repo FAQ)
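
The last step can also be scripted instead of clicked through. The sketch below only builds the URL and body for such a call. The route shape and the "email" body field are assumptions about how addDatasetPolicyMember is exposed, so check them against the Swagger UI of your TDR instance. The function name and the example host are hypothetical.

```python
import json
from typing import Tuple
from urllib.parse import quote


def steward_request(base_url: str, dataset_id: str, sa_email: str) -> Tuple[str, str]:
    """Build (url, json_body) for adding a steward to a dataset.

    ASSUMPTION: the route and body below mirror the addDatasetPolicyMember
    operation; confirm both in the Swagger UI before relying on them.
    """
    url = (
        f"{base_url.rstrip('/')}/api/repository/v1/datasets/"
        f"{quote(dataset_id, safe='')}/policies/steward/members"
    )
    return url, json.dumps({"email": sa_email})
```

Sending the request is left out on purpose, since authentication differs between Dev and Prod.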

Creating Resource Profiles

Resource profiles connect Google Billing Accounts to the repository's machinery. You should only need to create a new profile when a project begins with a funding source that hasn't been used before.

Step 1 of setting up a profile is ensuring the TDR can access the targeted account. Grant the TDR's service account "Billing Account User" permissions on the account.

Dev: jade-k8-sa@broad-jade-dev.iam.gserviceaccount.com
Prod: terra-data-repository@broad-datarepo-terra-prod.iam.gserviceaccount.com

You need to be a Billing Account Administrator on the target account to make this change.

Step 2 is to get the ID of the Billing Account. If you're viewing the details page of the BA, the ID is in the URL:

https://console.cloud.google.com/billing/{id}
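
If you copy the URL from the browser, the ID can be pulled out programmatically. This is a small sketch, assuming the URL follows the pattern shown above. The function name and the sample ID in the usage note are made up for illustration.

```python
from urllib.parse import urlparse


def billing_account_id(console_url: str) -> str:
    """Return the path segment that follows /billing/ in a console URL."""
    segments = [s for s in urlparse(console_url).path.split("/") if s]
    if len(segments) < 2 or segments[0] != "billing":
        raise ValueError(f"not a billing account URL: {console_url}")
    return segments[1]
```

For example, billing_account_id("https://console.cloud.google.com/billing/0123AB-4567CD-89EF01") returns "0123AB-4567CD-89EF01". Query strings and trailing path segments are ignored.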

Step 3 is to link the Billing Account into the TDR. Visit the Swagger UI of the TDR instance. Under the "resources" section, expand the POST route. Click "Try it out" and make the following edits to the pre-populated JSON:

Replace the value of "biller" with the constant string "direct"
Replace the value of "billingAccountId" with the ID from step 2
Replace the value of "profileName" with some unique name for the profile object; it will be used in the name of the generated GCP project

NOTE: When the TDR creates a project, it applies a prefix to the profile name. Google imposes a character maximum on project names. This means that profile names are effectively length-limited, but the limit depends on other configuration in the TDR. In the current production deployment, the limit is 4 characters.
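
The three edits, plus the length check from the note, can be sketched as a small Python helper that produces the JSON to paste into the Swagger UI. This is an illustration only: the function name is hypothetical, the length limit is passed in because it depends on the deployment, and the pre-populated JSON may contain fields that are not covered here.

```python
import json


def profile_request(billing_account_id: str, profile_name: str, max_name_length: int) -> str:
    """Build the JSON body for creating a resource profile.

    max_name_length is deployment-specific (see the note above), so the
    caller has to supply it.
    """
    if not profile_name:
        raise ValueError("profile name must not be empty")
    if len(profile_name) > max_name_length:
        raise ValueError(
            f"profile name {profile_name!r} is longer than {max_name_length} characters"
        )
    body = {
        "biller": "direct",  # constant string, as described above
        "billingAccountId": billing_account_id,
        "profileName": profile_name,
    }
    return json.dumps(body, indent=2)
```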

Once you've filled out the JSON, you can submit the POST. If everything works out, you should get back the same payload with extra fields:

An "accessible" field with a value of true
An "id" field with a UUID

The UUID is needed for dataset creation.
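
If you save the response, a short check can confirm both fields before you move on to dataset creation. This is a sketch assuming the response is a JSON object with the "accessible" and "id" fields described above. The function name is hypothetical.

```python
import json
import uuid


def profile_id_from_response(response_json: str) -> str:
    """Return the profile UUID, or raise if the profile is not usable."""
    profile = json.loads(response_json)
    if profile.get("accessible") is not True:
        raise ValueError("the TDR reports that the billing account is not accessible")
    # uuid.UUID raises ValueError if the id is not a well-formed UUID.
    return str(uuid.UUID(profile["id"]))
```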

Creating Datasets

TDR Datasets are the main targets of our ingest pipelines. Most of the hard work that goes into dataset creation involves schema design & declaration. Our ingest-utils repository includes tooling & build plugins to assist with that piece of the puzzle.

Pre-work:

Create a resource profile for the dataset
Declare the schema for the dataset in the ingest project, using our plugins

From there, step 1 is to generate the Jade-compatible definition of the schema. From the root of the ingest project, run sbt generateJadeSchema. The output should include a line:

[info] Wrote Jade schema to <some-path>/schema.json
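
When the sbt output is captured to a file or a variable, the schema path can be extracted from that line. This is a minimal sketch in Python, assuming the line appears exactly in the form shown above, with no terminal color codes. The function name is hypothetical.

```python
import re
from typing import Optional

# Matches the log line shown above and captures everything after "to ".
SCHEMA_LINE = re.compile(r"^\[info\] Wrote Jade schema to (?P<path>.+)$")


def schema_path(sbt_output: str) -> Optional[str]:
    """Return the schema path from captured sbt output, or None if absent."""
    for line in sbt_output.splitlines():
        match = SCHEMA_LINE.match(line.strip())
        if match:
            return match.group("path")
    return None
```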

Step 2 is to declare the dataset. Visit the Swagger UI of the TDR instance. Under the "repository" section, look for the POST /api/repository/v1/datasets route. Expand it, click "Try it out", and make the following edits to the pre-populated JSON:

Delete the "additionalProfileIds" field
Replace the value of "defaultProfileId" with the UUID of the resource profile you want to use
Replace the value of "description" with whatever you'd like, or delete it
Replace the value of "name" with a BigQuery-compatible identifier (only lowercase alphanumeric characters and '_' allowed)
Replace the entire value of "schema" with the contents of the Jade schema generated by sbt in step 1
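
The edits above can be sketched as a Python helper that assembles the request body from the generated schema. Treat it as an illustration under stated assumptions: the function name is hypothetical, and it emits only the fields named in the list, so it may omit fields that the pre-populated JSON includes.

```python
import json
import re
from typing import Optional

# Only lowercase alphanumeric characters and '_' are allowed in the name.
BIGQUERY_NAME = re.compile(r"[a-z0-9_]+")


def dataset_request(
    name: str,
    profile_id: str,
    schema_json: str,
    description: Optional[str] = None,
) -> str:
    """Build the JSON body for the dataset creation POST."""
    if not BIGQUERY_NAME.fullmatch(name):
        raise ValueError(f"dataset name {name!r} is not BigQuery-compatible")
    body = {
        "name": name,
        "defaultProfileId": profile_id,
        # The whole generated schema replaces the placeholder value.
        "schema": json.loads(schema_json),
    }
    if description is not None:
        body["description"] = description
    # "additionalProfileIds" is deliberately left out.
    return json.dumps(body, indent=2)
```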

Once you've filled out the JSON, you can submit the POST. You'll get back a job ID.

Step 3 is to poll the job ID until it finishes. You can do so using the GET /api/repository/v1/jobs/{id} route in the Swagger UI. When the job exits the "running" state, you can get its final results using the GET /api/repository/v1/jobs/{id}/result endpoint. For succeeded jobs, this call will output the ID of the new dataset. For failed jobs, this call will show information about what went wrong.
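
The polling loop can be sketched in Python as follows. The helper takes a caller-supplied get_status function, because the exact shape of the job response is not described here. That function is assumed to call the jobs route and return the job's state as a string. All names in the sketch are hypothetical.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    interval_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job leaves the "running" state and return that state.

    get_status is an assumption: it should fetch the job and return its
    current state as a string.
    """
    for _ in range(max_polls):
        status = get_status()
        if status != "running":
            return status
        sleep(interval_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")
```

Once this returns, fetch the result route to get either the new dataset's ID or the failure details.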
