Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 41 additions & 0 deletions .github/workflows/deploy-infra.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,45 @@ jobs:
echo "TF_VAR_google_client_secret=${{ secrets.GOOGLE_CLIENT_SECRET }}" >> $GITHUB_ENV


- name: Get Runner IP
id: ip
run: |
ip=$(curl -s https://api.ipify.org)
echo "Runner IP: $ip"
echo "TF_VAR_client_ip_address=$ip" >> $GITHUB_ENV
echo "RUNNER_IP=$ip" >> $GITHUB_ENV

- name: Set Key Vault Firewall to Allow
run: |
# Dynamic Key Vault Name Lookup
KV_NAME_PREFIX="${{ env.ENVIRONMENT }}-alpinebot-vault-"
echo "Looking for Key Vault starting with: $KV_NAME_PREFIX"

# Find the Key Vault name that matches the pattern
KV_NAME=$(az keyvault list --resource-group "$RG_NAME" --query "[?starts_with(name, '$KV_NAME_PREFIX')].name | [0]" -o tsv)

if [ -z "$KV_NAME" ]; then
echo "Key Vault not found. It might not be created yet."
# Fallback or exit gracefully depending on logic.
# Here we assume it's a fresh deploy and we can skip setting firewall rules for now.
echo "Skipping firewall update."
exit 0
fi

echo "Found Key Vault: $KV_NAME"
RG_NAME="${{ env.ENVIRONMENT }}-alpinebot"
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable RG_NAME is used on line 72 before it's defined on line 83. This will cause the script to fail because the variable is referenced out of order.

Move line 83 (RG_NAME="${{ env.ENVIRONMENT }}-alpinebot") to before line 72 where it's first used, or better yet, move it to the beginning of the script after the IP retrieval.

Copilot uses AI. Check for mistakes.

echo "Listing Key Vaults in $RG_NAME for debugging..."
az keyvault list --resource-group "$RG_NAME" --query "[].name" -o tsv || echo "Failed to list KVs"

echo "Attempting to force Key Vault $KV_NAME firewall to Allow..."
# Try to update, ignore failure if KV doesn't exist (e.g. fresh deploy)
az keyvault update --name "$KV_NAME" --resource-group "$RG_NAME" --default-action Allow --public-network-access Enabled || echo "Key Vault update failed (it might not exist yet)."

echo "Waiting 60 seconds for propagation..."
sleep 60




Comment on lines +92 to 97
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded 60-second sleep is arbitrary and may not be sufficient for Key Vault firewall rule propagation in all cases. Azure service propagation times can vary.

Consider either:

  1. Increasing the wait time to a more conservative value (e.g., 120-180 seconds) for reliability
  2. Implementing a retry mechanism that polls the Key Vault until it's accessible
  3. Documenting why 60 seconds was chosen
Suggested change
echo "Waiting 60 seconds for propagation..."
sleep 60
echo "Waiting for Key Vault firewall rule propagation (up to 180 seconds)..."
for i in {1..18}; do
if az keyvault show --name "$KV_NAME" --resource-group "$RG_NAME" > /dev/null 2>&1; then
echo "Key Vault is accessible after $((i*10)) seconds."
break
else
echo "Key Vault not accessible yet, waiting 10 seconds (attempt $i/18)..."
sleep 10
fi
done

Copilot uses AI. Check for mistakes.
- name: Set Up Terraform
Expand Down Expand Up @@ -83,6 +122,8 @@ jobs:
TF_LOG: DEBUG
TF_LOG_PATH: terraform.log
ENVIRONMENT: ${{ env.ENVIRONMENT }}

Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] There's an extra blank line and inconsistent indentation. The environment variable TF_VAR_client_ip_address should be aligned with the other environment variables above, and there should only be one blank line between sections.

Suggested format:

          TF_LOG: DEBUG
          TF_LOG_PATH: terraform.log
          ENVIRONMENT: ${{ env.ENVIRONMENT }}
          TF_VAR_client_ip_address: ${{ env.RUNNER_IP }}
Suggested change

Copilot uses AI. Check for mistakes.
TF_VAR_client_ip_address: ${{ env.RUNNER_IP }}
run: |
echo "Using environment: ${{ env.ENVIRONMENT }}"
terraform apply -var="environment=${{ env.ENVIRONMENT }}" -auto-approve
Expand Down
60 changes: 41 additions & 19 deletions infra/main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -13,18 +13,35 @@ resource "azurerm_resource_group" "rg" {
tags = local.environment_vars.tags
}

#### Create Virtual Network and Subnet ######
module "virtual_network" {
source = "../modules/virtual_network"
vnet_name = local.environment_vars.vnet_name
az_location = local.environment_vars.az_location
az_rg_name = local.environment_vars.az_rg_name
vnet_address_space = local.environment_vars.vnet_address_space
subnet_name = local.environment_vars.subnet_name
subnet_prefix = local.environment_vars.subnet_prefix
tags = local.environment_vars.tags

depends_on = [azurerm_resource_group.rg]
}

#### Create the Azure Key Vault #####

# Retrieve the runner's public IP
data "http" "ip" {
url = "https://api.ipify.org"



resource "random_integer" "kv_suffix" {
min = 1000
max = 9999
}

module "key_vault" {
source = "../modules/key_vault"

az_rg_name = local.environment_vars.az_rg_name
az_kv_name = local.environment_vars.az_kv_name
az_kv_name = "${local.environment_vars.az_kv_name}-${random_integer.kv_suffix.result}"
az_location = local.environment_vars.az_location
tenant_id = var.az_tenant_id
enabled_for_disk_encryption = false
Expand All @@ -35,15 +52,16 @@ module "key_vault" {

tags = local.environment_vars.tags

key_vault_ip_rules = [data.http.ip.response_body]
key_vault_ip_rules = [
for ip in [var.client_ip_address, "83.76.0.0/14"] : ip if ip != null
Comment on lines +55 to +56
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded IP range "83.76.0.0/14" is added to the Key Vault firewall rules without explanation. This is a broad CIDR block covering 262,144 IP addresses (83.76.0.0 - 83.79.255.255).

Consider:

  1. Adding a comment explaining what this IP range represents (organization network, specific service, etc.)
  2. Moving this to a variable for better maintainability and environment-specific configuration
  3. Narrowing the range if possible to reduce the attack surface

Copilot uses AI. Check for mistakes.
]

key_vault_subnet_ids = [
module.virtual_network.subnet_id
]
}

# Wait for firewall rule propagation
resource "time_sleep" "wait_for_firewall" {
create_duration = "60s"

depends_on = [module.key_vault]
}

# Get the current service principal/client object ID
data "azurerm_client_config" "current" {}
Expand Down Expand Up @@ -79,20 +97,23 @@ resource "azurerm_key_vault_secret" "openai_key" {
depends_on = [
module.key_vault,
module.cognitive_account,
azurerm_role_assignment.key_vault_secrets_officer,
time_sleep.wait_for_firewall
azurerm_role_assignment.key_vault_secrets_officer
]
Comment on lines 97 to 101
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The time_sleep.wait_for_firewall resource has been removed from the depends_on list, but the synchronization concern it addressed (waiting for Key Vault firewall rules to propagate) still exists.

While the GitHub Actions workflow now handles pre-deployment firewall configuration (lines 65-94 in deploy-infra.yaml), there's no guarantee that the rules have fully propagated by the time Terraform tries to create this secret. This could lead to intermittent failures when trying to write secrets to the Key Vault.

Consider either:

  1. Adding a local-exec provisioner with a wait/retry mechanism
  2. Documenting that the workflow's 60-second wait is expected to cover this propagation delay
  3. Ensuring the Key Vault network ACLs remain set to "Allow" during deployment (though this is less secure)

Copilot uses AI. Check for mistakes.
}

#### Deploy AlpineBot OpenAI Account ######
module "cognitive_account" {
source = "../modules/cognitive_account"
alpinebotaiact_name = local.environment_vars.alpinebotaiact_name
az_location = local.environment_vars.az_location
az_rg_name = local.environment_vars.az_rg_name
kind = local.environment_vars.kind
sku_name_cog_acct = local.environment_vars.sku_name_cog_acct
tags = local.environment_vars.tags
source = "../modules/cognitive_account"
alpinebotaiact_name = "${local.environment_vars.alpinebotaiact_name}-${random_integer.kv_suffix.result}"
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The random_integer suffix is being reused for both the Key Vault name (line 44) and the Cognitive Account name (line 107). Since this is the same random value, it doesn't provide uniqueness between resources, only across deployments.

While this may be intentional for tracking related resources, consider using separate random values for each resource or documenting this design decision to clarify that these resources are intentionally linked by the same suffix.

Copilot uses AI. Check for mistakes.
az_location = local.environment_vars.az_location
az_rg_name = local.environment_vars.az_rg_name
kind = local.environment_vars.kind
sku_name_cog_acct = local.environment_vars.sku_name_cog_acct
tags = local.environment_vars.tags
model_deployment_name = local.environment_vars.alpinebotaidepl
model_name = local.environment_vars.model_name
model_version = local.environment_vars.model_version
deployment_sku_name = local.environment_vars.deployment_sku_name

depends_on = [azurerm_resource_group.rg]
}
Expand Down Expand Up @@ -200,6 +221,7 @@ module "function_app" {
az_rg_name = local.environment_vars.az_rg_name
service_plan_id = module.app_service_plan.service_plan_id
app_insights_connection_string = azurerm_application_insights.apbotinsights.connection_string
virtual_network_subnet_id = module.virtual_network.subnet_id

app_settings = {
"AZURE_OPENAI_API_KEY" = "@Microsoft.KeyVault(SecretUri=${azurerm_key_vault_secret.openai_key.id})"
Expand Down
10 changes: 10 additions & 0 deletions infra/outputs.tf
Original file line number Diff line number Diff line change
Expand Up @@ -12,3 +12,13 @@ output "function_app_default_hostname" {
description = "The default hostname of the Function App"
value = module.function_app.function_app_default_hostname
}

output "key_vault_name" {
description = "The name of the Key Vault"
value = module.key_vault.key_vault_name
}

output "openai_account_name" {
description = "The name of the OpenAI Account"
value = module.cognitive_account.openai_account_name
}
12 changes: 4 additions & 8 deletions infra/providers.tf
Original file line number Diff line number Diff line change
Expand Up @@ -19,14 +19,6 @@ terraform {
source = "cyrilgdn/postgresql"
version = "1.17.0"
}
http = {
source = "hashicorp/http"
version = "~> 3.4.0"
}
time = {
source = "hashicorp/time"
version = "~> 0.9.0"
}
}

backend "azurerm" {
Expand All @@ -44,6 +36,10 @@ provider "azurerm" {
purge_soft_delete_on_destroy = true
recover_soft_deleted_key_vaults = false
}
cognitive_account {
purge_soft_delete_on_destroy = true
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cognitive account purge configuration enables automatic purging of soft-deleted resources on destroy. While this may be useful for development/testing, it prevents recovery of accidentally deleted cognitive services accounts and could lead to permanent data loss.

Consider making this configurable per environment (e.g., enabled only for dev/qa, disabled for production) to prevent irreversible deletions in production environments.

Suggested change
purge_soft_delete_on_destroy = true
# Make purge_soft_delete_on_destroy configurable per environment to prevent accidental permanent deletion in production.
purge_soft_delete_on_destroy = var.cognitive_account_purge_soft_delete_on_destroy

Copilot uses AI. Check for mistakes.
}

}
}

Expand Down
36 changes: 36 additions & 0 deletions infra/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ variable "environments" {
rbac_enabled = bool
kind = string
sku_name_cog_acct = string
deployment_sku_name = string
auth_enabled = bool
redis_cache_name = string
postgresql_server_name = string
Expand All @@ -34,6 +35,12 @@ variable "environments" {
function_app_name = string
function_storage_account_name = string
azure_openai_api_version = string
model_name = string
model_version = string
vnet_name = string
vnet_address_space = list(string)
subnet_name = string
subnet_prefix = list(string)
}))
default = {
"dev" = {
Expand All @@ -60,6 +67,7 @@ variable "environments" {
rbac_enabled = true
kind = "OpenAI"
sku_name_cog_acct = "S0"
deployment_sku_name = "GlobalStandard"
auth_enabled = true
redis_cache_name = "dev-alpinebot-redis"
postgresql_server_name = "dev-alpinebot-psql"
Expand All @@ -69,6 +77,12 @@ variable "environments" {
function_app_name = "dev-alpinebot-func"
function_storage_account_name = "devalpinebotfuncsa"
azure_openai_api_version = "2024-02-15-preview"
model_name = "gpt-4o"
model_version = "2024-05-13"
vnet_name = "dev-alpinebot-vnet"
vnet_address_space = ["10.0.0.0/16"]
subnet_name = "dev-alpinebot-subnet"
subnet_prefix = ["10.0.1.0/24"]
},
"qa" = {
tags = {
Expand All @@ -94,6 +108,7 @@ variable "environments" {
rbac_enabled = true
kind = "OpenAI"
sku_name_cog_acct = "S0"
deployment_sku_name = "GlobalStandard"
auth_enabled = false
redis_cache_name = "qa-alpinebot-redis"
postgresql_server_name = "qa-alpinebot-psql"
Expand All @@ -103,6 +118,12 @@ variable "environments" {
function_app_name = "qa-alpinebot-func"
function_storage_account_name = "qaalpinebotfuncsa"
azure_openai_api_version = "2024-08-01-preview"
model_name = "gpt-4o"
model_version = "2024-05-13"
vnet_name = "qa-alpinebot-vnet"
vnet_address_space = ["10.1.0.0/16"]
subnet_name = "qa-alpinebot-subnet"
subnet_prefix = ["10.1.1.0/24"]
},
"main" = {
tags = {
Expand All @@ -128,6 +149,7 @@ variable "environments" {
rbac_enabled = true
kind = "OpenAI"
sku_name_cog_acct = "S0"
deployment_sku_name = "GlobalStandard"
auth_enabled = false
redis_cache_name = "main-alpinebot-redis"
postgresql_server_name = "main-alpinebot-psql"
Expand All @@ -137,6 +159,12 @@ variable "environments" {
function_app_name = "main-alpinebot-func"
function_storage_account_name = "mainalpinebotfuncsa"
azure_openai_api_version = "2024-08-01-preview"
model_name = "gpt-4o"
model_version = "2024-05-13"
vnet_name = "main-alpinebot-vnet"
vnet_address_space = ["10.2.0.0/16"]
subnet_name = "main-alpinebot-subnet"
subnet_prefix = ["10.2.1.0/24"]
}
}
}
Expand Down Expand Up @@ -188,3 +216,11 @@ variable "google_client_secret" {






variable "client_ip_address" {
description = "The IP address of the client (e.g., GitHub Actions runner) to allow access to Key Vault."
type = string
default = null
}
15 changes: 15 additions & 0 deletions modules/cognitive_account/main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,18 @@ resource "azurerm_cognitive_account" "alpinebot_openai" {

tags = var.tags
}

resource "azurerm_cognitive_deployment" "openai_deployment" {
name = var.model_deployment_name
cognitive_account_id = azurerm_cognitive_account.alpinebot_openai.id
model {
format = "OpenAI"
name = var.model_name
version = var.model_version
}

sku {
name = var.deployment_sku_name
capacity = 10
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The capacity is hardcoded to 10 for the cognitive deployment SKU. This value represents the throughput capacity in thousands of tokens per minute (TPM) and may not be appropriate for all environments.

Consider making this configurable through a variable (e.g., deployment_capacity) so it can be adjusted per environment (dev might need less capacity than production) to optimize costs and performance.

Copilot uses AI. Check for mistakes.
}
}
4 changes: 4 additions & 0 deletions modules/cognitive_account/outputs.tf
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,7 @@ output "openai_key" {
output "cognitive_account_endpoint" {
value = azurerm_cognitive_account.alpinebot_openai.endpoint
}

output "openai_account_name" {
value = azurerm_cognitive_account.alpinebot_openai.name
}
21 changes: 21 additions & 0 deletions modules/cognitive_account/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -27,4 +27,25 @@ variable "sku_name_cog_acct" {
variable "tags" {
description = "value of tags"
type = map(string)
}

variable "model_deployment_name" {
description = "Name of the OpenAI model deployment"
type = string
}

variable "model_name" {
description = "Name of the OpenAI model (e.g., gpt-4)"
type = string
}

variable "model_version" {
description = "Version of the OpenAI model"
type = string
}

variable "deployment_sku_name" {
description = "SKU name for the OpenAI deployment (e.g., Standard, GlobalStandard)"
type = string
default = "GlobalStandard"
}
1 change: 1 addition & 0 deletions modules/function_app/main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ resource "azurerm_linux_function_app" "function_app" {
service_plan_id = var.service_plan_id
storage_account_name = azurerm_storage_account.function_storage.name
storage_account_access_key = azurerm_storage_account.function_storage.primary_access_key
virtual_network_subnet_id = var.virtual_network_subnet_id

identity {
type = "SystemAssigned"
Expand Down
6 changes: 6 additions & 0 deletions modules/function_app/variables.tf
Original file line number Diff line number Diff line change
Expand Up @@ -49,3 +49,9 @@ variable "tags" {
description = "Tags to apply to Function App resources"
type = map(string)
}

variable "virtual_network_subnet_id" {
description = "ID of the subnet to integrate with the Function App"
type = string
default = null
}
2 changes: 1 addition & 1 deletion modules/key_vault/main.tf
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ resource "azurerm_key_vault" "alpinebot_kv" {
tags = var.tags

network_acls {
default_action = "Deny"
default_action = "Allow"
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Key Vault network ACL default_action has been changed from "Deny" to "Allow", which significantly weakens the security posture. This allows unrestricted network access to the Key Vault by default, even though subnet IDs and IP rules are specified.

Consider reverting this to "Deny" to maintain defense-in-depth security. If temporary "Allow" access is needed during deployment, handle it through the GitHub Actions workflow (as shown in lines 65-94 of deploy-infra.yaml) rather than permanently allowing all network access.

Suggested change
default_action = "Allow"
default_action = "Deny"

Copilot uses AI. Check for mistakes.
bypass = "AzureServices"
ip_rules = var.key_vault_ip_rules
virtual_network_subnet_ids = var.key_vault_subnet_ids
Expand Down
4 changes: 4 additions & 0 deletions modules/key_vault/outputs.tf
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,7 @@ output "key_vault_id" {
#output "openai_key_id" {
# value = azurerm_key_vault_secret.openai_key.id
#}

output "key_vault_name" {
value = azurerm_key_vault.alpinebot_kv.name
}
Loading