feat(BRE2-661): Add Support for San Francisco Compute #87

Merged
drewmalin merged 17 commits into main from dm/BRE2-661
Feb 6, 2026

@drewmalin
Contributor

No description provided.

@drewmalin drewmalin changed the title Add Support for San Francisco Compute (#78) feat(BRE2-661): Add Support for San Francisco Compute Feb 5, 2026
brianlechthaler and others added 16 commits February 6, 2026 11:36
* add scaffolding, and client authentication.

* return APITypeGlobal from GetAPIType function, as SFC accounts are not tied to specific regions.

* fix apiKey in SFCCredential struct

* scaffolding for instance.go

* add instance creation implementation with SSH key support

* add function to map the status of a node reported from SFC API to v1.LifecycleStatus in Brev

* implement GetInstance in sfcompute with node data retrieval including SSH Hostname / Public IP

* add TerminateInstance implementation with node release and delete logic

* set default SSH port to 2222, as is standard for our platform

* implement GetSSHHostname for retrieving the SSH hostname of an instance in sfcompute

* remove unneeded call to api for GetSSHHostname

* use VM ID instead of instance ID to retrieve SSH Hostname

* remove get ssh hostname function

* implement ListInstances

* add validation test for sfcompute with API key check and skip logic

* add getInstanceTypeID method for generating instance type IDs in sfcompute

* bump sfcnodes version to v0.1.0-alpha.4 which adds support for the /v0/zones endpoint

* implement GetLocations

* only return approved zones

* only return regions with capacity greater than zero (rather than not equal to zero), in case availability is ever reported as a negative number

* update location description to include formatted hardware type information. example: `sfc_hayesvalley_h100`

* return unavailable regions with v1.Location{Available: false}

* fix an error where a nil map was returned

* start implementing GetInstanceTypes

* fix tests failing due to ValidateRegionalInstanceTypes and ValidateStableInstanceTypeIDs errors

* set MaxPricePerNodePerHour to 1600 cents ($16/node/h, $2/gpu/h)

* add regions excelsior and yerba

* remove region excelsior
@drewmalin drewmalin marked this pull request as ready for review February 6, 2026 19:46
@drewmalin drewmalin requested a review from a team as a code owner February 6, 2026 19:46
t.Fatalf("failed to make client: %v", err)
}

instance, err := client.GetInstance(context.Background(), "6c7a3ade-1e59-4e04-af6e-365046995a81_test")
Contributor

what's with the hardcoded id?

Contributor Author

Ah these are basically test tasks, so this was left over from testing.
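
One way to avoid a leftover hardcoded ID in this kind of manual test is to read it from the environment and skip when it is absent. The sketch below is only an illustration: the environment variable name `SFCOMPUTE_TEST_INSTANCE_ID` and the helper `testInstanceID` are hypothetical and do not come from this PR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// testInstanceIDEnv is a hypothetical variable name, not one defined in the PR.
const testInstanceIDEnv = "SFCOMPUTE_TEST_INSTANCE_ID"

// testInstanceID returns the trimmed instance ID from the given lookup
// function and reports whether a non-empty value was found.
// Passing the lookup in (instead of calling os.LookupEnv directly)
// keeps the helper easy to exercise without touching the environment.
func testInstanceID(lookup func(string) (string, bool)) (string, bool) {
	id, ok := lookup(testInstanceIDEnv)
	id = strings.TrimSpace(id)
	return id, ok && id != ""
}

func main() {
	id, ok := testInstanceID(os.LookupEnv)
	if !ok {
		// In a real test this is where t.Skip would be called.
		fmt.Printf("%s not set; a test would skip here\n", testInstanceIDEnv)
		return
	}
	fmt.Println("would call client.GetInstance with ID", id)
}
```

Inside a `testing.T` test, the `!ok` branch would call `t.Skip` rather than printing.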

)

const (
maxPricePerNodeHour = 1600
Contributor

is this 1600 dollars?

Contributor Author

Cents
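
Since the unit was not obvious from the name, one option is to carry it in the identifier. The following is a minimal sketch, not the merged code: the `Cents` suffix, the `centsToDollars` helper, and the `gpusPerNode` value of 8 (inferred from the "$16/node/h, $2/gpu/h" commit message) are all assumptions for illustration.

```go
package main

import "fmt"

// maxPricePerNodeHourCents is the price ceiling in US cents.
// The Cents suffix is a hypothetical rename, not the name used in the PR.
const maxPricePerNodeHourCents = 1600

// gpusPerNode is an assumption inferred from the commit message
// ("$16/node/h, $2/gpu/h").
const gpusPerNode = 8

// centsToDollars formats a non-negative integer cent amount as dollars.
func centsToDollars(cents int) string {
	return fmt.Sprintf("$%d.%02d", cents/100, cents%100)
}

func main() {
	fmt.Println("per node:", centsToDollars(maxPricePerNodeHourCents))
	fmt.Println("per GPU: ", centsToDollars(maxPricePerNodeHourCents/gpusPerNode))
}
```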

}

func sshKeyCloudInit(sshKey string) string {
script := fmt.Sprintf("#cloud-config\nssh_authorized_keys:\n - %s", sshKey)
Contributor

do we want to double-check for ufw, or does sfcompute have external firewalls?

Contributor Author

Yeah, SFC essentially only has port 2222 open at the infrastructure level.
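
Given that answer, the cloud-init payload only needs to install the SSH key and does not touch ufw. Below is a hedged sketch of a slightly more defensive variant of the builder. The function name `buildSSHKeyCloudInit`, the validation rules, and the quoting are assumptions for illustration and are not what the PR merged.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// buildSSHKeyCloudInit is a hypothetical variant of sshKeyCloudInit.
// It trims the key, rejects empty or multi-line input (which could
// otherwise inject extra YAML), and quotes the key with Go's %q, which
// for typical single-line public keys yields a valid double-quoted
// YAML scalar.
func buildSSHKeyCloudInit(sshKey string) (string, error) {
	key := strings.TrimSpace(sshKey)
	if key == "" {
		return "", errors.New("ssh key is empty")
	}
	if strings.ContainsAny(key, "\r\n") {
		return "", errors.New("ssh key must be a single line")
	}
	return fmt.Sprintf("#cloud-config\nssh_authorized_keys:\n  - %q\n", key), nil
}

func main() {
	// Placeholder key material, for illustration only.
	script, err := buildSSHKeyCloudInit("ssh-ed25519 AAAA user@host")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(script)
}
```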


func sfcStatusToLifecycleStatus(status string) v1.LifecycleStatus {
switch strings.ToLower(status) {
case "pending", "nodefailure", "unspecified", "awaitingcapacity", "unknown", "failed":
Contributor

failed/nodefailure will sit in pending and then our 30 minute timeout will hit and turn these to failed, right?

Contributor Author

Ah, good catch. I will adjust these to map to LifecycleStatusFailed.
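
The adjustment described in this reply could look roughly like the sketch below. It is self-contained, so `LifecycleStatus` and its constants are local stand-ins for the `v1` types, the function name `mapSFCStatus` is hypothetical, and the `default` branch is an assumption because the other SFC statuses are not visible in this excerpt.

```go
package main

import (
	"fmt"
	"strings"
)

// LifecycleStatus and the constants below are local stand-ins for the
// v1 package types referenced in the PR; the string values are assumptions.
type LifecycleStatus string

const (
	LifecycleStatusPending LifecycleStatus = "pending"
	LifecycleStatusFailed  LifecycleStatus = "failed"
)

// mapSFCStatus maps terminal failure states straight to failed so they
// do not sit in pending until a timeout fires.
func mapSFCStatus(status string) LifecycleStatus {
	switch strings.ToLower(status) {
	case "failed", "nodefailure":
		return LifecycleStatusFailed
	case "pending", "unspecified", "awaitingcapacity", "unknown":
		return LifecycleStatusPending
	default:
		// Assumption: statuses not shown in the excerpt fall back to
		// pending here purely as a placeholder.
		return LifecycleStatusPending
	}
}

func main() {
	for _, s := range []string{"NodeFailure", "failed", "AwaitingCapacity"} {
		fmt.Printf("%s -> %s\n", s, mapSFCStatus(s))
	}
}
```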

diskTypeSSD = "ssd"
)

var allowedZones = []string{"hayesvalley", "yerba"}
Contributor

do the zones have an "enabled" or "available" so we don't have to hardcode them?

Contributor Author

They have an "enabled" but we also want to explicitly exclude zones (per SFC's suggestion out of band).
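
Combining the explicit allowlist with the capacity rule from the commit list (available only when capacity is greater than zero, unavailable zones still returned with `Available: false`) might look like the sketch below. The `zone` and `location` types, their field names, and `locationsFromZones` are hypothetical stand-ins; the real SFC zone response and `v1.Location` have their own shapes.

```go
package main

import (
	"fmt"
	"slices"
)

// allowedZones mirrors the hardcoded allowlist shown in the diff.
var allowedZones = []string{"hayesvalley", "yerba"}

// zone is a hypothetical stand-in for an SFC zone record.
type zone struct {
	Name              string
	AvailableCapacity int
}

// location is a hypothetical stand-in for v1.Location.
type location struct {
	Name      string
	Available bool
}

// locationsFromZones keeps only allowlisted zones and marks a zone as
// available when its capacity is strictly greater than zero, so a
// negative capacity value is treated as unavailable.
func locationsFromZones(zones []zone) []location {
	locations := make([]location, 0, len(zones))
	for _, z := range zones {
		if !slices.Contains(allowedZones, z.Name) {
			continue // explicitly excluded, regardless of capacity
		}
		locations = append(locations, location{
			Name:      z.Name,
			Available: z.AvailableCapacity > 0,
		})
	}
	return locations
}

func main() {
	zones := []zone{
		{Name: "hayesvalley", AvailableCapacity: 4},
		{Name: "yerba", AvailableCapacity: -1},
		{Name: "excelsior", AvailableCapacity: 10},
	}
	for _, l := range locationsFromZones(zones) {
		fmt.Printf("%s available=%t\n", l.Name, l.Available)
	}
}
```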

patelspratik previously approved these changes Feb 6, 2026
@drewmalin drewmalin merged commit 67b1186 into main Feb 6, 2026
8 of 10 checks passed
@drewmalin drewmalin deleted the dm/BRE2-661 branch February 6, 2026 21:17
3 participants