hyperfleet/components/adapter/framework/adapter-deletion-flow-design.md
@@ -45,7 +45,7 @@ Last Updated: 2026-04-12

### Out of Scope

- **Force deletion** — behavior and approach (e.g., immediate hard delete, graceful escalation, manual intervention) requires a separate spike; depends on peer team requirements
- **Force deletion** — covered by [Force Deletion Design](../../../docs/force-deletion-design.md)
- **API hard-deletion mechanism** — the actor and pattern for hard-deleting DB records (inline, background job, customer-triggered, retention window) is covered by [HYPERFLEET-904](https://redhat.atlassian.net/browse/HYPERFLEET-904) — see [Hard-Delete Design](../../api-service/hard-delete-design.md)
- Cleanup job support (run a job before deleting resources) — future enhancement
- Per-resource deletion retry — Sentinel reconciliation loop re-triggers the full adapter; fine-grained per-resource retry is a future enhancement
@@ -633,7 +633,7 @@ Detailed metric naming, labels, alert thresholds, and SLO/SLI definitions will b
| 2 | **Stale Applied=False before `deleted_time`** | **API gates on aggregate `Reconciled` (computed from adapter `Finalized`), never on individual adapter `Applied`** | **Critical** |
| 3 | Creation in-flight when deletion starts | Discovery handles naturally on next event | Low |
| 4 | Which adapters must confirm? | Only adapters with existing status entries | Medium |
| 5 | Stuck in Finalizing (`deleted_time` set but can't hard-delete) | Configurable timeout + alerting (force deletion covered separately) | Medium |
| 5 | Stuck in Finalizing (`deleted_time` set but can't hard-delete) | Configurable timeout + alerting (see [Force Deletion Design](../../../docs/force-deletion-design.md)) | Medium |
| 6 | Concurrent deletion events | Idempotent operations, K8s handles safely | Low |
| 7 | Independent subresource deletion | Same pattern, check subresource `deleted_time` | Low |
| 8 | Cancel deletion | Not cancellable in 1.0.0 | Low |
@@ -713,7 +713,7 @@ No grace period is needed — deletion `Reconciled` remains `False` until all ad

#### Stuck in Finalizing (#5)

If a resource stays in Finalizing (`deleted_time` set) beyond a configurable timeout (e.g., 30 minutes): log which adapters haven't confirmed and expose stuck state via API. Force deletion behavior is out of scope for this design and requires a separate spike. The approach (e.g., immediate hard delete, graceful escalation, or manual intervention) depends on peer team requirements and has not been decided yet.
See [Force Deletion Design](../../../docs/force-deletion-design.md).

#### Concurrent Events During Deletion (#6)

2 changes: 1 addition & 1 deletion hyperfleet/components/api-service/hard-delete-design.md
@@ -39,7 +39,7 @@ Last Updated: 2026-04-23
### Out of Scope

- **Pending deletion** (setting `deleted_time`, cascading to subresources) — covered by [Adapter Deletion Flow Design](../adapter/framework/adapter-deletion-flow-design.md)
- **Force deletion** — requires a separate spike; depends on peer team requirements
- **Force deletion** — covered by [Force Deletion Design](../../docs/force-deletion-design.md)
- **Retention window + CronJob** — viable but deferred; see [ADR 0012 Alternatives](../../adrs/0012-hard-delete-mechanism-after-adapter-reconciliation.md#alternatives-considered)

---
180 changes: 180 additions & 0 deletions hyperfleet/docs/force-deletion-design.md
@@ -0,0 +1,180 @@
---
Status: Draft
Owner: HyperFleet Team
Last Updated: 2026-04-30
---

# Force Deletion Design

**Jira**: [HYPERFLEET-895](https://redhat.atlassian.net/browse/HYPERFLEET-895)

**Related**:
- [Adapter Deletion Flow Design](../components/adapter/framework/adapter-deletion-flow-design.md)
- [Hard Delete Design](../components/api-service/hard-delete-design.md)

---

## Table of Contents

- [Problem Statement](#problem-statement)
- [API Contract for Force Delete](#api-contract-for-force-delete)
- [Database Impact](#database-impact)
- [Cascade Semantics for Resource and Subresource Deletion](#cascade-semantics-for-resource-and-subresource-deletion)
- [Interaction with Normal Delete Flow](#interaction-with-normal-delete-flow)
- [Timeout and Stuck Detection Ownership](#timeout-and-stuck-detection-ownership)
- [Audit Logging Approach](#audit-logging-approach)
- [Trade-offs](#trade-offs)
- [Alternatives Considered](#alternatives-considered)

---

## Problem Statement

The existing deletion flow ([Adapter Deletion Flow Design](../components/adapter/framework/adapter-deletion-flow-design.md)) relies on all adapters confirming cleanup (`Finalized=True`) before the API hard-deletes records. If an adapter is stuck, unreachable, or permanently unable to clean up its resources, the resource remains in `Finalizing` state indefinitely with no recovery path.

Force deletion must provide an escape hatch that allows operators to hard-delete resource and subresource records from the database when the normal deletion flow is blocked.

---

## API Contract for Force Delete

```mermaid
sequenceDiagram
participant Admin
participant API
participant DB
participant Adapter

Admin->>API: POST /admin/clusters/{id}/force-delete
API->>API: Validate resource in Finalizing
API->>API: Log audit entry
API->>DB: Hard-delete records (single transaction)

alt Delete succeeds
API-->>Admin: 204 No Content
Adapter->>API: GET /clusters/{id}
API-->>Adapter: 404 Not Found
else Delete fails
API->>API: Log failure
API-->>Admin: 500 Internal Server Error
end
```

Force delete is a synchronous API action. The API immediately hard-deletes records from the database, bypassing the `Reconciled=True` gate.

`POST /admin/clusters/{id}/force-delete`
`POST /admin/clusters/{cluster_id}/nodepools/{nodepool_id}/force-delete`

Force delete is a privileged operation exposed under the `/admin/` path prefix, a new pattern outside the standard `/api/hyperfleet/{version}/` convention. The `/admin/` prefix is used because force delete is an operational escape hatch, not a versioned resource API. The HyperFleet API is not directly customer-facing. It sits behind partner APIs (GCP, AWS/ROSA) that provide native authorization. Partners do not expose admin endpoints in their public API.

The resource must already be in `Finalizing` state (`deleted_time` set), meaning a normal `DELETE` was issued first. Force delete is an escalation for stuck deletions, not a replacement for normal delete. The API rejects force delete on resources that are not in `Finalizing`.

The request body requires a `reason` field explaining why the admin is force-deleting. See [Audit Logging Approach](#audit-logging-approach).

Response codes:

- `204 No Content`: force delete succeeded, records removed
- `400 Bad Request`: missing or empty `reason` in request body
- `404 Not Found`: resource does not exist (already deleted or invalid ID)
- `409 Conflict`: resource is not in `Finalizing` state
- `500 Internal Server Error`: delete failed due to unexpected server error

All error responses follow the [HyperFleet Error Model](../standards/error-model.md) (RFC 9457 Problem Details format).
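
A minimal sketch, in Go, of how a handler could enforce this contract, assuming Go 1.22+ `net/http` pattern routing. `ForceDeleteStore`, `writeProblem`, and the error sentinels are hypothetical names for illustration, not the actual HyperFleet implementation:

```go
// Hypothetical sketch of the force-delete handler contract.
// Registration (illustrative):
//   mux.Handle("POST /admin/clusters/{id}/force-delete", ForceDeleteHandler(store))
package admin

import (
	"encoding/json"
	"errors"
	"net/http"
)

type forceDeleteRequest struct {
	Reason string `json:"reason"` // required; recorded in the audit log
}

var (
	ErrNotFound      = errors.New("resource not found")
	ErrNotFinalizing = errors.New("resource not in Finalizing state")
)

// ForceDeleteStore hard-deletes the resource and its subresources in one
// transaction, bypassing the Reconciled=True gate (assumed interface).
type ForceDeleteStore interface {
	ForceDelete(id, reason string) error
}

func ForceDeleteHandler(store ForceDeleteStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var req forceDeleteRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Reason == "" {
			writeProblem(w, http.StatusBadRequest, "missing or empty reason") // 400
			return
		}
		switch err := store.ForceDelete(r.PathValue("id"), req.Reason); {
		case errors.Is(err, ErrNotFound):
			writeProblem(w, http.StatusNotFound, "resource does not exist") // 404
		case errors.Is(err, ErrNotFinalizing):
			writeProblem(w, http.StatusConflict, "resource is not in Finalizing state") // 409
		case err != nil:
			writeProblem(w, http.StatusInternalServerError, "force delete failed") // 500
		default:
			w.WriteHeader(http.StatusNoContent) // 204: records removed
		}
	}
}

// writeProblem emits a simplified RFC 9457 problem-details response.
func writeProblem(w http.ResponseWriter, status int, detail string) {
	w.Header().Set("Content-Type", "application/problem+json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(map[string]any{"status": status, "detail": detail})
}
```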

---

## Database Impact

No new columns or tables. Force delete removes the same records as normal hard-delete (adapter statuses, subresources, then the resource) in a single transaction without waiting for `Reconciled=True`. Despite bypassing the Reconciled gate, the API code enforces the same bottom-up deletion ordering within the transaction. `ON DELETE RESTRICT` on foreign keys acts as a safety net, not the primary enforcement mechanism (see [Hard Delete Design](../components/api-service/hard-delete-design.md)).
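
A sketch of what that single transaction could look like, assuming PostgreSQL and illustrative table names (`adapter_statuses`, `nodepools`, `clusters`); the real schema may differ:

```go
package store

import "database/sql"

// forceDeleteCluster removes all records for a cluster in one transaction,
// enforcing bottom-up ordering in code. ON DELETE RESTRICT on the foreign
// keys is only the safety net, not the primary enforcement mechanism.
func forceDeleteCluster(db *sql.DB, clusterID string) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once Commit succeeds

	steps := []string{
		`DELETE FROM adapter_statuses WHERE cluster_id = $1`, // leaves first
		`DELETE FROM nodepools WHERE cluster_id = $1`,        // subresources next
		`DELETE FROM clusters WHERE id = $1`,                 // resource last
	}
	for _, q := range steps {
		if _, err := tx.Exec(q, clusterID); err != nil {
			return err // rollback: nothing is partially deleted
		}
	}
	return tx.Commit()
}
```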

---

## Cascade Semantics for Resource and Subresource Deletion

Force-deleting a resource also removes all its subresources in the same transaction, using bottom-up ordering (e.g., force-deleting a Cluster removes all NodePool records before the Cluster itself).

Force delete also works on individual subresources. For example, a single stuck NodePool can be force-deleted without affecting the Cluster or other NodePools.

---

## Interaction with Normal Delete Flow

Force delete requires no changes to Sentinel's polling or event publishing. Once records are removed from the DB, Sentinel has nothing to poll. Sentinel does, however, gain new responsibilities for stuck detection (see [Timeout and Stuck Detection Ownership](#timeout-and-stuck-detection-ownership)).

Adapters may receive events for resources that have been force-deleted. When an adapter tries to GET the resource as a precondition or POST its status back to the API, the API returns 404. Adapters must handle this gracefully (log and move on, do not retry).
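
A sketch of that expected adapter behavior; the `APIClient` interface and method names are assumptions for illustration, not the adapter framework's actual API:

```go
package adapter

import (
	"context"
	"log/slog"
	"net/http"
)

// APIClient is a hypothetical thin wrapper over the HyperFleet API.
type APIClient interface {
	Get(ctx context.Context, path string) (*http.Response, error)
}

type Adapter struct {
	apiClient APIClient
	logger    *slog.Logger
}

func (a *Adapter) reconcileDeletion(ctx context.Context, resourceID string) error {
	resp, err := a.apiClient.Get(ctx, "/clusters/"+resourceID)
	if err != nil {
		return err // transient error: let the framework retry
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotFound {
		// Resource was hard-deleted (possibly force-deleted): log and move on.
		a.logger.Info("resource gone, skipping cleanup", "resource_id", resourceID)
		return nil // do not retry
	}
	// ... normal cleanup path, then POST Finalized=True status back to the API ...
	return nil
}
```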

---

## Timeout and Stuck Detection Ownership

Force delete is always manually triggered by an admin. There is no automatic escalation from normal delete to force delete.

Sentinel owns stuck detection. It exposes a gauge metric aggregated by `resource_type` (cluster, nodepool) to keep label cardinality low:

- `hyperfleet_sentinel_finalizing_resources` (gauge): count of resources currently in `Finalizing` state

A Prometheus alert rule on this gauge (e.g., `finalizing_resources > 0 for 30m`) detects stuck deletions. To identify specific stuck resources, operators query the API via TSL filtering on `deleted_time`.
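
A sketch of the gauge using `prometheus/client_golang`; registration details and the counting function are illustrative:

```go
package sentinel

import "github.com/prometheus/client_golang/prometheus"

var finalizingResources = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "hyperfleet_sentinel_finalizing_resources",
		Help: "Resources currently in Finalizing state (deleted_time set, not yet hard-deleted).",
	},
	[]string{"resource_type"}, // cluster, nodepool: low cardinality
)

func init() {
	prometheus.MustRegister(finalizingResources)
}

// updateFinalizingGauge is called on each polling pass with per-type counts
// from the API, e.g. {"cluster": 2, "nodepool": 0}. An alert on
// hyperfleet_sentinel_finalizing_resources > 0 sustained for 30m then
// flags stuck deletions.
func updateFinalizingGauge(counts map[string]int) {
	for resourceType, n := range counts {
		finalizingResources.WithLabelValues(resourceType).Set(float64(n))
	}
}
```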

---

## Audit Logging Approach

The API logs a structured log entry before hard-deleting records, following the [Logging Specification](../standards/logging-specification.md). The log entry includes the caller identity, resource ID, resource type, timestamp, subresources being removed, and adapter statuses at time of force delete. If the delete fails, the API logs the failure with the error.
> **Contributor:** Is there anything else we can add here? I know we are following the logging spec, but is that enough? During an incident or a PMR, can we answer exactly when and why a resource was nuked?

> **@pnguyen44 (Author), Apr 30, 2026:** I added a required `reason` field on the request body that gets included in the audit log. Between adapter statuses (why it was stuck) and the reason (why the admin intervened), we can reconstruct the full picture during a PMR.

> **Contributor:** This seems reasonable. I think it is worth capturing the trade-offs, since we do not control the log retention period; if there is an incident and the logs are no longer around, we won't be able to answer the when/why. So a trade-off note sounds good: we rely on the audit log for now, and if it proves insufficient we can extend to an audit table, which we can control. WDYT?

The force-delete endpoint requires a `reason` in the request body. The reason is included in the audit log entry, recording why the admin chose to force-delete.
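
A minimal sketch of what that entry could look like using Go's `log/slog`; the field names here are assumptions for illustration, with the authoritative field set defined by the Logging Specification:

```go
package admin

import "log/slog"

// logForceDeleteAudit writes the structured audit entry before the hard
// delete. Field names are illustrative, not the spec's canonical set.
func logForceDeleteAudit(logger *slog.Logger, caller, resourceType, resourceID, reason string,
	subresources []string, adapterStatuses map[string]string) {
	logger.Info("force delete requested",
		slog.String("caller", caller),                 // admin identity
		slog.String("resource_type", resourceType),    // cluster or nodepool
		slog.String("resource_id", resourceID),
		slog.String("reason", reason),                 // required request-body field
		slog.Any("subresources", subresources),        // records removed in the same transaction
		slog.Any("adapter_statuses", adapterStatuses), // why deletion was stuck
		// the slog handler stamps the entry with the timestamp automatically
	)
}
```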

---

## Trade-offs

### What We Gain

- Recovery path for resources stuck in `Finalizing` indefinitely

### What We Lose / What Gets Harder

- K8s resources managed by adapters may be orphaned if adapters did not finish cleanup before force delete

### Acceptable Because

- Force delete is a privileged, manual operation. The admin accepts the consequences when they invoke it.
- Orphaned K8s resources can be cleaned up manually or via a future garbage collector.

---

## Alternatives Considered
> **Contributor:** I am wondering if you explored the idea of removing finalizers from created resources 🤔 With the current design we not only orphan resources in the user's cloud, we orphan resources in our infra. I don't think HyperShift has a 'force' option, but we could always force it by removing finalizers. It would be worth at least exploring IMO. What do people think? Is the juice worth the squeeze to clean up not just our DB but resources in our infra, down to the management cluster?

> **@pnguyen44 (Author), Apr 30, 2026:** Good point. Cleaning up K8s infra is a separate concern, since stripping finalizers would need management cluster access, which today only adapters have, and the adapter being stuck is usually why we're force-deleting in the first place. For now the admin can strip finalizers manually via kubectl as part of the same incident. The stuck detection metric and force-delete audit logs will tell us how often this happens, and we can revisit if it becomes a frequent manual step.

> **Contributor:** I agree to a point. I would think a 'best effort' force would probably require a dedicated adapter/controller to handle it. My main concern about revisiting it later is API compatibility: if we revisit this after we are in production, we are limited in what changes we can make to the proposed API in this design, as the contract is locked. I think it is best we capture this decision in an ADR: that we accept force delete is HyperFleet DB only, and that we acknowledge we are orphaning infra. We should document how we would extend to a 'best effort' cleanup later on, via a dedicated endpoint or a cleanup adapter/controller. Definitely worth a note in the trade-offs with a link to the ADR, so we don't have a missed gap, and if it becomes a problem we have a point of reference to start from.

### DeletionStuck Condition via Sentinel

**What**:
- When a resource exceeds a configurable deletion timeout, Sentinel POSTs a `DeletionStuck=True` condition to the resource's `status.conditions` via the API. Operators search for stuck resources via TSL: `GET /clusters?search=status.conditions.DeletionStuck='True'`.

**Why Rejected**:
- Sentinel is read-only by design. It polls the API and publishes CloudEvents. Adding write capabilities changes Sentinel from observer to actor, violating single-responsibility (see [ADR 0012](../adrs/0012-hard-delete-mechanism-after-adapter-reconciliation.md)).
- The Prometheus gauge metric and TSL filtering on `deleted_time` provide equivalent operator visibility without changing Sentinel's role.

### Per-Adapter Skip Annotations

**What**:
- An annotation (`hyperfleet.io/skip-cleanup-ADAPTER_NAME`) on a resource tells the system to skip a specific adapter during the normal deletion flow, allowing the remaining adapters to finalize while bypassing the stuck one.

**Why Rejected**:
- Force delete already covers the "adapter is stuck" scenario. The skip flag adds a middle ground (skip one, wait for the rest) that touches API, Sentinel, and the adapter framework for a narrow case.
- If the healthy adapters are running, they handle 404s gracefully when a force-deleted resource disappears. The practical difference between letting them finish cleanup and force-deleting while they no-op on 404 is minimal.
- Keeping deletion binary (normal waits for all, force bypasses all) is simpler to reason about and implement.

### Async Force Delete

**What**:
- Admin calls the delete endpoint with `?force=true`.
- Instead of immediately hard-deleting, the API sets a `force_delete` signal on the resource in the DB.
- Sentinel polls, detects the change, publishes an event.
- Adapters receive the event, skip or attempt cleanup, report `Finalized=True` back to the API.
- API sees `Reconciled=True` and hard-deletes the records.
- A variation adds a timeout: if adapters do not respond, a background job in the API hard-deletes the records anyway.

**Why Rejected**:
- If adapters are reachable, the async path round-trips through Sentinel and adapters only to report `Finalized=True`. The records get hard-deleted either way.
- If adapters are unreachable, the async path is blocked for the same reason as normal deletion.
- The timeout variation requires code changes in all three components (API, Sentinel, adapters) instead of just the API, plus a background job in the API to monitor expired timeouts, and it still needs the synchronous hard-delete as a fallback.