NERC Allocation Revocation Workflow by knikolla · Pull Request #28 · CCI-MOC/ops-docs

knikolla · 2026-01-21T14:28:15Z

No description provided.

policies/0009-nerc-allocation-revocation.md

joachimweyl

a few small changes and a question or two added in comments.

Milstein · 2026-01-29T14:26:49Z

policies/0009-nerc-allocation-revocation.md

+
+| **Phase**                | **Duration** (Days) | **System Actions**                                               | **User Impact**                                                   |
+|--------------------------|---------------------|------------------------------------------------------------------|-------------------------------------------------------------------|
+| **Renewal Grace Period** | 0-30                | Status changed to **Active (Needs Renewal)**. Notification sent. | No impact.                                                        |


Once an allocation’s status changes to Active (Needs Renewal), the user must follow up with an administrator. At this stage, an administrator has to manually update the status to Active; only then does ColdFront allow the user to submit change requests.

Please ensure that when the admin updates the allocation status, the End Date is also extended by one year (or some other extension period?).

Yes, that is correct.

Milstein · 2026-01-29T14:27:52Z

policies/0009-nerc-allocation-revocation.md

+
+For OpenStack, revocation is implemented by deleting all VMs and networking objects and switching the project status to disabled to prevent further access. Object storage and volumes are preserved. After the storage grace period, the remaining storage resources are deleted.
+
+For OpenShift, revocation is implemented by deleting all Pods, Deployments, Jobs, CronJobs and other resources that pertain to compute or networking. Persistent Volumes, ConfigMaps and Secrets are preserved during the storage grace period. After the storage grace period, those resources are deleted too.


Including the namespace?

The namespace would be the very last thing that gets deleted, after the storage grace period. We can also preserve the namespace, if you prefer that. Are there any reasons in particular for preserving it?
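To make the ordering concrete, here is a minimal sketch of the two-stage OpenShift cleanup described in the policy text and in this reply. The function name, the grouping of kinds, and the mapping of "Persistent Volumes" to `PersistentVolumeClaim` are illustrative assumptions, not the actual implementation.

```python
# Kinds named in the policy text. Controllers are listed before Pods so
# that deleted Pods are not recreated by a still-existing controller.
COMPUTE_KINDS = ("CronJob", "Deployment", "Job", "Pod")

# Preserved during the storage grace period. Mapping "Persistent Volumes"
# to PersistentVolumeClaim is an assumption made for this sketch.
STORAGE_KINDS = ("PersistentVolumeClaim", "ConfigMap", "Secret")


def deletion_plan(storage_grace_over: bool) -> list:
    """Return the resource kinds to delete, in order (hypothetical helper)."""
    plan = list(COMPUTE_KINDS)
    if storage_grace_over:
        plan.extend(STORAGE_KINDS)
        # The namespace is the very last thing to go.
        plan.append("Namespace")
    return plan
```

Calling `deletion_plan(False)` covers the initial revocation, and `deletion_plan(True)` covers the cleanup after the storage grace period.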

@knikolla, how does invoicing use namespaces? Will deleting it mid-month cause any issues with the invoice script?

I don't see how there will be any issues, but just to triple check:

@naved001 would deleting a namespace have any effect on collection of metrics up to its point of deletion?

@knikolla I don't think it should matter. In my mind it's the same as how a pod's metrics remain (up to the retention period) even after the pod object is deleted. It should be the same for all the pods in a namespace that gets deleted.

And we also ship off metrics every day to s3 anyway.

That being said I will do a quick test and update this comment - can't be too careful with billing stuff.

Okay, I tested this and I can say it's safe to delete the namespace. 4 hours after deleting the namespace, I can still query Prometheus and get the metrics for the pods in the deleted namespace.
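For anyone repeating this check, a rough sketch of building such a query against the Prometheus HTTP API (`/api/v1/query_range`) follows. The exact query used in the test above is not recorded here; the `kube_pod_info` metric, the base URL, and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode


def deleted_namespace_query_url(base_url, namespace, start, end, step="5m"):
    """Build a range-query URL for pods in a namespace (hypothetical helper).

    start and end are Unix timestamps; the time range should cover the
    period before the namespace was deleted.
    """
    # kube_pod_info (from kube-state-metrics) is an assumed choice of metric.
    query = 'kube_pod_info{namespace="%s"}' % namespace
    params = urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return "%s/api/v1/query_range?%s" % (base_url.rstrip("/"), params)
```

The resulting URL can be fetched with any HTTP client that has access to the Prometheus endpoint.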

Thanks for checking!

policies/0009-nerc-allocation-revocation.md

Milstein

Please review my comments.

msdisme · 2026-02-13T21:22:58Z

policies/0009-nerc-allocation-revocation.md

+
+#### 1. The Expiration / Revocation Lifecycle
+
+Each allocation has an end date at which point the allocation automatically enters **"Active (Needs Renewal)"** status. 


The original version was, "At 30 days before End Date, the allocation changes to Active (Needs Renewal)". That would align with the date that the PI sets when they create the allocation. Is that what is still happening?

That way they will be able to renew themselves until the status is revoked.

@Milstein discussed with Kristi, Quan, Kim.
Proposal:
1. 30 days before the end date, they see a button that says "Expires in: " (we will no longer show Active (Needs Renewal)).
2. At the end of those 30 days, it goes to Expired: VMs and/or pods are turned off, and access for the PI and team is turned off. Admin action is needed to turn it back on. They will likely lose state.
3. At the end of 30 days in Expired status, we switch to Revoked and delete storage and other resources.
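As a minimal sketch of the proposed timeline, the status on a given day could be derived from the end date as below. The function name, constant names, status strings, and the exact boundary handling (for example, treating the end date itself as the first Expired day) are assumptions; the proposal also calls for manual approval of revocation at first, so this only shows which status is due, not what gets enforced automatically.

```python
from datetime import date, timedelta

# Durations taken from the proposal above; the names are hypothetical.
WARNING_DAYS = 30  # "Expires in:" is shown this many days before the end date
EXPIRED_DAYS = 30  # days spent in Expired before the switch to Revoked


def allocation_status(end_date: date, today: date) -> str:
    """Return the status that is due for an allocation on a given day."""
    if today < end_date - timedelta(days=WARNING_DAYS):
        return "Active"
    if today < end_date:
        # Still active, but the "Expires in:" button is shown.
        return "Active (expires in %d days)" % (end_date - today).days
    if today < end_date + timedelta(days=EXPIRED_DAYS):
        return "Expired"
    return "Revoked"
```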

testing: 4 months of expiration happening without happening.

manual method for approving revocation, at least at first.

Will need active communication during testing and rollout plan warning folks about this change (folks have been ignoring them).

This approach returns us to the normal ColdFront period. Kristi plans to rewrite parts of the proposal to capture this.

NERC Allocation Revocation Workflow

4e39bb1

joachimweyl reviewed Jan 21, 2026
