joachimweyl
left a comment
A few small changes and a question or two added in comments.
| **Phase** | **Duration** (Days) | **System Actions** | **User Impact** |
|--------------------------|---------------------|------------------------------------------------------------------|------------|
| **Renewal Grace Period** | 0-30 | Status changed to **Active (Needs Renewal)**. Notification sent. | No impact. |
Once an allocation's status changes to Active (Needs Renewal), the user must follow up with an administrator. At this stage, an administrator has to manually set the status back to Active; only then does ColdFront let the user submit change requests.
Please ensure that when the admin updates the allocation status, the End Date is also extended by one year (or some other extension period?).
For OpenStack, revocation is implemented by deleting all VMs and networking objects and switching the project status to disabled to prevent further access. Object storage and volumes are preserved. After the storage grace period, the remaining storage resources are deleted.
For OpenShift, revocation is implemented by deleting all Pods, Deployments, Jobs, CronJobs and other resources that pertain to compute or networking. Persistent Volumes, ConfigMaps and Secrets are preserved during the storage grace period. After the storage grace period, those resources are deleted too.
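The two-stage OpenShift cleanup described above could be sketched roughly as follows. This is only an illustration, not the project's implementation: `plan_revocation` and the stage names are hypothetical, and the kind sets contain only the kinds the paragraph names explicitly.

```python
# Kinds named in the paragraph above. "Other resources" is not enumerated
# there, so anything unlisted is reported as unclassified rather than guessed.
COMPUTE_KINDS = {"Pod", "Deployment", "Job", "CronJob"}
STORAGE_GRACE_KINDS = {"PersistentVolume", "ConfigMap", "Secret"}


def plan_revocation(resources):
    """Split (kind, name) pairs into hypothetical revocation stages."""
    plan = {"delete_now": [], "delete_after_grace": [], "unclassified": []}
    for kind, name in resources:
        if kind in COMPUTE_KINDS:
            plan["delete_now"].append((kind, name))
        elif kind in STORAGE_GRACE_KINDS:
            plan["delete_after_grace"].append((kind, name))
        else:
            plan["unclassified"].append((kind, name))
    return plan


# Example inventory; the resource names are made up for illustration.
example = plan_revocation([
    ("Pod", "web-0"),
    ("Secret", "db-credentials"),
    ("CronJob", "nightly-report"),
    ("Route", "web"),
])
```

Keeping an explicit "unclassified" bucket means a kind the proposal does not mention needs a human decision instead of being silently deleted or kept.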
The namespace would be the very last thing that gets deleted, after the storage grace period. We can also preserve the namespace, if you prefer that. Are there any reasons in particular for preserving it?
@knikolla, how does invoicing use namespaces? Will deleting a namespace mid-month cause any issues with the invoice script?
I don't see how there will be any issues, but just to triple check:
@naved001 would deleting a namespace have any effect on collection of metrics up to its point of deletion?
@knikolla I don't think it should matter. In my mind it's the same as how a pod's metrics remain (up to the retention period) even after the pod object is deleted. It should be the same for all the pods in a namespace that gets deleted.
And we also ship off metrics every day to S3 anyway.
That being said I will do a quick test and update this comment - can't be too careful with billing stuff.
Okay, I tested this and I can say it's safe to delete the namespace. 4 hours after deleting the namespace, I could still query Prometheus and get the metrics for the pods in the deleted namespace.
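For anyone who wants to repeat that check, a minimal sketch of building the range query is below. It uses the standard Prometheus HTTP API endpoint `/api/v1/query_range`; the base URL, the namespace name, and the choice of `container_cpu_usage_seconds_total` are assumptions for illustration, not necessarily what the invoicing script reads.

```python
from urllib.parse import urlencode


def namespace_metrics_url(base_url, namespace, start, end, step="5m",
                          metric="container_cpu_usage_seconds_total"):
    """Build a query_range URL selecting one metric for one namespace.

    start and end are Unix timestamps; the metric default is an assumption.
    """
    query = f'{metric}{{namespace="{namespace}"}}'
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


# Hypothetical host and namespace; the window covers 4 hours (14400 seconds).
url = namespace_metrics_url(
    "http://prometheus.example:9090", "demo", 1700000000, 1700014400
)
```

Fetching `url` after the namespace is gone should still return series, as long as the window falls inside the Prometheus retention period.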
#### 1. The Expiration / Revocation Lifecycle
Each allocation has an end date at which point the allocation automatically enters **"Active (Needs Renewal)"** status.
The original version was, "At 30 days before End Date, the allocation changes to Active (Needs Renewal)". That would align with the date that the PI sets when they create the allocation. Is that what is still happening?
That way they will be able to renew themselves until the status is revoked.
@Milstein discussed with Kristi, Quan, Kim.
Proposal:
1. Starting 30 days before the end date, they see a button that says "Expires in: " (we will no longer show Active (Needs Renewal)).
2. At the end of those 30 days the allocation goes to expired: VMs and/or pods are turned off, and access for the PI and team is turned off. Admin action is needed to turn it back on. They will likely lose state.
3. At the end of the 30 days in expired status we switch to revoked and delete storage and other resources.
- Testing: 4 months of expiration happening without happening.
- Manual method for approving revocation, at least at first.
- Will need active communication during testing and a rollout plan warning folks about this change (folks have been ignoring them).
This approach returns us to the normal ColdFront period. Kristi plans to rewrite parts of the proposal to capture this.
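The proposed timeline can be sketched as a small date check. This is a rough illustration under stated assumptions: `lifecycle_status` is a hypothetical helper, both windows are taken to be exactly 30 days, the expired state is assumed to begin on the end date itself, and the manual approval step for revocation is not modelled.

```python
from datetime import date, timedelta

# Assumption: both windows are 30 days, as in the proposal above.
WARNING_DAYS = 30
EXPIRED_DAYS = 30


def lifecycle_status(today, end_date):
    """Return (status, expires_in_days) for an allocation on a given day.

    expires_in_days is only set inside the 30-day warning window, where the
    proposal shows an "Expires in:" button instead of a separate status.
    """
    if today < end_date - timedelta(days=WARNING_DAYS):
        return ("Active", None)
    if today < end_date:
        return ("Active", (end_date - today).days)
    if today < end_date + timedelta(days=EXPIRED_DAYS):
        # Compute is off and access is disabled; an admin can re-enable.
        return ("Expired", None)
    # Storage and remaining resources are deleted.
    return ("Revoked", None)


end = date(2025, 6, 30)
in_window = lifecycle_status(date(2025, 6, 10), end)
```

In a rollout that starts with manual approval, the final branch would only mark an allocation as eligible for revocation rather than trigger deletion.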