We need a way to delete resources when terraform gets stuck #179

HartS · 2020-04-17T06:56:19Z

I've been noticing terraform getting stuck trying to interact with ECP a lot lately.

For example, I ran make clean to destroy a cluster that had failed to deploy completely. terraform destroy started running and eventually stalled (presumably due to a network/VPN hiccup). Four hours later nothing had progressed, the process could not be terminated without kill -9, and terraform left the resources in a 'locked' state.

I don't know of a way to recover from this and have already wasted a ton of time trying to address it. terraform does have a force-unlock subcommand, but attempting to run it yields "Local state cannot be unlocked by another process". When this happened previously, I manually deleted the lockfile, but that didn't allow terraform destroy to run again either.

I ended up having to delete the build dir manually and spend about 2 hours drilling into resources in the openstack console to ensure everything was cleaned up. We should either determine a graceful way to recover from this kind of scenario (which I've now hit again) that allows terraform to clean things up, or provide another subcommand in catapult to clean resources from ECP using the openstack CLIs instead of terraform.


viccuad · 2020-04-20T09:40:45Z

I share your frustration.

In the past I have used the following snippet to delete only lb, secgroups, and nets from ECP: https://gitlab.suse.de/snippets/338. Mind you, this is not easy, as there are several recursive dependencies and a specific order to follow. Personally, I think replicating terraform on our own is a bad idea here; we would end up playing cat and mouse for all the clouds.
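To illustrate why the order matters, here is a minimal Python sketch of computing a deletion order from a dependency map. It is not the linked snippet, and the resource types and edges in MUST_DELETE_FIRST are illustrative assumptions, not a complete or verified model of ECP. The function name deletion_order is hypothetical too.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# ASSUMPTION: illustrative map only. Each key is a resource type, and its
# value is the set of types that must be deleted BEFORE the key can go.
MUST_DELETE_FIRST = {
    "network": {"subnet"},
    "subnet": {"load_balancer", "port"},
    "security_group": {"load_balancer", "port"},
    "load_balancer": set(),
    "port": set(),
}


def deletion_order(must_delete_first):
    """Return resource types in an order that respects the map.

    Raises graphlib.CycleError if the map contains a circular dependency,
    which has to be broken by hand (for example by detaching one resource
    from the other) before any ordering exists.
    """
    return list(TopologicalSorter(must_delete_first).static_order())


if __name__ == "__main__":
    for resource_type in deletion_order(MUST_DELETE_FIRST):
        print(resource_type)
```

A real cleanup would still need one delete call per resource and per cloud, which is the part that ends up replicating terraform.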

I have opened SUSE/skuba#1051 to have the CaaSP terraform files create an OpenStack Stack, which should be easier to delete.

It seems that the error you are facing is on the Terraform side. Catapult just calls terraform destroy and deletes the folder if it succeeds. One can call make clean as many times as needed, and if Terraform is failing, one can use Terraform knowledge to work around it.
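The flow described above (run the destroy, remove the folder only on success) can be sketched in Python as follows. This is not Catapult's actual code: the function name destroy_then_remove, the timeout_s parameter, and the one-hour default are assumptions added for illustration. Only terraform destroy and its -auto-approve flag come from Terraform itself.

```python
import shutil
import subprocess


def destroy_then_remove(
    build_dir,
    destroy_cmd=("terraform", "destroy", "-auto-approve"),
    timeout_s=3600,  # ASSUMPTION: arbitrary one-hour limit
):
    """Run destroy_cmd inside build_dir and remove build_dir only on success.

    Returns "removed", "failed", or "timed_out". On failure or timeout the
    directory (and any state or lock files in it) is left untouched.
    """
    try:
        # subprocess.run kills the child and raises once timeout_s elapses.
        result = subprocess.run(list(destroy_cmd), cwd=build_dir, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timed_out"
    if result.returncode != 0:
        return "failed"
    shutil.rmtree(build_dir)
    return "removed"
```

The timeout only stops the wrapper from hanging for hours. The child is killed forcibly when it expires, so a stale lock may still be left behind and has to be dealt with on the Terraform side.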

I really see no way to work around this, and I don't think Catapult can get more intelligent than Terraform. If it were up to me, I would close this issue, as I see it as out of scope :/.
