Conversation


@klmp200 klmp200 commented Oct 28, 2025

When resizing a VDI on a thick-provisioned LINSTOR SR, we need to wait for DRBD to be up to date. How long that takes depends on previous operations we have no way of knowing about. If we wait too long, we stop and return an error telling the user to try again later.

Also fix an issue when resizing raw images.
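
Roughly, the approach looks like the following. This is only a sketch reconstructed from the review snippets further down in this thread, not the exact diff; the error message wording is illustrative, and LinstorVolumeManagerError is the error class suggested later in the review.

try:
    # Block until DRBD reports the resource as up to date, for at most 5 minutes.
    self._linstor.resource_dfn_wait_synced(volume_name, wait_interval=1.0, timeout=60*5)
except linstor.LinstorTimeoutError:
    # Illustrative message: tell the user to retry instead of surfacing a raw timeout.
    raise LinstorVolumeManagerError(
        'DRBD sync of `{}` did not finish in time, please try again later'.format(volume_name)
    )

# DRBD is up to date, the volume definition can now be resized.
result = self._linstor.volume_dfn_modify(
    rsc_name=volume_name,
    volume_nr=0,
    size=new_size
)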

@klmp200 klmp200 force-pushed the linstor_resize_retry branch from d5c806e to 545ac97 Compare October 28, 2025 09:31
@klmp200 klmp200 requested review from Nambrok and Wescoeur October 28, 2025 09:31
Comment on lines 1911 to 1933
def compute_vdi_sizes(vdi: LinstorVDI) -> Tuple[int, int]:
    # Note: `size` (the requested virtual size) is captured from the enclosing
    # resize method; these helpers are nested inside it.
    if vdi.vdi_type == vhdutil.VDI_TYPE_RAW:
        return vdi.size, LinstorVolumeManager.round_up_volume_size(size)

    old_volume_size = vdi.utilisation
    if vdi.sr._provisioning == 'thin':
        # VDI is currently deflated, so keep it deflated.
        return old_volume_size, old_volume_size

    return old_volume_size, LinstorVhdUtil.compute_volume_size(size, vdi.vdi_type)

def resize_vdi(vdi: LinstorVDI, old_volume_size: int, new_volume_size: int) -> None:
    if vdi.vdi_type == vhdutil.VDI_TYPE_RAW:
        vdi._linstor.resize_volume(vdi.uuid, new_volume_size)
        return

    if new_volume_size != old_volume_size:
        vdi.sr._vhdutil.inflate(
            vdi.sr._journaler, vdi.uuid, vdi.path,
            new_volume_size, old_volume_size
        )
    vdi.sr._vhdutil.set_size_virt_fast(vdi.path, size)

Not really against it, but the refactor seems a bit out of scope here.

Author

I've put this refactor in its own dedicated commit.

Member

Refactoring is always a good idea, but personally, I don't think we should use nested functions that are only called once. A comment above the blocks that seem complex to you (even with numbering) seems more appropriate here.
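
For illustration only, here is roughly what that inlined shape with numbered comments could look like, assuming the code lives in LinstorVDI.resize where `size` is the requested virtual size (a hypothetical sketch, not the actual diff):

# 1. Compute the old and new volume sizes.
if self.vdi_type == vhdutil.VDI_TYPE_RAW:
    old_volume_size = self.size
    new_volume_size = LinstorVolumeManager.round_up_volume_size(size)
elif self.sr._provisioning == 'thin':
    # VDI is currently deflated, so keep it deflated.
    old_volume_size = new_volume_size = self.utilisation
else:
    old_volume_size = self.utilisation
    new_volume_size = LinstorVhdUtil.compute_volume_size(size, self.vdi_type)

# 2. Resize the volume.
if self.vdi_type == vhdutil.VDI_TYPE_RAW:
    self._linstor.resize_volume(self.uuid, new_volume_size)
else:
    if new_volume_size != old_volume_size:
        self.sr._vhdutil.inflate(
            self.sr._journaler, self.uuid, self.path,
            new_volume_size, old_volume_size
        )
    self.sr._vhdutil.set_size_virt_fast(self.path, size)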

Author

This has been replaced by comments.

@klmp200 klmp200 force-pushed the linstor_resize_retry branch from 545ac97 to 6fc3638 Compare November 7, 2025 08:53
Comment on lines 844 to 848
# We can't resize anything until DRBD is up to date
# We wait here for 5 min max and raise an easy-to-understand error for the user
# 5 min is an arbitrary value, it's impossible to pick one that fits every situation
# and it's currently impossible to know how long we have to wait
# This is mostly an issue for thick provisioning, thin isn't affected
Member

I think we'll need to discuss comment formatting for the new stack, especially regarding periods at the end of sentences. I'm a fan of this for pydocstyle, but also for comments. :)
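
For context, this is the convention pydocstyle enforces for docstrings (D400, "First line should end with a period"); a tiny hypothetical example, not code from this PR:

def resize_volume(self, volume_uuid, new_size):
    """Resize an existing LINSTOR volume to new_size bytes."""  # D400 passes: the first line ends with a period.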

Author

I've added proper punctuation.

try:
    self._linstor.resource_dfn_wait_synced(volume_name, wait_interval=1.0, timeout=60*5)
except linstor.LinstorTimeoutError:
    raise linstor.LinstorTimeoutError(
Member

Suggested change:
-    raise linstor.LinstorTimeoutError(
+    raise LinstorVolumeManagerError(

Author

Done

Comment on lines +859 to +863
result = self._linstor.volume_dfn_modify(
    rsc_name=volume_name,
    volume_nr=0,
    size=new_size
)
Member

I think we need to keep the initial call before waiting for the sync to complete. The idea would then be to display a message saying that we need to wait, which would create a sort of block/symmetry with your message suggestion "DRBD is up to date, syncing took...".
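
A hypothetical sketch of that ordering, reusing only the calls already shown in this thread; the error-check helper, retry shape and log wording are made up for illustration, and it assumes `time` is imported and sm's util.SMlog is available in this module:

# Try the resize first; only wait for the DRBD sync if LINSTOR refuses it.
result = self._linstor.volume_dfn_modify(
    rsc_name=volume_name,
    volume_nr=0,
    size=new_size
)
if resize_was_rejected(result):  # hypothetical helper checking the API result for errors
    util.SMlog('Cannot resize yet, waiting for DRBD to be up to date...')
    start = time.time()
    self._linstor.resource_dfn_wait_synced(volume_name, wait_interval=1.0, timeout=60*5)
    util.SMlog('DRBD is up to date, syncing took {:.1f}s'.format(time.time() - start))
    result = self._linstor.volume_dfn_modify(
        rsc_name=volume_name,
        volume_nr=0,
        size=new_size
    )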

Author

Experience shows that we do need to wait, especially for early steps; that's why we had the previous retry mechanism.

Either we keep the previous process, which is a bit brute-force, or we just wait beforehand, but we can't keep both.
I don't see the benefit of calling the function if we know it's going to crash anyway.

I think we should discuss this :)

Author

I think we need to discuss this; I'm not sure it would be beneficial.

@klmp200 klmp200 force-pushed the linstor_resize_retry branch from 6fc3638 to 24dece5 Compare November 14, 2025 09:18
@klmp200 klmp200 requested a review from Wescoeur November 14, 2025 09:18
@klmp200 klmp200 force-pushed the linstor_resize_retry branch from 24dece5 to ec1f735 Compare November 14, 2025 09:25
fix resize typo for linstor on raw VDI

Signed-off-by: Antoine Bartuccio <antoine.bartuccio@vates.tech>
Avoid the most common DRBD sync errors during resize with a cleaner
wait mechanism and a friendlier user error

Signed-off-by: Antoine Bartuccio <antoine.bartuccio@vates.tech>
@klmp200 klmp200 force-pushed the linstor_resize_retry branch from ec1f735 to 1b531eb Compare November 14, 2025 14:18