This repository was archived by the owner on Jan 26, 2024. It is now read-only.

data race in memory update function #35

@psychocoderHPC

IMO the memory counting functions contain data races that can cause the counter for free memory to underflow.

https://github.com/ROCm-Developer-Tools/ROCclr/blob/90f1f61a9d6c28ffd2f844dc773e921444752e47/device/rocm/rocdevice.cpp#L2086-L2104

  • Case 1
    • Two threads each allocate 2MiB and reach line 2091 while freeMem_ is 3MiB, so the check is false for both.
    • Both threads then decrement the counter freeMem_ in line 2101, which results in an underflow.
  • Case 2
    • Two threads, where one is deallocating 2MiB and the other is allocating 5MiB, while freeMem_ is 4MiB.
    • The allocating thread (5MiB) evaluates the check in line 2091 and enters the if body, reaching line 2096.
    • The second thread, deallocating 2MiB, executes line 2088; freeMem_ is now 6MiB.
    • The first thread executes line 2098, which sets freeMem_ to zero.
    • The result is that we lose 2MiB of memory that we could potentially allocate but that, due to the data race, is no longer available.
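
To make the two interleavings concrete, here is a small single-threaded replay of them. This is only a sketch and not the ROCclr code: the names `replayCase1` and `replayCase2` are made up for illustration, and I am assuming the counter is an unsigned `std::size_t`, so that subtracting too much wraps around instead of going negative.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t MiB = 1024 * 1024;

// Case 1 (hypothetical replay): both threads evaluate the check before
// either of them subtracts, so both see 3 MiB and both proceed.
std::size_t replayCase1() {
  std::size_t freeMem = 3 * MiB;           // assumed unsigned counter
  const std::size_t request = 2 * MiB;

  const bool aTooLarge = request > freeMem;  // thread A checks: false
  const bool bTooLarge = request > freeMem;  // thread B checks: false
  assert(!aTooLarge && !bTooLarge);

  freeMem -= request;  // thread A subtracts: 1 MiB left
  freeMem -= request;  // thread B subtracts: unsigned wrap-around
  return freeMem;
}

// Case 2 (hypothetical replay): the allocating thread decides "too large"
// on a stale value, and a concurrent free lands before the reset to zero.
std::size_t replayCase2() {
  std::size_t freeMem = 4 * MiB;

  const bool tooLarge = 5 * MiB > freeMem;  // allocating thread checks: true
  freeMem += 2 * MiB;                       // freeing thread: now 6 MiB
  if (tooLarge) {
    freeMem = 0;                            // stale decision wipes the 6 MiB
  }
  return freeMem;
}
```

With these assumptions, `replayCase1()` ends with a value far larger than the initial 3MiB (the wrapped result), and `replayCase2()` ends at zero even though the freed 2MiB had just been added to the counter.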

Possible solution:

  • When memory is freed, freeMem_ is read into a register and an atomic CAS is used to update the variable freeMem_.
  • Line 2101 must be guarded by an atomic CAS too, so that other threads cannot reduce the value of freeMem_ at the same moment, which would result in a variant of Case 1.
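
A minimal sketch of what such a CAS-based update could look like, assuming the counter can be a `std::atomic<std::size_t>`. The function names `reserveMem` and `releaseMem` are hypothetical and do not exist in ROCclr; the clamp to zero only mirrors the behavior described for line 2098, and whether that clamp is the desired policy is a separate question.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr std::size_t MiB = 1024 * 1024;

// Hypothetical allocation path: the check and the update are done as one
// atomic step. If another thread changes freeMem in between, the CAS fails,
// `current` is reloaded with the new value, and the decision is redone.
bool reserveMem(std::atomic<std::size_t>& freeMem, std::size_t size) {
  std::size_t current = freeMem.load();
  for (;;) {
    const bool fits = size <= current;
    // Mirror the described behavior: clamp to zero when the request
    // does not fit, otherwise subtract.
    const std::size_t desired = fits ? current - size : 0;
    if (freeMem.compare_exchange_weak(current, desired)) {
      return fits;
    }
    // CAS failed (or failed spuriously): `current` now holds the latest
    // value, so loop and re-evaluate the check against it.
  }
}

// Hypothetical free path, written as a CAS loop as proposed above.
// A single freeMem.fetch_add(size) would be an equivalent atomic update.
void releaseMem(std::atomic<std::size_t>& freeMem, std::size_t size) {
  std::size_t current = freeMem.load();
  while (!freeMem.compare_exchange_weak(current, current + size)) {
    // `current` was refreshed by the failed CAS; retry with the new value.
  }
}
```

With this shape, the second allocator in Case 1 would see the already reduced value and take the "does not fit" branch instead of subtracting, and in Case 2 the reset to zero can only succeed if freeMem_ still holds the value the check was made against.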
