Description
I have a dual-socket server, with each socket connected to four 256 GB Intel Optane 200 series persistent memory modules. These modules are configured in App Direct Interleaved Mode, forming two logical devices: pmem0 and pmem1. Specifically, pmem0 consists of the four PMem modules attached to socket 0, while pmem1 is formed from those on socket 1.
I’m using fio to benchmark performance. The test script is as follows:
[global]
ioengine=libpmem
direct=1
bs=4k
size=1G
numjobs=28
thread
ramp_time=15
runtime=15
time_based
group_reporting
[n0-0]
rw=write
filename=/mnt/pmem0/testfile
cpus_allowed=0-27
[n1-1]
rw=write
filename=/mnt/pmem1/testfile
cpus_allowed=28-55
I tested both NUMA-local and NUMA-remote access performance, on both single-socket and dual-socket configurations. The results are shown in the attached figure.
However, I didn't observe a significant NUMA effect; in some cases, remote access bandwidth even exceeded local access bandwidth.
Is this expected behavior?
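One thing that may be worth ruling out first is whether the `cpus_allowed` ranges in the job file actually line up with the sockets, since CPU numbering is not always contiguous per NUMA node. The following is a minimal sketch, assuming a Linux host that exposes `/sys/devices/system/node/node*/cpulist`; the helper names (`parse_cpulist`, `node_cpus`, `nodes_covered`) are hypothetical and only for illustration. It reports which NUMA nodes each job's CPU range touches.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-27,56-83' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single id like '8' has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def node_cpus(sysfs_root="/sys/devices/system/node"):
    """Map NUMA node id -> set of CPU ids, read from sysfs (empty if unavailable)."""
    nodes = {}
    for path in Path(sysfs_root).glob("node[0-9]*/cpulist"):
        node_id = int(path.parent.name[len("node"):])
        nodes[node_id] = parse_cpulist(path.read_text())
    return nodes


def nodes_covered(allowed, nodes):
    """Return the sorted NUMA node ids that the given CPU set touches."""
    return sorted(n for n, cpus in nodes.items() if cpus & allowed)


if __name__ == "__main__":
    topology = node_cpus()
    # CPU ranges copied from the cpus_allowed lines of the job file above.
    for job, cpulist in {"n0-0": "0-27", "n1-1": "28-55"}.items():
        print(job, "-> NUMA nodes", nodes_covered(parse_cpulist(cpulist), topology))
```

If either job reports more than one node, the "local" and "remote" runs would both be mixing local and remote accesses, which could mask the difference. This only checks the CPU side; it does not verify which node each pmem region is attached to.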