Is there a way to disable access to memory associated with a given NUMA node/socket on a NUMA machine?

We have a bit of controversy with the database vendor about our HP DL560 machines. The DB sales type’s technical support person was adamant that we could not use our DL560s and had to buy new DL360s instead, since they have fewer sockets. I believe their concern is the speed of accessing inter-socket memory. They recommended that if I insisted on keeping the DL560s, I should leave two of the sockets empty. I think they are mistaken (AKA crazy), but I need to run some tests to demonstrate that I am on solid ground.

My configuration:
The machines have four sockets, each with 22 hyperthreaded physical cores, for a total of 176 apparent cores and 1.5 TB of memory. The operating system is Red Hat Enterprise Linux Server release 7.4.

The lscpu display reads (in part):

$ lscpu | egrep 'NUMA|ore'
Thread(s) per core:    2
Core(s) per socket:    22
NUMA node(s):          4
NUMA node0 CPU(s):     0-21,88-109
NUMA node1 CPU(s):     22-43,110-131
NUMA node2 CPU(s):     44-65,132-153
NUMA node3 CPU(s):     66-87,154-175
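For the memory side of the layout (a quick check, assuming the numactl package is installed), the per-node memory can be listed with:

$ numactl --hardware

which prints each node’s CPUs, its total and free memory, and the inter-node distance table.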

If I had access to the physical hardware, I would consider pulling the processors from two of the sockets to prove my point, but I don’t have access, and I don’t have permission to go monkeying around with the hardware anyway.

The next best thing would be to virtually disable the sockets using the operating system. I read on this link that I can take a processor out of service with

echo 0 > /sys/devices/system/cpu/cpu3/online 

and, indeed, the processors are taken out of service, but that says nothing about the memory.
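Offlining a CPU this way does not touch the node’s memory; the memory attached to, say, node 3 is still reported under sysfs either way (assuming the usual /sys/devices/system/node layout):

$ grep MemTotal /sys/devices/system/node/node3/meminfo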

I just turned off all the processors for socket #3 (using the lscpu output above to find which CPU numbers belong to it) with:

for num in {66..87} {154..175}
do
    echo 0 > /sys/devices/system/cpu/cpu${num}/online
    cat /sys/devices/system/cpu/cpu${num}/online
done

and got:

$ grep N3 /proc/$$/numa_maps
7fe5daa79000 default file=/usr/lib64/libm-2.17.so mapped=16 mapmax=19 N3=16 kernelpagesize_kB=4

Which, if I am reading it correctly, shows that my current process is using memory on node 3 (socket #3). Then again, the shell was already running when I turned the processors off.
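To see the whole picture rather than a single mapping, the N<node>=<pages> fields in numa_maps can be totalled per node, for example with a small awk one-liner (a rough sketch, applied here to the current shell):

$ awk '{for (i = 1; i <= NF; i++) if ($i ~ /^N[0-9]+=/) {split($i, a, "="); pages[a[1]] += a[2]}} END {for (n in pages) print n, pages[n], "pages"}' /proc/$$/numa_maps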

I started a new process that does its best to gobble up memory and

$ cat /proc/18824/numa_maps | grep N3

returns no records initially, but after gobbling up memory for a long time, it starts using memory on node 3.
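(An easier way to watch that happen, if numastat from the numactl package is available, is its per-process, per-node breakdown: $ numastat -p 18824 )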

I tried running my program with numactl, binding it to nodes 0,1,2, and it works as expected ... except I don’t have control over the vendor's software, and there is no provision in Linux for setting another process's memory policy the way the set_mempolicy call used by numactl does for the process it launches.
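For reference, the binding I tested was along these lines, where ./gobbler stands in for my memory-gobbling test program (not the vendor's software):

$ numactl --cpunodebind=0,1,2 --membind=0,1,2 ./gobbler

That keeps both the threads and the allocations off node 3, but only for processes I launch myself.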

Short of physically removing the processors, is there a way to force the issue?
