STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth
$ python3 STREAMer.py --help
usage: STREAMer.py [-h] [--noHT | --HT] [--Socket0 | --Socket1 | --Socket0Socket1]
                   [--Socket0DDR4 | --Socket1DDR4 | --CXLDDR4 | --Socket0DDR5 | --Socket1DDR5 |
                   --CXLDAX | --Socket0DDR4DAX | --Socket1DDR4DAX | --Socket0DDR5DAX | --Socket1DDR5DAX]
                   [--Close | --Spread] [--noFT | --FT] [--DAX_Path DAX_PATH]
                   [--Arrays_Size ARRAYS_SIZE] [--Cores_per_Socket CORES_PER_SOCKET]

Program Options

optional arguments:
  -h, --help            show this help message and exit
  --noHT                Disable hyperthreading
  --HT                  Enable hyperthreading
  --Socket0             Use only Socket0 cores
  --Socket1             Use only Socket1 cores
  --Socket0Socket1      Use both Socket0 and Socket1 cores
  --Socket0DDR4         Use DDR4 memory on Socket0
  --Socket1DDR4         Use DDR4 memory on Socket1
  --CXLDDR4             Use DDR4 memory with CXL
  --Socket0DDR5         Use DDR5 memory on Socket0
  --Socket1DDR5         Use DDR5 memory on Socket1
  --CXLDAX              Use DAX DDR4 memory with CXL
  --Socket0DDR4DAX      Use DAX DDR4 memory on Socket0
  --Socket1DDR4DAX      Use DAX DDR4 memory on Socket1
  --Socket0DDR5DAX      Use DAX DDR5 memory on Socket0
  --Socket1DDR5DAX      Use DAX DDR5 memory on Socket1
  --Close               Use close thread affinity
  --Spread              Use spread thread affinity
  --noFT                Disable first touch
  --FT                  Enable first touch
  --DAX_Path DAX_PATH   Path for DAX (default: {NOPATH})
  --Arrays_Size ARRAYS_SIZE
                        Specify the size of the arrays (default: 100000000)
  --Cores_per_Socket CORES_PER_SOCKET
                        Specify the number of cores per socket (default: 10)

If no arguments are supplied, STREAMer falls back to default settings that should run correctly on most systems and serve as a baseline.
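The bracketed alternatives in the usage line (for example `[--noHT | --HT]`) are how `argparse` prints mutually exclusive options. The following is a minimal sketch of how such a parser, with the defaults listed above, could be declared. It is not STREAMer's actual source: the helper `add_choice_group`, the function `build_parser`, and the destination names `ht`, `cores`, `memory`, `affinity`, and `first_touch` are assumptions made for illustration.

```python
import argparse


def add_choice_group(parser, dest, flags, default):
    """Register `flags` as mutually exclusive switches that all write to `dest`.

    Hypothetical helper: at most one flag of the group may be given, and
    `default` is used when none is given.
    """
    group = parser.add_mutually_exclusive_group()
    for flag in flags:
        group.add_argument(f"--{flag}", dest=dest, action="store_const", const=flag)
    parser.set_defaults(**{dest: default})


def build_parser():
    """Build a parser that mirrors the options shown in the help text."""
    parser = argparse.ArgumentParser(prog="STREAMer.py", description="Program Options")
    add_choice_group(parser, "ht", ["noHT", "HT"], "noHT")
    add_choice_group(parser, "cores", ["Socket0", "Socket1", "Socket0Socket1"], "Socket0")
    add_choice_group(
        parser,
        "memory",
        [
            "Socket0DDR4", "Socket1DDR4", "CXLDDR4", "Socket0DDR5", "Socket1DDR5",
            "CXLDAX", "Socket0DDR4DAX", "Socket1DDR4DAX", "Socket0DDR5DAX", "Socket1DDR5DAX",
        ],
        "Socket0DDR4",
    )
    add_choice_group(parser, "affinity", ["Close", "Spread"], "Close")
    add_choice_group(parser, "first_touch", ["noFT", "FT"], "noFT")
    parser.add_argument("--DAX_Path", default="NOPATH", help="Path for DAX")
    parser.add_argument("--Arrays_Size", type=int, default=100000000,
                        help="Specify the size of the arrays")
    parser.add_argument("--Cores_per_Socket", type=int, default=10,
                        help="Specify the number of cores per socket")
    return parser


if __name__ == "__main__":
    # With no arguments, every option takes its default value.
    print(vars(build_parser().parse_args([])))
```

With this layout, passing two flags from the same group (such as `--HT --noHT`) makes `argparse` exit with an error, which matches the either/or notation in the usage line.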
$ python3 STREAMer.py
Hyperthreading is disabled
Using only Socket0 cores
Using DDR4 memory on Socket0
Using close thread affinity
First touch is disabled
Folder 'noHT_Socket0_Socket0DDR4_Close_noFT_NOPATH_Arrays100000000_Cores10/' has been created.

In the following example, we ask STREAMer to run from thread 0 up to all available threads, without hyperthreading; to use all of the cores in the node (both sockets); to place the arrays in remote CXL memory in DAX mode (DDR4 in this case), allocated with PMDK at the given DAX path; to spread the threads evenly across the hardware (thread affinity); and to spread the memory allocation evenly across the hardware (first touch). STREAMer creates a folder named after all of the arguments, executes every run inside it, and saves the results there.
$ python3 STREAMer.py --noHT --Socket0Socket1 --CXLDAX --Spread --FT --DAX_Path /mnt/pmem2
Hyperthreading is disabled
Using both Socket0 and Socket1 cores
Using DAX DDR4 memory with CXL
Using spread thread affinity
First touch is enabled
Folder 'noHT_Socket0Socket1_CXLDAX_Spread_FT_@mnt@pmem2_Arrays100000000_Cores10/' has been created.

The following example sets every available option, for the purpose of demonstration.
$ python3 STREAMer.py --noHT --Socket0Socket1 --CXLDAX --Spread --FT --DAX_Path /mnt/pmem5 --Arrays_Size 10000 --Cores_per_Socket 5
Hyperthreading is disabled
Using both Socket0 and Socket1 cores
Using DAX DDR4 memory with CXL
Using spread thread affinity
First touch is enabled
Folder 'noHT_Socket0Socket1_CXLDAX_Spread_FT_@mnt@pmem5_Arrays10000_Cores5/' has been created.
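
The folder names printed in the examples follow a visible pattern: the chosen options are joined with underscores, and each `/` in the DAX path appears as `@`. The sketch below reproduces that pattern so that result folders can be located from a script. The function name `run_folder_name` and its parameter names are hypothetical, and the rule is inferred from the printed output only, not taken from STREAMer's source.

```python
def run_folder_name(
    ht="noHT",
    cores="Socket0",
    memory="Socket0DDR4",
    affinity="Close",
    first_touch="noFT",
    dax_path="NOPATH",
    arrays_size=100000000,
    cores_per_socket=10,
):
    """Return the results folder name implied by a set of options.

    Assumption: the name is the option values joined by underscores, with
    every '/' in the DAX path replaced by '@', as seen in the example output.
    """
    safe_path = dax_path.replace("/", "@")
    return (
        f"{ht}_{cores}_{memory}_{affinity}_{first_touch}_{safe_path}"
        f"_Arrays{arrays_size}_Cores{cores_per_socket}/"
    )


if __name__ == "__main__":
    # Default run (no arguments supplied).
    print(run_folder_name())
    # Run with every option set, as in the last example.
    print(
        run_folder_name(
            cores="Socket0Socket1",
            memory="CXLDAX",
            affinity="Spread",
            first_touch="FT",
            dax_path="/mnt/pmem5",
            arrays_size=10000,
            cores_per_socket=5,
        )
    )
```

Replacing the path separators keeps the whole configuration in a single directory name instead of creating nested directories.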