I have a ZFS server where I ran a couple of dumb tests just for understanding, and the results puzzle me.
Context: FreeBSD 11.2, ZFS with compression enabled, SAS HDDs, RAIDz2, 768 GB of memory.
Both commands were run directly on the FreeBSD server.
# time dd if=/dev/random of=./test_file bs=128k count=131072
131072+0 records in
131072+0 records out
17179869184 bytes transferred in 135.191596 secs (127077937 bytes/sec)
0.047u 134.700s 2:15.19 99.6% 30+172k 4+131072io 0pf+0w
#
# The resulting file size:
# du -sh test_file
 16G    test_file

This shows that I was able to write a 16 GiB file of random data in about 135 secs, with a throughput of approx. 121 MiB/s (127 MB/s).
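As a sanity check, comparing the file's apparent size with the space it actually occupies, and checking the dataset's compression ratio, should confirm that random data does not compress at all. This is just a sketch; "tank/test" stands in for whatever dataset holds the file:

# du -Ash test_file
# du -sh test_file
# zfs get compression,compressratio tank/test

If both du figures are essentially 16G and compressratio stays near 1.00x, compression is not helping dd here.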
Now, I try to use fio:
# fio --name=seqwrite --rw=write --bs=128k --numjobs=1 --size=16G --runtime=120 --iodepth=1 --group_reporting
seqwrite: (g=0): rw=write, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=psync, iodepth=1
fio-3.6
Starting 1 process
seqwrite: Laying out IO file (1 file / 16384MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=2482MiB/s][r=0,w=19.9k IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=1): err= 0: pid=58575: Wed Jul 25 09:38:06 2018
  write: IOPS=19.8k, BW=2478MiB/s (2598MB/s)(16.0GiB/6612msec)
    clat (usec): min=28, max=2585, avg=48.03, stdev=24.04
     lat (usec): min=29, max=2586, avg=49.75, stdev=25.19
    bw (  MiB/s): min= 2295, max= 2708, per=99.45%, avg=2464.33, stdev=124.56, samples=13
    iops        : min=18367, max=21664, avg=19714.08, stdev=996.47, samples=13

---------- Trimmed for brevity -------------

Run status group 0 (all jobs):
  WRITE: bw=2478MiB/s (2598MB/s), 2478MiB/s-2478MiB/s (2598MB/s-2598MB/s), io=16.0GiB (17.2GB), run=6612-6612msec

Now I hit 2478 MiB/s of throughput while writing the same 16 GiB file with random data.
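To check whether those writes are actually reaching the disks during the run, rather than just landing in memory and being flushed afterwards, I could watch the pool while fio is running. A rough sketch, with "tank" as a placeholder pool name:

# zpool iostat -v tank 1

If the per-vdev write bandwidth reported there adds up to far less than 2478 MiB/s during the fio run, most of what fio reports is the speed of writing into RAM.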
Why is there such a big difference? My understanding is that the dd command must have used a create call to create the file, then issued open and write calls to write the random data into the open file, and finally closed the file. I chose a block size of 128 KiB to match the ZFS default record size.
The fio test should be measuring just the write calls, with everything else the same. Why is there so much difference in throughput?
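One way to verify that assumption would be to trace the system calls both tools actually make, e.g. with truss on a much smaller run (file names and sizes below are only illustrative, to keep the traces short):

# truss -o dd.trace dd if=/dev/random of=./trace_test bs=128k count=8
# truss -o fio.trace fio --name=tracewrite --rw=write --bs=128k --numjobs=1 --size=1M --iodepth=1

Both traces should boil down to a stream of 128 KiB write calls on the data file, which is why I expected comparable numbers.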
To confuse me even further, if I ask fio to create a file with 50% compressibility, the throughput drops to 847 MiB/s. I understand there is CPU work involved in compression, causing a throughput drop, but I was hoping its impact would be neutralised by having nearly half the amount of data to write. Any ideas why the impact is this high?
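To see how much CPU the compression itself eats during the run, I could watch the kernel threads while fio is writing; a rough sketch:

# top -SH

If ZFS write-issue kernel threads (which do the compression) sit near 100% of a core while the disks are far from saturated, the bottleneck is CPU rather than the halved write volume.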
Command used to run fio with 50% compressibility:
fio --name=seqwrite --rw=write --bs=128k --numjobs=1 --size=16G --runtime=60 --iodepth=1 --buffer_compress_percentage=50 --buffer_pattern=0xdeadbeef --group_reporting
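I ran fio from the command line as shown above, with no separate job file; written out as a job file, those options should correspond to roughly this (my own transcription, nothing fio generated):

[seqwrite]
rw=write
bs=128k
numjobs=1
size=16G
runtime=60
iodepth=1
buffer_compress_percentage=50
buffer_pattern=0xdeadbeef
group_reporting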
Comments:

- Do not know about fio, interesting question.
- Have you tried --buffer_compress_percentage=0?
- I tried --buffer_compress_percentage=0 and now I get a throughput of 139 MiB/s. I expected the result to be similar to not asking for buffer_compress_percentage at all (2478 MiB/s), but the results differ wildly.
- --buffer_compress_percentage=0 tells fio to use 100% random data, so the fact that the throughput is about the same as dd if=/dev/random is a good thing. Can you add the contents of your fio job file to your question? It looks like fio is not using /dev/random as its source in your first run, but is instead using highly compressible data.