
I have a CentOS 7 server (kernel 3.10.0-957.12.1.el7.x86_64) with two NVMe disks, set up as follows:

# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1         259:0    0   477G  0 disk
├─nvme0n1p1     259:2    0   511M  0 part  /boot/efi
├─nvme0n1p2     259:4    0  19.5G  0 part
│ └─md2           9:2    0  19.5G  0 raid1 /
├─nvme0n1p3     259:7    0   511M  0 part  [SWAP]
└─nvme0n1p4     259:9    0 456.4G  0 part
  └─data-data   253:0    0 912.8G  0 lvm   /data
nvme1n1         259:1    0   477G  0 disk
├─nvme1n1p1     259:3    0   511M  0 part
├─nvme1n1p2     259:5    0  19.5G  0 part
│ └─md2           9:2    0  19.5G  0 raid1 /
├─nvme1n1p3     259:6    0   511M  0 part  [SWAP]
└─nvme1n1p4     259:8    0 456.4G  0 part
  └─data-data   253:0    0 912.8G  0 lvm   /data

Our monitoring and iostat continually show nvme0n1 and nvme1n1 at 80%+ I/O utilization, while the individual partitions show 0% I/O utilization and the drives have plenty of headroom (250k IOPS, 1 GB/s read/write):

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.14    0.00    3.51    0.00    0.00   89.36

Device:    rrqm/s wrqm/s   r/s    w/s  rkB/s   wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
nvme1n1      0.00   0.00  0.00  50.50   0.00  222.00     8.79     0.73   0.02    0.00    0.02  14.48  73.10
nvme1n1p1    0.00   0.00  0.00   0.00   0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
nvme1n1p2    0.00   0.00  0.00  49.50   0.00  218.00     8.81     0.00   0.02    0.00    0.02   0.01   0.05
nvme1n1p3    0.00   0.00  0.00   0.00   0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
nvme1n1p4    0.00   0.00  0.00   1.00   0.00    4.00     8.00     0.00   0.00    0.00    0.00   0.00   0.00
nvme0n1      0.00   0.00  0.00  49.50   0.00  218.00     8.81     0.73   0.02    0.00    0.02  14.77  73.10
nvme0n1p1    0.00   0.00  0.00   0.00   0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
nvme0n1p2    0.00   0.00  0.00  49.50   0.00  218.00     8.81     0.00   0.02    0.00    0.02   0.01   0.05
nvme0n1p3    0.00   0.00  0.00   0.00   0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
nvme0n1p4    0.00   0.00  0.00   0.00   0.00    0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
md2          0.00   0.00  0.00  48.50   0.00  214.00     8.82     0.00   0.00    0.00    0.00   0.00   0.00
dm-0         0.00   0.00  0.00   1.00   0.00    4.00     8.00     0.00   0.00    0.00    0.00   0.00   0.00

Any ideas what can be the root cause for such behavior?
All seems to be working fine except for monitoring triggering high io alerts.

  • The nvme1n1p2 and nvme0n1p2 are showing activity, and they are the components of your md2 RAID1 device. Perhaps the RAID set is syncing or scrubbing in the background? Look into /proc/mdstat to see the RAID device status. The monitoring results might be a quirk resulting from how the background syncing/scrubbing is implemented in the kernel. If so, it should be basically "soft workload" that will be automatically restricted when there are actual user-space disk I/O operations to do. Commented May 8, 2019 at 6:30
  • @telcoM - seems ok. md2 : active raid1 nvme0n1p2[0] nvme1n1p2[1] and 20478912 blocks [2/2] [UU] Commented May 8, 2019 at 8:01
  • Just to add - this happens on multiple servers. Hosted with the same provider though. Commented May 23, 2019 at 9:24
  • Do you have the scheduler of the NVMEs set to none? access.redhat.com/solutions/3901291 Commented May 25, 2019 at 10:07
  • Sorry, just saw that you need a login for that link. Basically it says that iostat shows the wrong utilization for NVMe drives configured with the none scheduler, which is the default for NVMe drives. It seems to be a bug which has been identified (a quick way to check the scheduler is sketched right after these comments). Commented May 25, 2019 at 13:40
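
A quick way to check the active scheduler, as suggested in the last comment, is to read the sysfs queue attribute of each NVMe disk; on most kernels the active scheduler is shown in square brackets. This is only a sketch, using the device names from the lsblk output above:

# Show the available and active I/O schedulers for each NVMe disk
for dev in nvme0n1 nvme1n1; do
    printf '%s: ' "$dev"
    cat /sys/block/"$dev"/queue/scheduler
done

# With the default configuration described in the comments, the active
# scheduler for the NVMe devices would be reported as "none".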

2 Answers

3

The source of the bogus iostat %util and svctm figures appears to be a kernel bug that will be fixed in kernel-3.10.0-1036.el7 or with the RHEL/CentOS 7.7 release. Devices whose I/O scheduler is set to none are affected, which is the default for NVMe drives.

For reference, there is a Red Hat solution (login required) that describes the bug.
In the CentOS bug report, someone wrote that the issue will be fixed with the above-mentioned kernel/release version.

Changing the scheduler should work around the issue until the new kernel is available. Since the bug seems to affect only the reported metrics and not real performance, another option is to simply ignore those metrics until the new kernel arrives; a sketch of how to check and change the scheduler follows below.
I cannot verify this myself for lack of an NVMe drive; maybe @michal kralik can verify it.
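
As a rough sketch of the scheduler workaround: the scheduler can be changed at runtime through sysfs and persisted with a udev rule. The scheduler name and the rule file name below are only examples; use whichever scheduler your kernel actually lists for the device.

# List the schedulers the kernel offers for the device (the active one is bracketed)
cat /sys/block/nvme0n1/queue/scheduler

# Switch at runtime (replace "deadline" with a scheduler actually listed above)
echo deadline > /sys/block/nvme0n1/queue/scheduler
echo deadline > /sys/block/nvme1n1/queue/scheduler

# Persist across reboots with a udev rule (example file name)
cat > /etc/udev/rules.d/60-nvme-scheduler.rules <<'EOF'
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="deadline"
EOF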

0

I/O directed at the individual logical partitions is remapped by the Linux kernel to the underlying physical device, which is what actually performs the I/O.
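
One way to see this accounting for yourself is to compare the raw counters kept for the whole disk with those kept per partition; the whole-disk line accumulates the I/O remapped from its partitions, and it is the whole-disk "time spent doing I/O" counter that iostat turns into the (here, misleading) %util. A minimal sketch:

# Raw I/O counters for the whole disks and their partitions
# (field meanings are documented in the kernel's Documentation/iostats.txt)
grep -E 'nvme[01]n1' /proc/diskstats

# The same counters per block device in sysfs
cat /sys/block/nvme0n1/stat
cat /sys/block/nvme0n1/nvme0n1p2/stat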
