My problem
AF_NETLINK queries to the kernel intermittently take many seconds to be answered, as in this strace excerpt:
10:42:38.864353 socket(AF_NETLINK, SOCK_RAW|SOCK_CLOEXEC, NETLINK_ROUTE) = 3
10:42:38.864377 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [32768], 4) = 0
10:42:38.864399 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [1048576], 4) = 0
10:42:38.864418 setsockopt(3, SOL_NETLINK, NETLINK_EXT_ACK, [1], 4) = 0
10:42:38.864436 bind(3, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 0
10:42:38.864459 getsockname(3, {sa_family=AF_NETLINK, nl_pid=16296, nl_groups=00000000}, [12]) = 0
10:42:38.864491 sendto(3, {{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1588581759, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 1}}, 40, 0, NULL, 0) = 40
10:42:51.894848 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 2608
Background
I noticed that, every now and then, software would hang while trying to resolve an IP address. Mainly browsers, but also new ssh sessions or anything else that needs DNS.
Using Wireshark I verified that the hang occurred before a DNS query packet was ever sent to the name server, so the nameserver itself isn't the one lagging.
Tracing a few relevant processes showed that, every now and then, the process would first read /etc/resolv.conf, which contains IPv6 addresses:
# Generated by NetworkManager
search example.de otherexample.de
nameserver 192.168.178.1
nameserver 2a02:8070:c19e:b400:xxxx:xxxx:xxxx:xxxx
nameserver fd00::9a9b:cbff:xxxx:xxxx
then read /etc/gai.conf, which contains nothing but comments, and then, as expected, request the list of interfaces over an AF_NETLINK socket.
Most of the time, the sendto and corresponding recvmsg are just a few milliseconds apart, but in some cases, this hangs what feels like forever.
Which made me realize that the problem isn't even DNS-specific: indeed, running `ip a` in a loop would sometimes hang for several seconds as well. So I did just that, stracing each `ip a` and logging the output and the strace to two different files. This shows the problem happens about once per minute and lasts about 12-13 seconds:
10:41:58.561713 sendto(3, {{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1588581719, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 1}}, 40, 0, NULL, 0) = 40
10:41:58.561943 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 2608

10:42:38.864491 sendto(3, {{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1588581759, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 1}}, 40, 0, NULL, 0) = 40
10:42:51.894848 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 2608

10:43:42.269435 sendto(3, {{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1588581823, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 1}}, 40, 0, NULL, 0) = 40
10:43:54.894689 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 2608

10:44:45.276410 sendto(3, {{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1588581886, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 1}}, 40, 0, NULL, 0) = 40
10:44:57.894722 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 2608

10:45:48.273509 sendto(3, {{len=40, type=RTM_GETLINK, flags=NLM_F_REQUEST|NLM_F_DUMP, seq=1588581949, pid=0}, {ifi_family=AF_UNSPEC, ifi_type=ARPHRD_NETROM, ifi_index=0, ifi_flags=0, ifi_change=0}, {{nla_len=8, nla_type=IFLA_EXT_MASK}, 1}}, 40, 0, NULL, 0) = 40
10:46:00.894574 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base=NULL, iov_len=0}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 2608
The first pair is an example of what happens normally; the other pairs show how the problem occurs once per minute and lasts ~12 seconds.
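The measurement loop itself is nothing fancy; a sketch of it in Python (file name and iteration count are arbitrary, and the strace part is omitted here) just times each `ip a` run, so a stalled run shows up as a large elapsed value:

```python
import datetime
import shutil
import subprocess
import time

# Time repeated `ip a` runs and log each one with a timestamp.
# Fall back to `true` if `ip` is not installed, so the sketch
# stays runnable anywhere.
cmd = ["ip", "a"] if shutil.which("ip") else ["true"]

with open("ip_a.log", "w") as log:
    for _ in range(3):  # in practice, loop indefinitely
        start = time.monotonic()
        out = subprocess.run(cmd, capture_output=True, text=True).stdout
        elapsed = time.monotonic() - start
        log.write(f"{datetime.datetime.now()} elapsed={elapsed:.3f}s\n{out}\n")
        time.sleep(1)
```

Running this for a few minutes and grepping the log for large `elapsed=` values reproduces the once-per-minute pattern shown above.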
There is no significant network change during those times; here's an example of the output of `ip a` before and after the first of those pauses:
Mon May 4 10:42:38 CEST 2020
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether a8:5e:45:60:e4:be brd ff:ff:ff:ff:ff:ff
    inet 192.168.178.131/24 brd 192.168.178.255 scope global dynamic noprefixroute enp3s0
       valid_lft 83515sec preferred_lft 83515sec
    inet6 2a02:8070:c19e:b400:bec7:94b4:34f1:86b4/64 scope global dynamic noprefixroute
       valid_lft 7078sec preferred_lft 3478sec
    inet6 fe80::d27:8efd:f696:3c47/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: wlp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether d0:ab:d5:0e:02:09 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.10/24 brd 192.168.10.255 scope global dynamic noprefixroute wlp7s0
       valid_lft 602858sec preferred_lft 602858sec
    inet6 fe80::c694:6683:6353:e9c9/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
4: wlxf4f26d08d54e: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether f4:f2:6d:08:d5:4e brd ff:ff:ff:ff:ff:ff
Mon May 4 10:42:52 CEST 2020
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether a8:5e:45:60:e4:be brd ff:ff:ff:ff:ff:ff
    inet 192.168.178.131/24 brd 192.168.178.255 scope global dynamic noprefixroute enp3s0
       valid_lft 83514sec preferred_lft 83514sec
    inet6 2a02:8070:c19e:b400:bec7:94b4:34f1:86b4/64 scope global dynamic noprefixroute
       valid_lft 7077sec preferred_lft 3477sec
    inet6 fe80::d27:8efd:f696:3c47/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: wlp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether d0:ab:d5:0e:02:09 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.10/24 brd 192.168.10.255 scope global dynamic noprefixroute wlp7s0
       valid_lft 602857sec preferred_lft 602857sec
    inet6 fe80::c694:6683:6353:e9c9/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
4: wlxf4f26d08d54e: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether f4:f2:6d:08:d5:4e brd ff:ff:ff:ff:ff:ff
The question
What could cause the kernel to delay replies to AF_NETLINK/RTM_GETLINK socket calls for several seconds, once per minute?
As far as I know, those calls are handled by the kernel directly, not by some other process (that I could strace for a timeout). Is this correct?
If so, what could make the kernel block on those requests, again and again? How could one debug that?
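For reference, the round trip can be reproduced without `ip` at all. This is a minimal sketch that builds an RTM_GETLINK dump request by hand (without the IFLA_EXT_MASK attribute that `ip` includes, so the request is 32 bytes instead of the 40 shown in the strace) and times the kernel's first reply:

```python
import socket
import struct
import time

# Constants from <linux/netlink.h> and <linux/rtnetlink.h>
NLM_F_REQUEST = 0x01
NLM_F_DUMP = 0x300      # NLM_F_ROOT | NLM_F_MATCH
RTM_GETLINK = 18

def build_getlink_request(seq):
    # struct ifinfomsg: family, pad, type, index, flags, change (16 bytes)
    ifinfomsg = struct.pack("=BxHiII", socket.AF_UNSPEC, 0, 0, 0, 0)
    # struct nlmsghdr: len, type, flags, seq, pid (16 bytes)
    nlmsghdr = struct.pack("=IHHII", 16 + len(ifinfomsg), RTM_GETLINK,
                           NLM_F_REQUEST | NLM_F_DUMP, seq, 0)
    return nlmsghdr + ifinfomsg

def time_getlink():
    """Send one RTM_GETLINK dump request and time the first reply."""
    s = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                      socket.NETLINK_ROUTE)
    try:
        s.bind((0, 0))                  # let the kernel pick our nl_pid
        start = time.monotonic()
        s.send(build_getlink_request(seq=1))
        s.recv(65536)                   # first chunk of the dump
        return time.monotonic() - start
    finally:
        s.close()
```

Calling `time_getlink()` every few seconds and printing the result should show the same once-per-minute multi-second delays, which would confirm the stall is in the netlink round trip itself and not anywhere else in `ip` or the resolver.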