This is a very peculiar issue. I have a SOHO network at home that looks roughly like this:
internet->-|ISPmodem|->-|Firewall/DHCP/DNS/HTTP server|->-intranet where
- the ISP modem is not used as a router at all and is configured as a DMZ, with an open firewall
- my server does the firewall/routing work and at the same time acts as the DHCP server and DNS (BIND) resolver for the whole intranet. Naturally, it is also the intranet's gateway to the internet.
- the intranet includes desktops, laptops, printers, access points, Raspberry Pis, etc., running services such as HTTP servers, MPD (Music Player Daemon) and VNC servers; the list goes on.
While the modem is connected to the internet, resolving any intranet or internet address and pinging it works fine. Additional services work fine too, such as using the FQDN to view the webpage of the HTTP server in my intranet, or using the FQDN of my MPD from an MPC (Music Player Client) to change songs. The same goes for everything else. I say all this to explain why I think I have set up the DHCP/DNS part correctly.
Now, if I have no internet, I would expect all intranet services to remain resolvable during the outage, but this does not seem to be the case. I can still use everything in the intranet if I use IP addresses instead of FQDNs. I can also run nslookup <hostname|FQDN> to resolve the IP address of some server, and I see in tcpdump that my request reached the DNS server and a response was sent back fine. But trying to access the "service" normally simply fails. So if I try to use Firefox to view the webpage of the HTTP server, for example, that fails with a "cannot find page" error. Likewise for MPD, or when I try to ssh to other GNU/Linux IoT or embedded computers. The problem is always that, in each case, the server cannot be found.
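A note on why nslookup and an application can disagree: nslookup sends its query straight to a DNS server, whereas applications typically resolve names through the operating system's resolver (getaddrinfo). The following is a minimal sketch, assuming Python 3 is available on an affected client, that performs the lookup the way an application would, once per address family. The function name resolve_like_an_app is made up for illustration, and the hostname in the usage comment is a hypothetical name in the skails.home zone.

```python
import socket


def resolve_like_an_app(name, port=80):
    """Resolve `name` through the OS resolver, as most applications do.

    Returns a dict with one entry per address family: either a sorted
    list of addresses or a string describing the resolver error.
    """
    results = {}
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        try:
            infos = socket.getaddrinfo(name, port, family, socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the textual address.
            results[label] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            results[label] = f"failed: {exc}"
    return results


# Example (hypothetical hostname):
#   print(resolve_like_an_app("webserver.skails.home"))
```

If this fails during an outage while nslookup against the server succeeds, the difference would lie in the client's resolver path rather than in the DNS server's answer.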
This is very peculiar to me because I would expect all intranet services to keep working: the DNS server is a DNS "master" and is authoritative for the intranet domain.
This becomes even more peculiar because I have been tracking the resolving activity with tcpdump and have realized that no DNS requests are sent to the DNS server when I see those "server not found" error messages! It is as if the clients (all intranet PCs!) have decided that "since we cannot resolve the vast majority of our internet requests, there is no point trying to request anything at all".
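To have a known query to look for in the tcpdump capture, one option is to send a hand-built DNS query directly to the server, bypassing the client's resolver entirely (roughly what nslookup does). This is only a sketch: build_query and ask_server are hypothetical helper names, the packet layout follows RFC 1035, and the server address in the usage comment is assumed to be the one from the skails_dns ACL.

```python
import socket
import struct


def build_query(name, txid=0x1234, qtype=1):
    """Build a minimal DNS query (RFC 1035) with a single question."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A by default) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


def ask_server(server_ip, name, timeout=2.0):
    """Send the query over UDP to server_ip:53.

    Returns the RCODE of the reply (0 means no error), or None if the
    server did not answer in time or the reply was too short to parse.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server_ip, 53))
        try:
            reply, _ = sock.recvfrom(512)
        except socket.timeout:
            return None
    if len(reply) < 12:
        return None
    return reply[3] & 0x0F  # low 4 bits of the flags word hold the RCODE


# Example (assumed server address, hypothetical hostname):
#   print(ask_server("192.168.12.122", "mpd.skails.home"))
```

If this query shows up in tcpdump during the outage while application lookups do not, the clients are evidently able to reach the server and the missing requests are being suppressed before they leave the client.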
I know this sounds silly, but I literally have no idea why no resolving takes place! I cannot even narrow the problem down to, say, the DNS server, some other component, or some access point.
Could you please help? I have attached my named.conf as a starting point. Perhaps some important option is indeed missing. If you think other information would help to draw conclusions, please do not hesitate to ask!
    $ cat /etc/named.conf
    include "/etc/rndc.key";

    acl skails_dns { 192.168.12.122/32; };
    acl outbound_subnet { 192.168.321.0/24; }; //this is the modem subnet side
    acl skails_subnet { 192.168.12.0/24; }; //this is the intranet side
    acl local_host { 127.0.0.1/32; };
    acl local_nets { local_host; skails_subnet; };
    acl other_subnets { outbound_subnet; };
    acl trusted_subnets { local_nets; skails_dns; };
    acl all_my_nets { trusted_subnets; other_subnets; };

    options {
        directory "/var/named";
        /*
         * If there is a firewall between you and name servers you want
         * to talk to, you might need to un-comment the query-source
         * directive below. Previous versions of BIND always asked
         * questions using port 53, but BIND 8.1 uses an unprivileged
         * port by default.
         */
        // query-source address * port 53;
        forwarders {
            192.168.321.72; //modem
            ..... //some other servers (openDNS, google, others)
        };
        // forward first;
        allow-recursion { trusted_subnets; }; //for whom will bind go the extra mile to find the final address
        listen-on { trusted_subnets; }; //ifs on which bind listens for queries
        allow-transfer { trusted_subnets; };
        allow-query { trusted_subnets; };
    };

    controls {
        inet 127.0.0.1 port 953 allow { local_host; } keys { rndc-key; };
    };

    //
    // a caching only nameserver config
    //
    zone "." IN {
        type hint;
        file "caching-example/named.root";
    };

    zone "localhost" IN {
        type master;
        file "caching-example/localhost.zone";
        allow-update { none; };
    };

    zone "0.0.127.in-addr.arpa" IN {
        type master;
        file "caching-example/named.local";
        allow-update { none; };
    };

    zone "skails.home" {
        type master;
        file "/var/named/skails.home.hosts";
        notify yes;
        allow-update { key rndc-key; };
    };

    zone "12.168.192.in-addr.arpa" {
        type master;
        file "/var/named/192.168.12.rev";
        notify yes;
        allow-update { key rndc-key; };
    };

EDIT: As suggested I also tried host <intranet pc name|FQDN> <ip address of server> and this also worked as expected. So on a random intranet pc I:
- can ping, nslookup, host, ssh, wget correctly from the CLI (the CLI works)
- cannot access the Apache server (a simple HTTP webpage)
From a mobile phone I:
- can nslookup
- cannot ping, access HTTP, access MPD or any other intranet services.
The curious thing in all these cases is that no DNS lookup occurs.
EDIT2: the named.root is not empty:
    $ cat /var/named/caching-example/named.root | egrep -v "(;)"
    .                        3600000      NS    A.ROOT-SERVERS.NET.
    A.ROOT-SERVERS.NET.      3600000      A     198.41.0.4
    A.ROOT-SERVERS.NET.      3600000      AAAA  2001:503:ba3e::2:30
    .                        3600000      NS    B.ROOT-SERVERS.NET.
    B.ROOT-SERVERS.NET.      3600000      A     199.9.14.201
    B.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:200::b
    .                        3600000      NS    C.ROOT-SERVERS.NET.
    C.ROOT-SERVERS.NET.      3600000      A     192.33.4.12
    C.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:2::c
    .                        3600000      NS    D.ROOT-SERVERS.NET.
    D.ROOT-SERVERS.NET.      3600000      A     199.7.91.13
    D.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:2d::d
    .                        3600000      NS    E.ROOT-SERVERS.NET.
    E.ROOT-SERVERS.NET.      3600000      A     192.203.230.10
    E.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:a8::e
    .                        3600000      NS    F.ROOT-SERVERS.NET.
    F.ROOT-SERVERS.NET.      3600000      A     192.5.5.241
    F.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:2f::f
    .                        3600000      NS    G.ROOT-SERVERS.NET.
    G.ROOT-SERVERS.NET.      3600000      A     192.112.36.4
    G.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:12::d0d
    .                        3600000      NS    H.ROOT-SERVERS.NET.
    H.ROOT-SERVERS.NET.      3600000      A     198.97.190.53
    H.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:1::53
    .                        3600000      NS    I.ROOT-SERVERS.NET.
    I.ROOT-SERVERS.NET.      3600000      A     192.36.148.17
    I.ROOT-SERVERS.NET.      3600000      AAAA  2001:7fe::53
    .                        3600000      NS    J.ROOT-SERVERS.NET.
    J.ROOT-SERVERS.NET.      3600000      A     192.58.128.30
    J.ROOT-SERVERS.NET.      3600000      AAAA  2001:503:c27::2:30
    .                        3600000      NS    K.ROOT-SERVERS.NET.
    K.ROOT-SERVERS.NET.      3600000      A     193.0.14.129
    K.ROOT-SERVERS.NET.      3600000      AAAA  2001:7fd::1
    .                        3600000      NS    L.ROOT-SERVERS.NET.
    L.ROOT-SERVERS.NET.      3600000      A     199.7.83.42
    L.ROOT-SERVERS.NET.      3600000      AAAA  2001:500:9f::42
    .                        3600000      NS    M.ROOT-SERVERS.NET.
    M.ROOT-SERVERS.NET.      3600000      A     202.12.27.33
    M.ROOT-SERVERS.NET.      3600000      AAAA  2001:dc3::35
host internal.host firewall.ip.address. That should tell you more about the resolution error and where it happens.
named.root contains internet addresses. Nothing will get resolved without internet. You need to solve that. @nass