4

I am studying the OSI Model, and to my understanding: a MAC address is needed to transmit a message to a node within the same physical network or to a switch/router. IP addresses are used for globally unique addressing. If the source host, hostA is in a different physical network than its target host, hostB and the MAC of hostB is unknown, hostA sends a request "what's the MAC paired with this IP?" aka ARP or NDP to its default gateway, which then routes it through the same procedure to hostB, which answers to hostA with its MAC address. This routing is possible because every node knows the MAC address of the next node that it needs to send this request to -> the MAC address is changed with every "node-hop" while the IP address remains.

My question: why does hostA need to know the MAC of hostB? It cannot be used for routing a message to B because of what I just described. correct?

1
  • 2
    Keep reading... Ethernet is layers 1 and 2, IP is layer 3, TCP is layer 4, etc. While most of the world has become Ethernet, it is not the only layer 2 to ever exist, or that still exists. Not to be confused with bridging of Ethernet (L2 within L2)... wifi, xPON, DOCSIS,... Commented Aug 18 at 2:10

5 Answers

10

If the source host, hostA is in a different physical network than its target host, hostB and the MAC of hostB is unknown, hostA sends a request "whats the MAC paired with this IP?"

No, it won't. If the destination is on another network (L2 segment), the routing table indicates the gateway to use and the packet is sent there, by using its MAC address in the encapsulating L2 frame.

In the simplest case, there's just a default route and gateway that everything not 'on link' is sent to.
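To make the "on link or via the gateway" decision concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function name `l2_target` and the example addresses are made up for illustration; a real host consults its full routing table rather than a single interface network.

```python
import ipaddress

def l2_target(dst_ip, iface_network, default_gateway):
    """Return the IP address whose MAC the sender has to resolve (via ARP/NDP)."""
    dst = ipaddress.ip_address(dst_ip)
    net = ipaddress.ip_network(iface_network)
    # On-link destinations are resolved directly.
    # Everything else is handed to the gateway, so only the gateway's MAC matters.
    return dst_ip if dst in net else default_gateway

print(l2_target("192.168.1.20", "192.168.1.0/24", "192.168.1.1"))  # on link
print(l2_target("203.0.113.9", "192.168.1.0/24", "192.168.1.1"))   # via gateway
```

In both cases the destination IP address in the packet stays the same; only the address that gets resolved to a MAC differs.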

why does hostA need to know the MAC of hostB?

It doesn't. MAC addresses outside of the local network are meaningless.

5
  • 1
    Thanks for your answer. My material is flawed then. So: ARP and NDP are used to get the MAC address of a host within the same (!) physical network by using its IP address. But what is the benefit of this? The only difference I see is that by using a MAC address instead of an IP address all other hosts in the same network discard the frame on L1 because they use a different MAC address. So they do not process the frame on higher layers. Did I miss something? Commented Aug 17 at 11:20
  • 2
    You need to use the (local) destination's MAC address because you want your frame (carrying a packet as payload) to go where you need it. Ethernet can transport many more protocols than just IPv4 - also IPv6, IPX, Appletalk, PPPoE, ... See also networkengineering.stackexchange.com/questions/34765/… Commented Aug 17 at 11:28
  • 1
    Thanks again. To check my understanding: If the local network is connected to a switch or router then I could use only the IP address of the target even if it is in the same network. The host would send the frame with the target IP address to the switch/router, which knows the MAC address of the target corresponding to the IP address. Correct? Commented Aug 17 at 11:36
  • A switch (bridge) forwards by MAC address, a router forwards by IP address. Any node in your local subnet is talked to directly (by MAC address, possibly across a switch). Any node in a remote subnet is sent to the according gateway (also by MAC address, possibly across a switch). Commented Aug 17 at 11:53
  • @PinkFlamingos correct - "if not local then send to default gateway to handle" Commented Aug 17 at 23:09
9

So host A has a packet to deliver to host B (identified by its IP address only, that's all the information upper layers like TCP or UDP will provide the IP layer anyway).

  • It starts by looking up B's IP address in its routing table
  • The routing table can tell it a number of things, for instance:
    • Host B is connected directly at the other end of a point-to-point link with no MAC addresses (tunnel, SLIP, PPP, Frame Relay, or ATM VC, etc.). The packet is sent directly on that point-to-point link. No MAC address, no ARP.

    • Host B's IP address is part of a layer 2 network (Ethernet, Token Ring...) connected to host A, which uses MAC addresses for local addressing and switching. Host A needs to find out Host B's MAC address to send the packet to Host B. It uses ARP to find the MAC addresses, and then uses host B's MAC address as destination in the L2 frame.

    • Host B is not connected directly to host A, but the routing table contains an entry which states "traffic to IP B goes through IP C" (in most cases, the default route pointing to the default gateway).

      Now Host A starts the process again with IP C, which should yield one of the first two options. If A and C are connected to a layer 2 network with MAC addressing, it will need C's MAC address. The frame sent will have C's MAC address as destination, but the IP packet inside will still have B's IP address as destination.

      When C receives the packet, it will perform the same process again, going through the routing table to find how to deliver the packet to B.

So host A never needs to know in advance host B's MAC address, and it only needs to learn it (dynamically in the vast majority of cases) if A and B are on the same layer 2 network, which actually needs MAC addresses.
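The three cases above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the `Route` class, the `link_type` labels, the interface names, and the example table are all hypothetical, and real routing tables carry more information (metrics, scopes, and so on).

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass
class Route:
    prefix: str                 # e.g. "192.168.1.0/24"
    iface: str                  # outgoing interface name (hypothetical)
    link_type: str              # "p2p" (no MAC addresses) or "l2" (MAC addressing)
    via: Optional[str] = None   # next-hop IP for indirect routes

def lookup(table, dst_ip):
    """Longest-prefix match of dst_ip against the table."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [r for r in table if dst in ipaddress.ip_network(r.prefix)]
    if not matches:
        raise LookupError(f"no route to {dst_ip}")
    return max(matches, key=lambda r: ipaddress.ip_network(r.prefix).prefixlen)

def resolve(table, dst_ip, depth=0):
    """Return (iface, ip_to_resolve); ip_to_resolve is None when no MAC is needed."""
    if depth > 4:
        raise RuntimeError("next-hop resolution loop")
    route = lookup(table, dst_ip)
    if route.via is not None:
        # Case 3: start again with the next hop's IP instead of the destination's.
        return resolve(table, route.via, depth + 1)
    if route.link_type == "p2p":
        # Case 1: point-to-point link, no MAC address, no ARP.
        return (route.iface, None)
    # Case 2: destination is on a directly connected layer 2 network.
    return (route.iface, dst_ip)

table = [
    Route("10.9.0.2/32", "ppp0", "p2p"),
    Route("192.168.1.0/24", "eth0", "l2"),
    Route("0.0.0.0/0", "eth0", "l2", via="192.168.1.1"),
]

print(resolve(table, "10.9.0.2"))      # point-to-point: nothing to resolve
print(resolve(table, "192.168.1.20"))  # same L2 network: resolve B itself
print(resolve(table, "203.0.113.9"))   # remote: resolve the gateway instead
```

Note that B's MAC address only ever shows up in the second case.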

1
  • 2
    I like this answer better than the other ones because it highlights that parts (or all) of the routing path may not even have a concept of MAC addresses. +1 Commented Aug 18 at 12:53
2

My question: why does hostA need to know the MAC of hostB? it cannot be used for routing a message to B because of what I just described. correct?

That's correct, it does not need to know that at all. If the route has a 'next hop', then the ARP request is actually made for the next hop's IP address – not the final destination's – and the packet is delivered to the 'next hop' (router/gateway) MAC address.

In some rare situations, networks may use "proxy ARP", where the default route has no next hop, so the ARP request is made directly for the destination IP address. In such cases, the router/gateway will reply on behalf of that IP address – but it will still reply with its own MAC address (i.e. offering itself as the next hop).

Either way, the final destination's MAC address (if it even exists!) remains completely irrelevant to the sender, or to any intermediate gateway, all the way until the last gateway which finally does need it to deliver the packet to the recipient.

(Again, assuming the recipient is even connected through a network that uses MAC addresses. It is perfectly possible that the recipient may be on a PPP link, or a tunnel, where only two nodes exist so link-layer addresses aren't used. Links that use different MAC address formats are also worth thinking about, e.g. some link types use 64-bit MACs.)

1

Answer Two: Identifying [Some Of] What's Right

Here, I try to identify some concepts that sound right, and some that don't. For concepts that don't sound right, I might or might not elaborate on what is right.

The initial question poster clearly has some good understanding of the details. (That's good!) Although, as noted, "some" good understanding. That same poster also seemed to be just a bit off on some details. (Asking these questions was a great idea, so that could become clarified more quickly.)

(I'm quoting from both the question, and some of the pre-existing comments by the same poster as the question.)

I'm mainly responding to some aspects that seem wrong to me, so I'm calling them out. One reason I might not offer more clarifying text on some of the topics here is to reduce redundancy: many of the "correct" details are covered in my other answer to this same question, Answer 1: Overviewing How Some Things Work, which focuses more on providing the right details on how things work. (In contrast, this answer focuses more on identifying which parts of the questions/comments seemed to reveal some confusion. So quoted text containing details that seemed inaccurate is more likely to end up in this answer...)

(I do suggest reading over that other posted long answer first, before going through this posted long answer that focuses on some of the other questions/content. Doing so may help make some of this text make a bit more sense, perhaps by filling in some gaps earlier on...)

ARP and NDP handling, and Layer 2's Usefulness

If the source host, hostA is in a different physical network than its target host, hostB and the MAC of hostB is unknown,

Correct, the MAC-48 address of "hostB" is unknown to "HostA". (For the moment here, I'm now just quickly confirming this piece of a comment. This answer is about to elaborate upon this idea some more when looking at a very related forthcoming question.)

default gateway which then rout[e]s it through the same procedure to hostB

That part sounds right.

My question: why does hostA need to know the MAC of hostB?

I sure hope it doesn't, because "host A" never gets the MAC-48 address of a device that is not on the same local subnet (as part of the typical standard networking/routing process). (Since "host A" doesn't get that address, I sure hope that "host A" doesn't need it!) That's not the case if hostB is in the same local subnet as hostA. But the question stated that "hostA is in a different physical network than its target host, hostB".

aka ARP or NDP to its default gateway which then rout[e]s it through the same procedure to hostB which answers to hostA with its MAC address.

Wrong. ARP or NDP will be used to get the Layer 2 address matching a Layer 3 address, if the local device's neighbor cache doesn't have those details in memory already. That may go to the "default gateway", as you specify.

However, then what happens is a new Layer 2 frame (containing an IP packet with the same desired IP address for a "destination IP address") goes to the default gateway. (Note that the destination of this frame is the "default gateway", and so the frame's destination "MAC-48 address" may point to a different device than the embedded IP packet's destination "IP address", and it's perfectly fine that these two destination addresses point to different locations. The Layer 2 frame is just meant to get the packet onto the device that will handle the "next hop", so that short-range trip has a different destination than what might be a longer-range trip to eventually get the IP packet to the desired (potentially distant) IP address.)

Then, "default gateway which then rout[e]s it through the same procedure to hostB" is mostly right, as long as you are understanding that the routed "it" is a version of the desired "IP packet". (The "default gateway" is not routing/extending an ARP/NDP request!) If the "default gateway" doesn't know how to get the IP packet all the way to the desired destination, then it may follow a similar process, including sending an ARP/NDP request for details about the very next hop, so it can make a new "Layer 2" frame (containing the still-mostly-unaltered "IP packet") in order to help get the traffic one more hop closer.
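As a rough illustration of the "new frame per hop, same IP packet inside" idea, here is a small simulation. Everything in it is made up for the example: the function name `frames_along_path`, the dictionary layout, and the MAC and IP addresses. The only change it models to the packet is the TTL decrement that routers perform, which is one reason I say "mostly" unaltered.

```python
def frames_along_path(links, packet):
    """links: one (source MAC, destination MAC) pair per L2 link on the path.

    Returns the frame that would be sent on each link. The packet keeps its
    destination IP the whole way; only the frame around it is rebuilt per hop.
    """
    frames = []
    for hop, (src_mac, dst_mac) in enumerate(links):
        if hop > 0:
            # A router forwarded the packet, so it decremented the TTL.
            packet = dict(packet, ttl=packet["ttl"] - 1)
        frames.append({"src_mac": src_mac, "dst_mac": dst_mac, "packet": packet})
    return frames

# hostA -> router1 -> router2 -> hostB (all addresses are invented)
links = [
    ("02:00:00:00:00:0a", "02:00:00:00:01:01"),  # hostA to router1 (LAN side)
    ("02:00:00:00:01:02", "02:00:00:00:02:01"),  # router1 to router2
    ("02:00:00:00:02:02", "02:00:00:00:00:0b"),  # router2 to hostB
]
packet = {"src_ip": "192.168.1.10", "dst_ip": "198.51.100.7", "ttl": 64}

for frame in frames_along_path(links, packet):
    print(frame["dst_mac"], "carries packet for", frame["packet"]["dst_ip"],
          "ttl", frame["packet"]["ttl"])
```

Only the last frame uses hostB's MAC address, and it is router2 that had to learn it, not hostA.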

[traffic is routed] to hostB which answers to hostA with its MAC address.

No. The ARP/NDP requests only go one hop forward. If hostB gets an ARP/NDP request, which is very possible, that will come from whatever router is located in the hop before hostB, and so the ARP/NDP response will only be sent to that router located just one hop before hostB. None of this will cause hostB to send a response to hostA. (hostB might choose to respond to hostA, e.g. if the payload inside the "IP packet" contains an ICMP/IPv4 request or an ICMPv6 request, and so the response is done to support that received request, or if the payload in the "IP packet" contains a "TCP segment", and so the response is sent in reply to the received TCP details.)

However, [to re-cap,] any ARP/NDP response from a "hostB" destination device will stay on the same subnet as that same "hostB" destination device, and so will not go back to a "hostA" on a different subnet. Also, the act of receiving an IP packet doesn't trigger a response from "hostB" back to "hostA". (Processing the "IP packet" payload details, however, might trigger some sort of response, depending on some details including just what is actually in that "IP packet" payload.)

Now, going back to the first part of the question:

a MAC address is needed to transmit a message to a node within the same physical network

Yes.

or to a switch/router.

This isn't as common.

You may send a "Layer 2" network frame through a "Layer 2" switch (or through a [Layer 3] router since that device also forwards [Layer 2] network frames as needed). (Note: Similar to how a Layer 3 router can handle the task of a Layer 2 switch, a Layer 4 firewall can handle the task of a Layer 3 router (including also handling the task of a Layer 2 switch).) When the "Layer 2" switch receives a "network frame", it will figure out what the desired destination MAC-48 address is, and continue sending that traffic [forward on] in a useful way.

However, as for actually sending traffic "to a switch/router", rather than just going "through a switch/router", that suggests that the switch/router itself is the desired end destination. That's probably not what you were asking about, though. (Sometimes, people might "manage" the configuration of a device like a "managed switch" or a [Layer 3] router. This is most commonly done using protocols, like HTTP, which utilize an IP address on that device. A device doesn't necessarily need to even have an IP address to effectively serve as a "Layer 2" switch.)

There might be some exceptions to this. "MikroTik RouterBOARD" devices may often support a ["Layer 2"-based] protocol called "MAC-Telnet", which is different than the standard Layer 4/5 "Telnet" protocol. The biggest limitation that prevents MAC-Telnet from being more useful may just be that software supporting that protocol is not pre-installed on as many computers/devices. (The next-biggest limitation would be that its "Layer 2" nature tends to limit the traffic's range to devices within a single subnet.)

Traffic not reaching higher layers

The only difference I see is that[,] by using a MAC address instead of an IP address[,] all other hosts in the same network discard the frame on L1 because they use a different MAC address.

No, not Layer one (as quoted above where it says "on L1"). Instead, "all other hosts in the same network discard the frame on" Layer 2 "because they use a different MAC address." (Layer 2, not Layer 1 like what was just quoted above. The word "frame" is a "Layer 2" word. e.g., "Ethernet frame", "Wi-Fi frame".)

Since you said "discard the frame", the word "frame" means we are now talking about Layer 2. Also, the phrase "MAC address" was in the above-quoted text, so that also indicates "Layer 2".
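A simplified sketch of that Layer 2 accept/discard decision might look like this. The function name `nic_accepts` is made up, and the sketch deliberately ignores multicast group addresses, which real network cards also filter on.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

def nic_accepts(frame_dst_mac, own_mac, promiscuous=False):
    """Decide at Layer 2 whether a received frame is passed up the stack."""
    dst = frame_dst_mac.lower()
    # Accept frames addressed to this interface or to everyone on the segment.
    # Anything else is dropped before any Layer 3 code ever looks at it.
    return promiscuous or dst == own_mac.lower() or dst == BROADCAST

own = "02:00:00:00:00:0a"
print(nic_accepts("02:00:00:00:00:0A", own))  # addressed to us
print(nic_accepts(BROADCAST, own))            # broadcast, e.g. an ARP request
print(nic_accepts("02:00:00:00:00:0b", own))  # someone else's frame
```

The promiscuous flag stands in for what packet sniffers do: ask the card to stop filtering so that every frame is handed to software.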

An example of what would actually be a case of "Layer 1" discarding traffic would be something like a cable got cut, and now Layer 1 can't communicate so the traffic gets discarded. Layer 1 discarding would also be something that can happen in a scenario like when a microwave oven emits radiation which interferes with the radio frequencies of Wi-Fi, so Wi-Fi traffic gets discarded. (Hopefully Layer 2 will notice either of these problems. I'm thinking that Ethernet might treat this loss like a collision, so data may be re-transmitted, and the Wi-Fi issue might get resolved when the source of interference stops (when the microwave oven stops cooking). Re-transmitting over a broken cable won't work until the cable gets repaired/replaced.)

discard the frame [...] So they do not process the frame on higher layers.

Since it's unclear if by "higher layers" you really meant Layer 1 (as stated), or meant Layer 2 (where frames get handled), let me address both by just going over some cases where some information doesn't get processed by a higher layer.

If there is a Layer 1 issue on a direct connection network, such as a console port, then the traffic is lost. You may get some jumbled data (basically looking like random bits), and data is not re-transmitted. e.g., the "rollover cable" connection I mention above, and the "phone line" connection I mention above.

An "Ethernet hub" is basically a layer 1 device: as far as I know, it may work by connecting metal together so one cable has an electrical connection to multiple other cables.

Correct. If a frame isn't getting past a lower layer, then the frame's contents don't get processed. If the Layer 2 doesn't pull apart a frame, notice an ARP packet, and respond to ARP, then no ARP response will be given. If the Layer 2 doesn't pull apart a frame, notice an IPv4 packet, and send that packet to the IPv4-handling software (which will be what implements the lower portion of the "TCP/IP stack" of software), then the computer doesn't do any further processing of the IP packet.
(Likewise, if a received IP packet is meant for a different computer's TCP port 80, then a router may help by trying to route the traffic, and a non-routing "host" may just drop the packet. So the local web server listening on port 80 would never get that TCP traffic, because the IP layer didn't extract the TCP traffic and try to get it processed by the local machine's TCP-handling software.)

Some protocols may offer some buffers to check for successful transmission. For instance, an "Ethernet card" may use the Ethernet protocol and have "buffers" ["buffer" memory] built into that hardware. A "network switch" may use a "CAM table" within its memory to match Ethernet addresses to individual RJ-45 ports, and may similarly use a buffer. TCP has some buffers that can store data temporarily until confirmation is received, which is part of how TCP is able to act in a way that people often refer to as "reliable" (as lost data can get re-transmitted).
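The switch's address table mentioned above can be sketched as a simple "learn the source, look up the destination" loop. This is a toy model with invented names (`LearningSwitch`, `receive`); real switches also age out entries, handle VLANs, and so on.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        # Learn: the sender is reachable through the port the frame came in on.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST_MAC or out_port is None:
            # Broadcast or unknown destination: flood to every other port.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            return []  # destination is on the segment the frame came from
        return [out_port]

sw = LearningSwitch([1, 2, 3])
print(sw.receive(1, "02:00:00:00:00:0a", BROADCAST_MAC))        # flooded
print(sw.receive(2, "02:00:00:00:00:0b", "02:00:00:00:00:0a"))  # known: port 1
print(sw.receive(1, "02:00:00:00:00:0a", "02:00:00:00:00:0b"))  # known: port 2
```

Notice that no IP address appears anywhere in this logic, which is the point: a plain Layer 2 switch forwards by MAC address only.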

When an Ethernet device wants to communicate with Layer 1, it may check the Layer 1 status and notice the line is busy. Once the Ethernet device notices the Layer 1 carrier seems up and available (not busy), it may transmit.

If there is a Layer 1 problem, such as multiple devices trying to communicate at once, you may get a "collision". The traffic on Layer 1 is jumbled, with all information lost. The device supporting Layer 2 may realize this, and follow the Layer 2 protocol rules. So the Ethernet devices may send out a "jamming signal" and wait for a random(-ish?) "back-off period" of time, as mandated by Ethernet rules, before re-transmitting the frame which is still in its buffer.
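For the curious, the back-off wait can be sketched like this. It follows my understanding of the classic half-duplex CSMA/CD rule (truncated binary exponential back-off: pick a random number of slot times from a range that doubles with each collision, capped at 2^10, and give up after 16 attempts). The function name and the injectable `rng` parameter are just for illustration, and modern full-duplex switched Ethernet doesn't have collisions at all.

```python
import random

MAX_ATTEMPTS = 16   # after this many collisions the frame is given up on
BACKOFF_CAP = 10    # the range stops doubling after the 10th collision

def backoff_slots(collision_count, rng=random):
    """Number of slot times to wait after the given collision (1-based)."""
    if collision_count >= MAX_ATTEMPTS:
        raise RuntimeError("excessive collisions: frame is dropped")
    k = min(collision_count, BACKOFF_CAP)
    # Uniformly random integer in [0, 2**k - 1].
    return rng.randrange(2 ** k)

for n in (1, 2, 3, 10, 15):
    print(f"after collision {n}: wait {backoff_slots(n)} slot times")
```

The randomness is what keeps two colliding senders from simply colliding again at the same moment.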

When a device supports Layer 3 using IPv6 or IPv4, your data may just be sent and forgotten. The same is true if you rely on UDP as a "Layer 4" protocol, and so sending a layer 4 "datagram" using "UDP" is called "unreliable" just because it doesn't provide any confirmation of the transmission. However, if you use TCP as a "Layer 4" protocol, you get the feature of the protocol being "reliable" by design, because the sending device will store the outgoing traffic in a (Layer 4, TCP) buffer before it tries to send the traffic out Layer 3 (e.g., by using an "IP packet"). A copy of that "TCP segment" remains in the sender's outgoing buffer until that sender receives confirmation from the TCP recipient. If the TCP sender never gets such a confirmation, then the TCP sender will re-send that "TCP segment". (In a sequence of TCP segments, the receiver won't care that the received segment arrives out of order.)
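Here is a toy sketch of that "keep a copy until it is acknowledged" idea. The class name `SendBuffer` and its methods are invented for illustration, and it leaves out nearly everything a real TCP does (timers, windows, sequence number wrap-around); it only shows the cumulative acknowledgment bookkeeping.

```python
class SendBuffer:
    def __init__(self):
        self.unacked = {}  # sequence number of first byte -> payload bytes

    def send(self, seq, data):
        """Remember the segment until it is acknowledged; return the next seq."""
        self.unacked[seq] = data
        return seq + len(data)

    def on_ack(self, ack):
        """Cumulative ACK: every byte before `ack` has arrived at the receiver."""
        done = [s for s, d in self.unacked.items() if s + len(d) <= ack]
        for s in done:
            del self.unacked[s]

    def to_retransmit(self):
        """Segments that would be re-sent if the retransmission timer fired now."""
        return sorted(self.unacked.items())

buf = SendBuffer()
nxt = buf.send(0, b"hello")
nxt = buf.send(nxt, b"world")
buf.on_ack(5)               # receiver confirms bytes 0..4
print(buf.to_retransmit())  # only the second segment is still held
```

UDP and plain IP have no equivalent of this buffer, which is all that "unreliable" means here.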

Details on Using MAC-48

(Back to more details from the main question...)

So: ARP and NDP are used to get the MAC address of a host within the same (!) physical network by using its IP address.

Yes. (If you have a MAC address and want the IPv4 address, RARP, Reverse-ARP, may be used. That isn't done quite so commonly.)

NDP/IPv6 works similarly to ARP/IPv4. I'm used to referring to ARP/IPv4, but do consider NDP to act similar in regards to what I'm about to say.

Switches may take an ARP request and reach out to other segments within the same Layer 2 broadcast domain. A device running IP may respond to ARP requests sent to an Ethernet broadcast address (MAC-48 address FF-FF-FF-FF-FF-FF).

Note that ARP and NDP are unroutable, by design. (Routers won't take an ARP request and send it to another IP network segment. Although, a router might act like a switch, so information may flow through a Layer 3+ router, but that is just because of how the router is acting like a switch at the layer 2 level, not because of how the router is applying Layer 3 logic.)

So what happens if the device is on the same subnet? If a Layer 3+ device uses ARP or NDP, it may pay attention to some Layer 2 MAC-48 addresses.

Quick side note: As you might expect, the network driver may notice incoming traffic on the device's own Layer 2 MAC-48 address. Most Layer 3 devices will then process an "IP packet" to see if the destination address belongs to the same device, and otherwise discard the IP packet. Routers might process an IP packet further, in order to figure out what routing needs to occur.

The network driver might also pay attention to the MAC-48 broadcast address of FF-FF-FF-FF-FF-FF. If it is a supported Layer 2 protocol of ARP or NDP, then the device will check if the request is for one of its layer 3 addresses. If so, then an ARP or NDP response may be given back (to whatever MAC address broadcasted the ARP/NDP request).
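To show what such a broadcast request looks like on the wire, here is a sketch that builds an IPv4-over-Ethernet ARP request with Python's `struct` module and checks whether it asks about a given address. The field layout follows the standard ARP packet format; the helper names (`build_arp_request`, `is_request_for`) and the addresses are made up, and nothing is actually sent on a network.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Ethernet header + ARP body asking: who has target_ip?"""
    ethernet_header = BROADCAST_MAC + sender_mac + struct.pack("!H", ETHERTYPE_ARP)
    arp_body = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                             # hardware type: Ethernet
        0x0800,                        # protocol type: IPv4
        6, 4,                          # MAC length, IPv4 address length
        1,                             # operation: 1 = request
        sender_mac, socket.inet_aton(sender_ip),
        b"\x00" * 6,                   # target MAC: unknown, that's the question
        socket.inet_aton(target_ip),
    )
    return ethernet_header + arp_body

def is_request_for(frame: bytes, my_ip: str) -> bool:
    """True if the frame is an ARP request asking about my_ip."""
    (operation,) = struct.unpack("!H", frame[20:22])
    target_ip = socket.inet_ntoa(frame[38:42])
    return operation == 1 and target_ip == my_ip

frame = build_arp_request(bytes.fromhex("02000000000a"), "192.168.1.10", "192.168.1.20")
print(len(frame), is_request_for(frame, "192.168.1.20"), is_request_for(frame, "192.168.1.99"))
```

Every device on the segment receives this frame because of the broadcast destination, but only the owner of the target IP answers. NDP achieves the same thing differently (ICMPv6 messages sent to multicast addresses), so this layout applies to ARP only.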

Why Use MAC-48

So: ARP and NDP are used to get the MAC address of a host within the same (!) physical network by using its IP address. But what is the benefit of this?

The benefit is... that's how things work.

You need the MAC-48 address to send out a proper Layer 2 frame (e.g. an Ethernet frame or a Wi-Fi frame).

Let's consider if this wasn't done. (Let's analyze what would happen in a non-working way.) Actually, you probably can't even do this. If a network card supports only the layer 2 protocol of Ethernet to communicate with, the "network driver" will only support using Ethernet frames, so the "operating system" doesn't have a way to write an "IP packet" directly onto the wire. But even if a device (like a full-blown computer) somehow theoretically did manage to get its communications components (e.g. the RJ-45 port) to somehow send the bits of a proper "IP packet" directly onto a wire, what would happen if we didn't bother sending a proper Layer 2 frame?

The destination computer wouldn't care because its Layer 3 stack is probably not listening to the Layer 1 communication. The remote Layer 2 device would likely just consider the IP packet to be unsupported noise, most likely to be ignored (or maybe treated like noise seen in a collision, and responding that way).

So the benefit of using ARP and NDP to get the MAC address is that you need to communicate using the Layer 2 frames, just because that is what the receiving computer is going to be listening for.

Now, this may lead to the question: Why does the computer only listen to Ethernet frames, instead of listening for IP packets? Could we just skip the entire Layer 2 process?

Well, the theoretical reasoning of why we use Layer 2 is because it provides some benefit. The Layer 2 communications can help. For instance, if there is a layer 1 problem like a "collision" or even "random"-seeming data disruption caused by electrical interference which disrupts a bit of communication, the layer 2 network buffer may result in a re-transmission so data successfully gets sent.

Now, in theory one might think you could just incorporate such functionality into the Layer 3 software. But then you'd be basically merging the functionality of Layers 2 and 3. In theory, functionality of layers could be mixed together a bit in software. Actually, that is quite commonly done with Layers 5-7. There might also frequently be some merged code in a TCP/IP stack, mixing layers 3 and 4 a bit.

But there's usually a pretty strict boundary separating layers 2 and 3. That way, you can have separate software (e.g. an "Ethernet network driver" or a "Wi-Fi network driver") handling layer 2, and separate options (e.g. IPv6 or IPv4) for layer 3. If you have some detail specific to "Layer 2", e.g. Wi-Fi cards need SSID information and maybe login information to be implemented, that can be handled by the "Layer 2" network driver. A different "Layer 2" network driver, for an Ethernet card, might not need that complexity at all. And, either way, your IPv6 driver won't be affected, because IPv6 is at Layer 3. Your IPv4 driver won't be affected, because IPv4 is at layer 3.

When learning networking, a web browser may communicate using Layer 4 TCP ports, while ping may use ICMP at Layer 3, and a more rarely-implemented protocol called "MAC-Telnet" may communicate using just Layer 2. ARP may expose communication troubles at Layer 2, in which case you know that layers 3 and 4 won't work until the ARP table is able to successfully show the needed Layer 2 address.

For communications at higher layers to work, the communications will need to go up the network stack.

Layer Interaction

Some of the next paragraphs might not show new concepts as much as they simply show some details on how theory gets implemented. I figure some readers might not need these details, while others might find this provides a bit of clarity that helps solidify the concepts. (Especially for those who might not have needed it, this next little bit may feel a bit redundant with the previous paragraphs which discussed the theory a bit more.)

Since different software may communicate using these different layers, it is going to be helpful to familiarize yourself with the differences between Layers 2, 3, and 4. (I show some specific examples in my first answer to this same "question" post, in the section called "Layer-Based Terminology".)

When learning these networking details, it will be helpful for you to think of the layers one through four as being pretty separate, and that each layer only communicates with the layer above and below it. (Granted, that might not be entirely true since some software might try to kind of merge implementation of multiple layers a bit, probably in a way to be a bit more efficient. However, such software should act in a way that is compatible with what would happen if the layers were separated more, so you are likely to have an easier time learning faster if you keep things simple in your mind, by keeping the layers separate, at least mentally. That can help you to focus on only the most applicable, needed layers.)

The upper layer of the TCP/IP stack, Layer 4, handles TCP and UDP. These protocols only communicate with Layer 3 protocols, like IPv6 and IPv4, and upper-level layer(s). (Layers 5 through 7 are often implemented together, so although they have different purposes, they are often not quite as separated as other layers.) TCP and UDP never communicate with Layer 2 protocols like Ethernet or Wi-Fi. They certainly never deal with Layer 1 issues like cable line congestion or airwave collisions. When software requests a new outgoing UDP connection, that is a Layer 4 request. When software requests that the computer "listen" to a specific TCP port, and communicate any traffic received on that port to the software, that is a Layer 4 request.

In the lower half of the TCP/IP stack, Layer 3 protocols like IPv6 and IPv4 happily communicate with Layer 4 protocols like TCP and UDP, as well as Layer 2 protocols like Ethernet or Wi-Fi. However, the portion of software known as the TCP/IP stack typically ignores Layer 1. So if a request comes in using Layer 1 (an Ethernet cable or airwaves using Wi-Fi), the TCP/IP software drivers are just going to ignore the communication.

Because of the details from this previous paragraph, if you send an IP packet by itself over the wire, the receiving computer will typically ignore it.

The network driver which implements Layer 2 communications will happily communicate with layer 3 (the IP-handling software portion of the TCP/IP network stack), and will also pay attention to Layer 1 details (is the connection "up", or does the device report "no carrier"). So it will listen for a Layer 2 frame sent to a supported MAC-48 address, and if it receives such a frame, then it will investigate what is in that frame. It will, for example, perform data verification (see if the Cyclic Redundancy Check ("CRC") data matches). Also, if it is an ARP or NDP request sent to its own MAC-48 address or to a supported Ethernet broadcast address (typically just FF-FF-FF-FF-FF-FF), then that will get handled as noted earlier. On the other hand, if the frame has an IP packet, then the network driver will pull the IP packet out of the frame, and send the IP packet to the lower half of the TCP/IP stack (and disregard the remaining part of the frame).
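That "look inside the frame and hand the contents to the right handler" step can be sketched as below. The EtherType values are the standard ones for IPv4, ARP, and IPv6; the function name `decapsulate` is made up, and the sketch assumes the frame check sequence has already been verified and stripped.

```python
import struct

ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}

def decapsulate(frame: bytes):
    """Split an Ethernet II frame into (payload protocol name, payload bytes)."""
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    payload = frame[14:]
    # The MAC addresses have done their job at this point. Whatever handles the
    # payload (the IP stack, the ARP code) never sees the Layer 2 header.
    return ETHERTYPES.get(ethertype, "unknown"), payload

frame = (
    bytes.fromhex("02000000000b")      # destination MAC (invented)
    + bytes.fromhex("02000000000a")    # source MAC (invented)
    + struct.pack("!H", 0x0800)        # EtherType: IPv4
    + b"...the IP packet bytes..."     # placeholder, not a real packet
)
print(decapsulate(frame))
```

One detail worth knowing: NDP messages are ICMPv6, so they arrive as ordinary IPv6 payloads rather than with an EtherType of their own, unlike ARP.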

If the local network is connected to a switch or router then I could use only the IP address of the target even if it is in the same network.

Not quite right. Here is what is right:

  • If you are connected through a "layer 2"-only switch which isn't doing Layer 3 handling, that device is not going to pay any attention to the IP address in the embedded IP packet. You're relying on Layer 2 at this stage.
  • When using a switch, you are using "Layer 2" frames, so you would use MAC-48 addresses. A "layer 2" switch won't care about the IP address at all.
  • You can also likely use an "IP address" of any device on the same network, because trying to communicate with IP will result in an IP packet being inserted into a "Layer 2" "network frame" that the switch can handle nicely.
  • When using a router, you can use an IP address to any device that is reachable via routing, which could potentially involve somewhere on the other side of the city/nation/planet.

If you're on the same subnet, then you're likely on the same Ethernet broadcast domain, so you can use the Ethernet broadcast address (FF-FF-FF-FF-FF-FF is commonly supported by hardware/drivers) to use ARP so that you can convert a local device's IP address to a local device's MAC-48 address. You'll need that destination device's MAC-48 address to be able to communicate using IP for a device on the local subnet.

Although not just asked about, I will point out that if the remote device is not on the same local subnet, then what happens? The MAC-48 address of the device with the destination IP address won't ever get figured out by the local computer initiating that IP packet, and that's fine. That initiating computer will notice the destination IP address is on a different subnet. It will then take the IP packet, which continues to show the destination IP address (of the very-remote device), and try to send that IP packet in a newly created Layer 2 network frame. To make that frame, the computer initiating this communication will need to identify the MAC-48 address of a "gateway device" which is on the same subnet.

The host would send the frame with the target IP address to the switch/router, which knows the MAC address of the target corresponding to the IP address.

No, no, no. Two things wrong here:

The host would send the frame with the target IP address to the switch

A frame doesn't have a field for an IP address. (Any "IP address" found in a frame is simply part of an "IP packet". Such an "IP packet" gets inserted into a section of the frame which we call the "payload" for the frame.)

Also, the host may be sending "through" the switch, expecting the switch will "switch"-forward the traffic. (That is different than sending "to the switch", which sounds like the switch is the ultimate destination. That's not what you were trying to portray.)

The host would send the [...] target IP address [through] the [...] router, which knows the MAC address of the target corresponding to the IP address.

The near router is only expected to know (or be able to figure out, and then know) the MAC address of the device on the next hop. If multiple hops are needed, then the device on the next hop won't be the "target corresponding to the IP address".

1

Answer #1: Overviewing How Some Things Work

Don't Plan on routing ARP/NDP

When the "Layer 2" frame is processed (by whatever device has the network interface using the "layer 2" frame's destination MAC-48 address), then the contents of that frame will determine how that frame gets responded to. An ARP packet may get responded to using the ARP protocol. An NDP packet may get responded to using the NDP protocol. An IPv4 packet may get routed by having IPv4 routing rules applied. An IPv6 packet may get routed by having IPv6 routing rules applied.
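As a rough sketch of that decision, the following Python function looks at the EtherType of a received Ethernet II frame and reports how the contents would be handled. The function name and return strings are invented for this example, and it assumes the simplest case: no VLAN tag and no IPv6 extension headers in front of ICMPv6.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
ETHERTYPE_IPV6 = 0x86DD
IPPROTO_ICMPV6 = 58             # IPv6 "Next Header" value for ICMPv6
NDP_TYPES = range(133, 138)     # ICMPv6 types 133-137 are used by NDP

def classify_frame(frame: bytes) -> str:
    """Describe how the payload of an untagged Ethernet II frame gets handled."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    payload = frame[14:]
    if ethertype == ETHERTYPE_ARP:
        return "ARP: answered locally with ARP, not routed"
    if ethertype == ETHERTYPE_IPV4:
        return "IPv4: IPv4 routing rules apply"
    if ethertype == ETHERTYPE_IPV6:
        # Offset 6 of the fixed IPv6 header is Next Header; the ICMPv6 type
        # is the first byte after the 40-byte fixed header (simplification).
        if len(payload) > 40 and payload[6] == IPPROTO_ICMPV6 and payload[40] in NDP_TYPES:
            return "NDP: answered locally with NDP, not routed"
        return "IPv6: IPv6 routing rules apply"
    return "unknown EtherType"

# A zero-filled 28-byte body with the ARP EtherType in front of it.
print(classify_frame(bytes(12) + b"\x08\x06" + bytes(28)))
```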

hostA sends a request "whats the MAC paired with this IP?" aka ARP or NDP

Yes.

Then, you say that gets sent

to its default gateway which then rout[e]s it

No. ARP and NDP do not typically get routed.

Descend the Stack by Encapsulating, & Ascend by Decapsulating

So: ARP and NDP are used to get the MAC address of a host within the same (!) physical network by using its IP address. But what is the benefit of this?

If you want to communicate using IP, which is a layer 3 protocol, you'll need to communicate by going through layer 2. Your layer 3 protocol (IP) doesn't communicate to layer 1 directly. This is because...

In theory...

There is a theory that layers only communicate with neighboring layers.

What software actually does (in practice)...

In practice, this is somewhat true (layer 5 is implemented by different software than layer 4, and layer 3 is implemented by different software than layer 2) and very commonly false.

Layers 5-7 are conceptual layers that commonly get handled with software techniques that don't feel very tied to networking, so in practice they may get merged or even entirely skipped over.

Layers 3 and 4 are commonly implemented using Layer 3 IPv6 or Layer 3 IPv4, and Layer 4 TCP or Layer 4 UDP. So, a "TCP/IP stack" implementation (which is commonly built into modern operating systems) might not use one program for handling Layer 4 and a different program for Layer 3, but instead may actually be a software solution that somehow implements these layers in a somewhat blended way. Even so, this is done using behaviors that are compatible with what would happen if Layer 3 and Layer 4 were implemented with different software, so other protocols can still use Layer 3 (e.g. ICMP/IPv4 or ICMPv6/IPv6 used by ping, which does not use TCP or UDP and so does not use a "port" like pure "layer 4" protocols). Also, a less common protocol like the layer 4 protocol of SCTP can communicate using a choice of layer 3 protocols (IPv6 or IPv4).

Layer 2 may commonly be implemented through a combination of actions taken by both software and hardware. The "software" portion of this is a typically-small amount of software called the "network driver" (which communicates with the device's operating system and/or other software running under the device's operating system), and the "hardware" portion of this refers to what the actual hardware does (including any sort of "firmware"/"software" running on a "network card" or similar, like other dedicated circuitry on a motherboard which is rather specific to handling networking).

Layer 1 basically consists of the physical hardware details, whether that is tangible physical (an Ethernet cable or a fiber-optic cable) or intangible physical (airwaves affected by "radio frequency" technology like how Wi-Fi or Bluetooth work, or infrared/whatever).

How this is implemented (actual example, broad details)

To communicate using a Transport layer protocol (commonly either TCP or UDP), a (Layer 4) datagram (using UDP) or a (Layer 4) segment (using TCP) will need to be placed into a (layer 3) Packet. To communicate using IP (either IPv6 or IPv4), a (Layer 3) IP packet will need to get placed into a (Layer 2) frame. Only software focused on the layer 2 frame (the network driver) communicates with hardware.
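The nesting described above can be sketched in Python. This is only an illustration under simplifying assumptions: the function names, ports, addresses and MACs are invented, the UDP checksum is left at zero (which UDP over IPv4 permits), no IP options are used, and the Ethernet frame check sequence is omitted.

```python
import socket
import struct

def checksum16(data: bytes) -> int:
    """Internet checksum: one's-complement sum of 16-bit words, inverted."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp_datagram(src_port: int, dst_port: int, data: bytes) -> bytes:
    """Layer 4: 8-byte UDP header (checksum 0 = not computed) plus data."""
    return struct.pack("!HHHH", src_port, dst_port, 8 + len(data), 0) + data

def ipv4_packet(src_ip: str, dst_ip: str, payload: bytes, protocol: int = 17, ttl: int = 64) -> bytes:
    """Layer 3: 20-byte IPv4 header (no options) wrapped around the payload."""
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload), 0, 0,
                         ttl, protocol, 0,
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    # Fill in the header checksum, which occupies bytes 10-11.
    header = header[:10] + struct.pack("!H", checksum16(header)) + header[12:]
    return header + payload

def ethernet_frame(dst_mac: bytes, src_mac: bytes, payload: bytes, ethertype: int = 0x0800) -> bytes:
    """Layer 2: 14-byte Ethernet II header wrapped around the packet."""
    return dst_mac + src_mac + struct.pack("!H", ethertype) + payload

datagram = udp_datagram(50000, 53, b"hello")
packet = ipv4_packet("192.0.2.10", "198.51.100.7", datagram)
frame = ethernet_frame(bytes.fromhex("001122334455"), bytes.fromhex("66778899aabb"), packet)

# Each layer adds its own header in front of what the layer above produced.
print(len(datagram), len(packet), len(frame))
```

Decapsulation on the receiving side is the reverse: strip 14 bytes to get the packet, then 20 bytes to get the datagram.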

Some Networking Terminology

MAC-48

I tend to use the phrase MAC-48 when writing, a habit from when I was actively educating on this topic. A MAC-48 address is the same as your currently-common type of MAC address, and an EUI-48 address.

Layer-Based Terminology

a switch or router

I'm going to get very nitpicky, and not all network professionals will always like the vocabulary I'm about to give here. But even though these details are not helpful (and may even be detrimental) in some cases, in other scenarios I have found some of these terminology details to be useful.

Some basic functionalities of the layers are:

  • Layer 2: Communicate with neighbors (standard frames)
  • Layer 3: traditional routing (letting subnets/networks communicate with other networks/subnets)
  • Layer 4: port-based behaviors (e.g. getting network traffic sent not only to one remote host, but specifically to one copy of a desired software program. Also, e.g., a lot of firewalling happens by looking at port numbers.)

Even that's not entirely true: besides "Wi-Fi frames" or "token ring" frames or "Ethernet frames", there may be more variations in frames like "jumbo frames" or "VLAN frames". One of the popular ways to implement VLAN frames is using 802.1q. Such a VLAN frame may effectively implement some routing, and using such frames may be faster with some hardware. However, those VLAN frames don't typically directly encapsulate Layer 4 data like port numbers, so there will still be a layer 3 IP packet in the mix.

The way I use terminology is:

  • A "switch" handles Layer 2 network frames
  • A "router" handles Layer 2 network frames and also Layer 3 network packets
  • a "firewall" handles Layer 2 network frames and also Layer 3 network packets and also analyzes Layer 4 ports

Now, the reality is that some devices that are sold as being a "switch" may handle Layer 3 network packets. Personally, I don't like to call such a device a "switch" if it is routing IP packets. I like to call such a device a "router". If ports are affecting decisions, I like to call the device a "firewall". But, a manufacturer like Cisco might label such a device model as being a "switch", even if it can perform routing (and may perform that quite well) and might even be able to do some filtering at the Layer 4 level.

Why does Cisco call such a device a "switch" even if it routes? I believe there are two reasons.

  • Number one, the "switch" might use an ASIC ("application-specific integrated circuit") which is more optimized for "switching" layer two frames. So even though it technically can "route", doing so might be slower than using one of the more expensive devices called a "router". The term "switch" is more about what its design is optimized for, and not about the absolute limit of its capabilities.

  • Second, the "switch" might have some different default configuration behaviors, compared to a device that is called a "router".

The simple fact is that a device marketed as a switch (which I like to think of as "layer 2") might be able to route (using layer 3 protocols) and might even be able to perform some port-based actions (essentially "firewalling"), even though such port-based behaviors would indicate Layer 4.

That's some reality related to a device marketed as a Cisco-branded "switch". This is hardly the only example. Many Wi-Fi wireless access points ("WAPs") might commonly be marketed as being a "router", even if they frequently have established functionality of being able to do some "firewall"-like activities by handling Layer 4.

So, being willing to toss out any conceptions about "switches do stuff at layer 2" and "routers do stuff at layer 3" may be beneficial at times, so you're not getting too hung up about such minutiae when shopping for a relatively low-budget hardware device.

Despite all of what I just said (which may help explain manufacturer terminology), being familiar with the layer-specific terminology is recommended. I think you will find that people may refer to "switching" as an activity done with Layer 2 frames that frequently use MAC-48 addresses. When talking about routing, people will typically be referring to "packets" which use IP addresses.

If you mix up terminology so you're using terms for one layer when discussing a different layer, then your language may even feel a bit "off-putting" as actual experts try to mentally re-categorize what you said into proper details that mimic actual functionality. As a result, your choice of words may end up becoming a clearly-recognized sign of you likely being someone with a bit less experience, and likely less familiarity (which is not desired when you try to present yourself as having a lot of professionally valuable expertise).

So, I encourage adopting the terminology shown in my earlier list (Layer 2 frames get switched, Layer 3 packets get routed, and Layer 4 ports may provide information helpful for firewalls).

Terminology: Network interface

For a device to send an IP packet, it will need to know what destination MAC-48 address to use in a "network frame", and it will need to know which "hardware interface" to use. (When referring to such a [networking hardware] "interface", that term refers to what has traditionally/historically been a separate, easily-removable piece of hardware called a "network card", but now often refers to an individual "ethernet port", or often refers to an individual "wireless connection" that connects using a specific SSID and uses antennas.)

The term "NIC" officially stands for "network interface card", but in practice refers to any sort of "network interface circuitry". Either way, the "NI" in "NIC" stands for "network interface", so if you see the short abbreviation of NIC, you may think of that as referring to a physical connection, like a single Ethernet port, a single port for some fiber-optic communications, or a single collection of a bunch of details related to a Wi-Fi connection like which Wi-Fi protocol to use, which frequency is used, which SSID is used, and login credentials that might be required.

Terminology: Broadcast

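There are (at least) two different types of "broadcasts". This answer mentions "Layer 2" broadcasts, as used when ARP sends to MAC-48 FF-FF-FF-FF-FF-FF. (NDP achieves a similar result by sending to Layer 2 multicast addresses rather than to that broadcast address.)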

There are also "IP broadcast" communications. Those are used when talking about an IP address which is a "broadcast address", or when discussing IP-based "broadcast domain". So when you hear the term "broadcast domain", don't immediately start thinking that is referring to the type of broadcast of a Layer 2 broadcast to a MAC-48. (It's probably talking about broadcast communications being sent to an IP address which is being treated as a "broadcast"-style "IP address".)

Terminology: routing

When people talk about "routing" traffic, that tends to refer to Layer 3 routing by using "IP addresses" (or maybe Layer 4+ details, like firewalls analyzing a destination TCP port number).

When you use a "Layer 2" switch, that will use a "CAM table" (internal to the switch) to try to only send out traffic onto the helpful "hardware interface" ("network port"), and not send traffic out to a "hardware interface" which doesn't get the traffic closer to the desired destination. Conceptually, this is like "routing" the traffic to the desired network interface. However, we typically don't call this "routing". We call it, "switching".
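Here is a toy Python model of that behavior. The class name, port numbers and shortened MAC strings are invented, and real switches add details (entry aging, VLANs) that this sketch ignores. The switch learns which port each source MAC was seen on, and floods only when it doesn't yet know the destination.

```python
BROADCAST = "ff-ff-ff-ff-ff-ff"

class LearningSwitch:
    """Toy Layer 2 switch: learns MAC-to-port mappings in a CAM-like table."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.cam = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the sender's port, then return the ports to forward out of."""
        self.cam[src_mac] = in_port
        out_port = self.cam.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Unknown destination or broadcast: flood everywhere but the source port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the port the frame arrived from
        return [out_port]

switch = LearningSwitch([1, 2, 3])
print(switch.receive(1, "aa", "bb"))  # "bb" not learned yet, so the frame is flooded
print(switch.receive(2, "bb", "aa"))  # "aa" was learned on port 1
print(switch.receive(1, "aa", "bb"))  # "bb" is now known on port 2
```

Notice that no IP address appears anywhere in this logic, which is why we call it "switching" rather than "routing".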

When you get to learning about different "network frame" formats that may support "VLAN" functionality, some traffic may be restricted in a way that feels very much like "routing". However, since this is still being handled by a piece of information which is a type of "network frame", and isn't using an "IP packet" with an "IP address", this will still be called "switching" rather than "routing". (The recommendation here is to: save the term "routing" for Layer 3+ communications.)

Default routing details

Routing can be complex, e.g. multiple routes might be manually added to send traffic through a single "gateway device" for a VPN. However, some of the most common "default routes" tend to get created rather automatically.

Destinations Specifying NICs after IPv6

Now, let's get an easy one out of the way, even if it is less commonly used for many communications. Although routing is often based on some details like what subnets are used on what interfaces, IPv6 has a big exception to that.

If an interface is using IPv6 and that interface is operational (it is marked as being "Up"), then RFC 4291 section 2.8 requires a "Link-Local address", which RFC 4291 section 2.5.6 specifies needs to start with "fe80:0:0:0:". (Actually, section 2.5.6 doesn't even say "fe80:", but "fe80:0:0:0:" is what the documented 10 bits and 54 bits end up turning into.) The subnet size, by this standard, is an IPv6 /64 subnet.

The lack of flexibility caused by that standard's precision means that addresses in the fe80::/64 range won't be able to use subnet size to help determine different priority based on the subnet size used, and won't be able to get much benefit from seeing which subnet is being used on a specific interface, because the same subnet is used on multiple interfaces: that subnet is supported on every interface that is operational with IPv6.

So, some software may support directly specifying a NIC to use. (e.g., my answer about a % in a destination address) For instance, instead of an address like 2001:db8::5e72, you might specify a destination like 2001:db8::5e72%10. That "%10" is a "network identifier". Unix-like systems may use that "network identifier" as a device name (e.g. "eth0" for Ethernet device number zero seen on a Linux-based system, or "ral0" for a device using a specific Ralink-compatible driver on a BSD-based system). For Microsoft Windows, one way to see the "Interface ID" is to use "netstat -nr" and look at the first column of numbers (before the three periods) in the top of the output. (WMI can also be used.)

So, if you use "ping6 fe80::1234:5678:9abc:def0%eth1", then that means using the device named eth1 (which is a "network ID" on the local computer) to communicate with fe80::1234:5678:9abc:def0 (which is an IPv6 address that may very well be getting claimed and used by the remote computer).
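As a small illustration, this Python sketch separates the address from the part after the "%" and confirms the address is link-local. The function name `split_zone` is made up for this example; it does not look up whether an interface called eth1 actually exists on the machine.

```python
import ipaddress

LINK_LOCAL_SUBNET = ipaddress.ip_network("fe80::/64")

def split_zone(destination: str):
    """Split 'address%interface' into (IPv6Address, interface name or None)."""
    address_text, _, zone = destination.partition("%")
    return ipaddress.IPv6Address(address_text), (zone or None)

address, zone = split_zone("fe80::1234:5678:9abc:def0%eth1")

print(address)                       # the remote device's IPv6 address
print(zone)                          # the local interface to send from
print(address.is_link_local)         # link-local addresses need that interface hint
print(address in LINK_LOCAL_SUBNET)  # same /64 exists on every IPv6-enabled interface
```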

Some software might not support using a % and a network identifier at the end of a destination address. Such software might have a different way (like using a command line parameter) to specify the desired NIC, or such software might not be able to effectively communicate with remote addresses on such subnets (at least, unless there is only one IPv6 interface up so there is no related ambiguity).

More common default routes

For each IP address assigned to a network interface, there are also some "bits" that will help determine whether other addresses are part of the same subnet. In IPv6 networks and often with IPv4 networks, this is commonly done by using a specific number of bits used to specify a subnet size. (This number of bits is commonly specified by a CIDR-style "network prefix" which is a slash followed by a number of "network prefix" bits, so /0 through /32 for IPv4, or /0 through /128 for IPv6.) IPv4 relied on using a "subnet mask" for a very long time, so that is also very commonly seen and used. (Ancient standards did support subnet masks with a "network bit" set to one after the first zero; such uncommon subnet masks are not as widely supported anymore.)

Each IP address assigned to a network interface is also given a number of bits, specified by using the CIDR "network prefix" standard or the IPv4 "subnet mask" standard, to determine the size of a subnet that IP address is a part of. A routing rule is then made for each subnet that has an IP address being used by one of the local network interfaces. As multiple routing rules get made like this, typical priority may specify that smaller subnets are considered to be more specific rules, and so they may be given higher priority than bigger subnets.
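A rough Python sketch of that idea follows. The interface names and addresses are invented for illustration, and real operating systems track more per route (metrics, gateways, and so on). One route is derived per interface address, and the most specific (longest-prefix) match wins.

```python
import ipaddress

def connected_routes(interfaces):
    """Derive one (network, interface name) route per configured address."""
    return [(ipaddress.ip_interface(cidr).network, name)
            for name, cidr in interfaces.items()]

def best_route(routes, destination):
    """Return the matching route with the longest prefix, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, name) for net, name in routes
               if net.version == dest.version and dest in net]
    return max(matches, key=lambda route: route[0].prefixlen, default=None)

# Hypothetical host with two interfaces on overlapping subnets.
routes = connected_routes({"eth0": "10.1.2.20/24", "eth1": "10.1.0.5/16"})

print(best_route(routes, "10.1.2.77"))   # inside both subnets; the /24 is more specific
print(best_route(routes, "10.1.9.9"))    # only inside the /16
print(best_route(routes, "192.0.2.1"))   # no connected subnet matches
```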

These automatically-generated routing rules (unlike other routing rules) specify a network interface to use for outgoing communication. If the destination IP address fits within the addresses used by a subnet like this, then the machine will check the neighbor table for a MAC-48 address of the device using the destination IP address. (For IPv4, this is typically called the "ARP table", visible by running "arp -a".) If the neighbor table doesn't have cached information for that destination IP address, then ARP/IPv4 or NDP is used to get that destination MAC-48 address. Naturally, the "network interface" ("network card"/RJ45 port/antenna(s)) to use will be the interface with the subnet which contains both the destination IP address and the source IP address.

Another process can create a routing rule, and a very common one is this: when an automatic [IP] "address assignment" protocol (like DHCP/IPv4) is used, it may provide some routing details. Most commonly, a "default gateway", also known as a "gateway of last resort", may be provided to cover any traffic that isn't already handled by a more specific routing rule. The IP address of this "default gateway" device will be used as the next hop for any traffic matching IPv4 0.0.0.0/0 and/or IPv6 ::/0, meaning this can be used to handle any and all traffic which doesn't have a more-specific route (like an address which fits in one of the subnets that also has a local IP address on it).

So, if a destination address is not on a local subnet, then a more specific route isn't used, which means that the destination MAC-48 address will be the MAC-48 address which corresponds to the gateway device's IP address. If the neighbor cache doesn't show that MAC-48 address, then an ARP/IPv4 or NDP request is sent out to try to figure out that MAC-48 address.

Note that the IP packet's "destination IP" address never gets changed during this routing process. The IP address for a "default gateway" is only used for two purposes, the first of which is to help determine the Layer 2 frame's desired MAC-48 address. (The second purpose is to identify the outgoing network interface.)
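The following Python sketch illustrates that split. The function names, addresses and neighbor-cache contents are invented, and a single default gateway stands in for the whole routing table. The packet keeps the final destination IP, while the frame's destination MAC belongs to whichever device is the next hop.

```python
import ipaddress

def choose_next_hop(dest_ip, on_link_networks, default_gateway):
    """Return the IP whose MAC is needed: the destination itself if it is
    on a local subnet, otherwise the default gateway."""
    dest = ipaddress.ip_address(dest_ip)
    for network in on_link_networks:
        if dest in ipaddress.ip_network(network):
            return dest_ip
    return default_gateway

def plan_frame(dest_ip, on_link_networks, default_gateway, neighbor_cache):
    """Show which addresses end up in the packet and in the frame."""
    hop_ip = choose_next_hop(dest_ip, on_link_networks, default_gateway)
    return {
        "packet_dst_ip": dest_ip,                     # never rewritten by routing
        "resolve_ip": hop_ip,                         # what ARP/NDP would ask about
        "frame_dst_mac": neighbor_cache.get(hop_ip),  # None means "send ARP/NDP first"
    }

# Hypothetical neighbor cache: the gateway and one local host are already known.
neighbor_cache = {"192.168.1.1": "00-11-22-33-44-55",
                  "192.168.1.30": "66-77-88-99-aa-bb"}

local = plan_frame("192.168.1.30", ["192.168.1.0/24"], "192.168.1.1", neighbor_cache)
remote = plan_frame("203.0.113.9", ["192.168.1.0/24"], "192.168.1.1", neighbor_cache)
print(local)
print(remote)
```

For the remote destination, the sender never learns (or needs) the MAC-48 address of 203.0.113.9; it only needs the gateway's.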

Those are the only NDP or ARP requests that will get used by the initiating computer. (If additional NDP or ARP requests need to get made, those will be initiated by a later computer.)

Identifying the outgoing interface

Now that the device has the destination MAC-48 address, it can create a Layer 2 frame to the next hop, which is either the ultimate destination or a "gateway" device. Sometimes, a process may have already identified which outgoing network interface to use (e.g., using a % in a destination with an IPv6 address might have this result). If the outgoing network interface hasn't been determined yet, then the IP address of the computer using that "next hop" can be compared to subnets to help determine which network interface to use.

With the destination MAC-48 address known, and the outgoing hardware interface identified, the machine has the information it needs to go ahead and send a Layer 2 frame to the next hop.

Layer 2's Usefulness

The reason why we tend to look up a MAC-48 address (using ARP or NDP) is because MAC-48 addresses are part of the design of some popular styles of "network frames". For example, Ethernet uses a MAC-48 address. (So, this is what you will be using if you are using a device called an "Ethernet card", or using something called an "Ethernet port" built onto a motherboard.) Another popular Layer 2 protocol is Wi-Fi, and it is also designed to use a MAC-48 address. (If you tried to use a variation which didn't use a MAC-48 address, then that wouldn't be following the existing Wi-Fi standards that have standardized on that design.) You likely won't ever use "Token Ring" anymore because equipment supporting the "Token Ring" protocol is older, and therefore slower, but if you did happen to come across that, I will note that Token Ring equipment uses a MAC-48 address.

Why do all these protocols bother with specifying a MAC-48 address? Well, let's consider what communication would be like if a MAC-48 address (or something similar) were not being used. There has been some technology that has operated that way (although at least some of this may be pretty old), so considering that can help reveal what life would be like without the benefit of MAC-48 addresses. Consider these examples:

  • a "console" connection where you plug a "serial cable" into a serial port on your computer (e.g. a "DE-9" port, often called a "DB9" port), the cable is wired with "rollover" wiring, and the other end of the cable plugs into an RJ45 "console" port on the remote device. So then when you type, the other end instantly gets the signal, and you can instantly see whatever the other end is replying.

  • a PSTN (public switched telephone network, a.k.a. old-school "dial-up" phone-based) connection, where your computer has an internal dial-up modem with an RJ-11 port, and you use a telephone cable to connect that port to another RJ-11 port (which historically were commonly found on a wall, similar to an electric outlet). Then, to communicate anywhere, you instruct your modem to "pick up" the phone line connection, and "dial" a number, so then tones are emitted which cause the phone company to perform switching and figure out who you go to. In this case, your layer 2 "address" is commonly called a "phone number". (In modern days, just as we sometimes use wireless e.g. Wi-Fi instead of Ethernet, we tend to instead use wireless phones to communicate to towers, but still use the same old "phone number" addresses, just like how Ethernet and Token Ring and Wi-Fi all use the same MAC-48 addresses.)

So, you don't need layer 2 at all (e.g. with the console connection), and you don't need a layer 2 which uses a MAC-48 address (e.g. PSTN), but if you do use a Layer 2 protocol that requires MAC-48 addresses (Ethernet, Wi-Fi, Token Ring), then you'll be wanting to use MAC-48 addresses because that is what your protocol requires.

So, what your software will actually be using is probably a "driver", e.g. a driver for an operating system like Microsoft Windows. Guess what Layer 2 protocol is supported by a "driver" for an Ethernet card? (Hint: the answer is what you would get if you tried to spell "tenrehtE" backwards.) Guess what Layer 2 protocol is supported by a "driver" for a Wi-Fi card? (Hint: the answer is what you would get if you tried to spell "iF-iW" backwards.)

Globally routed traffic

IP addresses are used for globally unique addressing.

Wrong. Well, umm, kind of right, somewhat.

What do you mean by "globally unique"? I guess that while the term itself may be just fine, in some contexts, I don't like that term as much when trying to describe an IP address. Each individual address is "unique". But is it what you mean when you say, "globally unique"?

I can assign any IP address I want to any device on my network. Does most of the world think that address should belong to a major company like Google? Doesn't matter! I can assign that address. Now my device has the same address as Google's. (Yes, of course, that may be problematic, as I get to in just a moment.) I would, therefore, say that the address is not "globally unique", because it is being used by both me and Google.

Now, just because my computer has Google's address, doesn't mean that the computer across the room will send the traffic in my direction. The computer across the room will rely on its routing information, stored in what is called the "routing table" of that device.

Routing tables can be updated manually. They are also commonly updated using an "IP address assignment" protocol, like DHCP/IPv4 handing out an IP address that gets used as a "default gateway". There are other "routing protocols" like OSPF which can update routing tables. (RIPv2 might be a bit of a simpler design, so worth studying first to more quickly understand how a working routing protocol functions, but then OSPF may provide some advantages with updates being propagated more quickly.)

If I were to run a "DHCP server", I might be able to get that remote computer to send information to the computer I am using. Then, my computer can receive traffic meant for Google. If I'm doing this properly, by design, honestly handling the traffic, then maybe I'm running a legitimate server that performs routing functions. If I am doing that maliciously, I may be performing a "man-in-the-middle" ("MITM") attack. If I am running a DHCP server that isn't authorized/intended by the network administrator, I may be using a "rogue DHCP server". In theory, I could try to pretend to be Google when responding to an HTTP connection. Even if I did that, the remote end might not trust my communications, noticing HSTS pinning and me lacking a trustworthy certificate. (If I ran an "Active Directory" server and could get your machine running "Microsoft Windows Pro" to trust my machine about what certificates to trust, I might be able to mimic some websites. Although, I've heard that some of the world's most popular websites might actually have some cert details built into popular web browsers, thereby thwarting even that attack.)

So, I can't necessarily say that IP assignment is "globally unique". But what an IP address might be is "globally routable". If I am a local network administrator, I may be able to totally tinker with routing and cause all sorts of confusion on my local network. But my ISP(s) may have routing rules coming from more trustworthy sources, and just ignore any nonsense I might come up with on my local network. So, the chaos on my local network won't affect another customer of the same ISP. On a global scale, my local attack won't affect much. So, for the most part, the IP address is "globally routable".

There are two versions of IP addresses: IPv6 and IPv4. They have some similarities, so (commonly, including throughout this post) when I speak of "IP" addresses, the information applies to both IPv6 and IPv4.

If you have a typical Unicast public IP address, then that is likely to be pretty "globally routable".

If you have a "private" Unicast address, then that is not likely to be "globally" routable. Hopefully it is unique and routable on your local network segment. (If it is not unique on your local network segment, you have a duplicate IP address, and those can be problematic.)

The ranges of such private IPv4 addresses are in RFC 1597 (more famously "RFC 1918"... if there's one RFC number to memorize, it is that, because some documentation may refer to an "RFC 1918" address, so it is good to understand what that is referring to). The ranges of such IPv6 addresses are fc00::/7, which consists of fc00::/8 and fd00::/8. fc00::/8 addresses were centrally managed and that centralized resource doesn't hand out addresses any more, so standards-compliant behavior is to not use that for new setups (although I actually doubt you're likely to have a problem if you did use it), and fd00::/8 is where modern IPv6 addresses come from.
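For reference, the three RFC 1918 IPv4 blocks are 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. A small Python sketch can test whether an address falls in those blocks or in fc00::/7. The function name is made up, and the list deliberately covers only these ranges, not every other special-purpose block.

```python
import ipaddress

# RFC 1918 private IPv4 blocks, plus the IPv6 range mentioned above.
PRIVATE_RANGES = [ipaddress.ip_network(n) for n in
                  ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "fc00::/7")]

def is_private_unicast(address: str) -> bool:
    """True if the address is inside one of the ranges listed above."""
    addr = ipaddress.ip_address(address)
    return any(addr.version == net.version and addr in net
               for net in PRIVATE_RANGES)

# fc00::/7 splits into exactly two /8 halves.
halves = list(ipaddress.ip_network("fc00::/7").subnets(prefixlen_diff=1))
print(halves)

for candidate in ("10.1.2.3", "172.32.0.1", "192.168.1.1", "fd12:3456::1", "2001:db8::1"):
    print(candidate, is_private_unicast(candidate))
```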

Additional discussion of traffic handling

(You may wish to look over: my comment to a question about how traffic may be routed, and my answer to the same question: my answer goes into some details about how communications get handled with the various steps.)
