Traceroute Lies! A Typical Misinterpretation Of Output

Sometimes a user with performance issues will proudly present me with a traceroute and point to a particular hop in the network and accuse it of being the problem because of high latency on the link. About 1 time in 1000 they are correct and the link is totally saturated. The other 999 times, well, let me explain.

Traceroute Output

Here’s a typical traceroute I might be sent by a user (IPs and hostnames are altered to protect the innocent):

“Look!” the user cries, “The link from atl-edge to ga-core is clearly all messed up because the latency goes from 20ms to 106ms!”

Oh No It Doesn’t

Isn’t it amazing that the link in question apparently adds 90ms of latency, yet the link between hops 6 and 7 (the jump from east coast USA to the United Kingdom) appears to show no latency increase at all? In fact, isn’t it odd that the latency for every hop from 3 onwards is about the same?

I know that many people reading this will already know why this is, but for those who do not (and there’s no shame in that), this is indicative of there being an MPLS network in the path, and the MPLS Provider Edge (PE) is the router at hop 2.
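The "every hop from 3 onwards looks about the same" pattern can be checked mechanically. The following is a minimal sketch, not a definitive detector: the function name, the tolerance, and the minimum run length are assumptions made for illustration, and the RTT values are hypothetical apart from the 20ms and 106ms figures quoted above.

```python
def find_latency_plateau(rtts_ms, tolerance_ms=10.0, min_hops=3):
    """Return the 1-based hop where a flat run of RTTs reaching the final hop begins.

    Returns None if the trailing flat run is shorter than min_hops.
    A plateau is only a hint that an MPLS core may be in the path, not proof.
    """
    if len(rtts_ms) < min_hops:
        return None
    final = rtts_ms[-1]
    start = len(rtts_ms)
    # Walk backwards from the last hop while RTTs stay close to the final RTT.
    for i in range(len(rtts_ms) - 1, -1, -1):
        if abs(rtts_ms[i] - final) <= tolerance_ms:
            start = i
        else:
            break
    if len(rtts_ms) - start < min_hops:
        return None
    return start + 1


# Hypothetical per-hop RTTs in ms: hop 2 at 20ms, hop 3 onwards around 106ms.
hypothetical_rtts = [1.0, 20.0, 106.0, 105.0, 107.0, 106.0, 108.0, 107.0]
print(find_latency_plateau(hypothetical_rtts))  # 3
```

If the plateau starts at hop N, the hop just before it (hop 2 here) is a reasonable candidate for the ingress PE, and the jump into the plateau says little about the link between those two hops.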

Why?

Remember that one of the benefits of MPLS networks is that the network core (the Provider, or P routers) doesn’t have to know anything about the routes at the edge. The two things the P routers need to know are 1) where all the other MPLS-capable routers are (usually via OSPF or IS-IS) and 2) where to forward incoming MPLS frames based on the incoming labels. They are relatively dumb switches, which is what allows them to move traffic around faster than a native IP router could. So what’s the problem?

MPLS Network

Traceroute relies on sending packets with an incrementing TTL; when the TTL expires, the router on which it expires will usually send back an ICMP message to the sender warning that the TTL expired in transit, and that’s how traceroute finds out about each hop in the network. Looking at the MPLS diagram above, what happens when the TTL expires on a P router? The P routers have no knowledge of the edge networks, so how could one of them route an ICMP packet back to a source it doesn’t know about? MPLS labels are one-way to the destination and there’s no return path included, so the P router does the only thing it can: it snags the outgoing label it was going to use and creates a new MPLS frame containing the ICMP TTL Expired message, and this frame gets switched all the way to the destination PE router (PE-B in this case).

PE-B receives the frame, looks at the ICMP message within it and looks at the destination address, which is my PC. As a PE router, it knows how to get to my PC (which label to use to send it into the MPLS network again), packages the ICMP packet up inside MPLS and sends it back into the MPLS network.

In other words, any ICMP TTL Expired messages generated within the MPLS network actually flow to the far side of the MPLS network and then back again, which is why they all show a similar latency, and why in this example all the latencies are large (because in this case they would have to cross from the US to the UK and then from the UK back to the US in order to get back to my PC):

MPLS - TTL Expired
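The effect on the reported latency can be reproduced with a little arithmetic. This is a sketch under stated assumptions: link latencies are symmetric, routers add no processing delay, and the per-link values below are hypothetical, chosen only so that the ingress PE (hop 2) shows 20ms and the far-side PE shows 106ms as in the example. The function name and hop numbering are mine.

```python
from itertools import accumulate


def apparent_rtts(link_ms, ingress_pe, egress_pe):
    """Model the RTT traceroute would report for each hop.

    link_ms[i] is the one-way latency (ms) of the link ending at hop i + 1.
    ingress_pe and egress_pe are 1-based hop numbers of the two PE routers.
    """
    one_way = list(accumulate(link_ms))  # one_way[i] = source to hop i + 1
    rtts = []
    for hop in range(1, len(link_ms) + 1):
        if ingress_pe < hop < egress_pe:
            # P router: the ICMP reply is label-switched on to the egress PE
            # and only then sent back, so the total equals the egress PE's RTT.
            rtts.append(2 * one_way[egress_pe - 1])
        else:
            # Routers that know the route back reply directly.
            rtts.append(2 * one_way[hop - 1])
    return rtts


# Hypothetical one-way link latencies in ms (hop 7 is the transatlantic link).
LINK_MS = [1, 9, 2, 3, 2, 1, 34, 1]

print(apparent_rtts(LINK_MS, ingress_pe=2, egress_pe=8))
# [2, 20, 106, 106, 106, 106, 106, 106]
print([2 * d for d in accumulate(LINK_MS)])  # direct round trip to each hop
# [2, 20, 24, 30, 34, 36, 104, 106]
```

In this model the direct round trip to hop 3 would be 24ms, yet traceroute reports 106ms, because the reply has crossed the ocean and come back before it reaches the sender.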

If you’ve not seen this before it can be very confusing. As a result I’ve seen time wasted on troubleshooting links which actually have no problems, all thanks to traceroute.

Side note: Not all MPLS networks will copy the incoming packet’s TTL into the MPLS header, so the TTL will not always expire in the middle of the MPLS network. An MPLS network may therefore appear to traceroute as a single hop, so insight will not always be available into the internal nodes in an MPLS network.

5 Comments on Traceroute Lies! A Typical Misinterpretation Of Output

  1. Nice article. Never knew about such behaviour.

    Wouldn’t TTL increment happen only when there are passing hops/routers (aka Broadcast Domains or L3 Boundaries). And typically when TTL expires, that means you’ve reached the max. possible hops on the path for that packet.

    Correlating this to your explanation:
    1) This means in this case, us-ga-core did not have the hop/route to propagate forward. This is acceptable for a switch. So why would a switch attempt to decrement TTL when it is not crossing broadcast domain?
    2) Can’t we simply do a traceroute with a higher TTL in the packet if we see this behaviour?

    • Hi Hemant. Typically when TTL expires, you’ve reached the maximum possible hops (this is really for loop prevention), yes; but traceroute intentionally sends packets with a low hop count (starting at TTL=1 and incrementing by 1 each time) in order to trigger ICMP TTL Expired messages from routers along the path, thus revealing the path(s) to you in the process.

      1) us-ga-core doesn’t have a route back to the source of the expired packet because it’s an MPLS P-router. All P-routers know about is how to get to other P routers and the PE routers. Even then, while it’s a router, it’s running MPLS so it’s “switching” MPLS packets based on the MPLS header rather than the embedded packet’s destination IP.

      2) Not sure how this would help. You *need* TTL to expire in order to get the ICMP response. Raising TTL would simply mean you missed finding out about hops along the path.
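
The probing behaviour described in this reply can be illustrated with a toy simulation. It is only a sketch: no packets are sent, the path is a plain list, and the router names other than PE-B are hypothetical.

```python
def send_probe(path, ttl):
    """Simulate one probe along path (a list of routers ending at the destination).

    Each router counts as one hop; the router where the TTL runs out replies
    with "time-exceeded". A probe that survives every router is "reached".
    """
    *routers, destination = path
    for hop_number, router in enumerate(routers, start=1):
        if hop_number == ttl:
            return router, "time-exceeded"
    return destination, "reached"


def traceroute(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... and record who answers each one."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, kind = send_probe(path, ttl)
        hops.append((ttl, responder))
        if kind == "reached":
            break
    return hops


# Hypothetical path; only PE-B is a name taken from the article.
PATH = ["gateway", "PE-A", "P1", "P2", "PE-B", "server"]

for ttl, responder in traceroute(PATH):
    print(ttl, responder)

# A single probe with a large TTL reveals none of the intermediate hops.
print(send_probe(PATH, 64))  # ('server', 'reached')
```

Each low-TTL probe exposes exactly one router, which is why a higher starting TTL only skips hops rather than working around the MPLS behaviour.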
