Communications Insider: March 2009: Volume 4 Issue 1

Switched Ethernet Latency

Ethernet has challenged certain alternative network technologies because of its high data throughput rates. Despite its throughput advantages, however, Ethernet has a key shortcoming: the exact time it takes any single packet to traverse the network cannot be predicted.

Unpredictable latency is not acceptable for certain types of applications where message delivery must occur within a specific time limit. So the question becomes: if we cannot know the actual time, can we calculate the maximum time a packet will need to travel from one end of the network to the other? Fortunately, the factors that influence latency in a switched Ethernet network are well known, so it is possible to perform this analysis.

Worst Case Latency Analysis
To construct the worst case scenario, the configuration that results in the highest latency, let’s assume the network topology shown in Figure 1. Four devices, A – D, communicate via a single switch:

  • Device A is a main controller
  • Devices B and C send large (1518 Byte) frames at high data rates to Device A (assume they are video cameras streaming data). Traffic from B and C to A has low priority.
  • Device D is a latency-critical controller that sends 1000 Byte packets every 100ms to Device A. Traffic from D to A has high priority.

In this example, latency is the time from when the first packet bit is clocked out on the Ethernet transmit link of Device D until the last bit of the same packet is clocked in on the receive link of Device A. This analysis deals only with the Ethernet network and does not include time required by operating systems on Device D and Device A.

Note: In this example, the switch is a Gigabit Ethernet switch, but the links are established at 100Mbit/sec. This will influence the internal switch capacity and the maximum data/packet rate each port can sustain. Abbreviations: B=Byte, b=bit, ms=millisecond, us=microsecond.

Using Figure 1, with a 24-port switch capable of line rate performance, or 48Gb/s throughput, the latency can be calculated as follows (a short script that reproduces these figures follows the step-by-step calculation):

  1. Time to transmit all the bits from Device D to the input FIFO of Switch Port 24:
    (1000B(packet) + 20B(preamble, inter-frame gap) )*8b / 100Mb/s = 81.6 us
  2. Time to write packet data into switch memory:
    In a worst case scenario, all of the other 23 ports have just received large 1518B packets and are ahead of us in writing them into memory. In addition, all 24 ports are waiting to read a max size 1518B packet from memory. At the switch’s 48Gb/s throughput:
    ( 23*1518B*8b + 24*1518B*8b ) / 48Gb/s = 11.89 us

    To write our critical packet into memory will take an additional:
    1000B*8b / 48Gb/s = 0.167 us

    So the total time to store the packet will be:
    11.89 us + 0.167 us = 12.06 us

    NOTE: We have assumed that all pending packets are also of high priority, which is typically not the case. In a more realistic case, the waiting time for memory access is reduced to roughly the time it takes to write a single packet.

     
 
  3. Time to perform a lookup:
    The address lookup engine operates in parallel and does not add extra time.

  4. Time to read packet data from memory, which is the same as the time to write (see Step 2):

    Total time to read the packet: 11.89 us + 0.167 us = 12.06 us

  5. Time to transmit the packet:
    In this step we have to consider how full the output queue might be. In a worst case, let us assume that there are 23 other high priority packets waiting in the queue ahead of ours.

    Time to wait: 23*(1000B + 20B (preamble, IPG) )*8b / 100Mb/s = 1876.8 us

    Time to transmit: (1000B + 20B (preamble, IPG) )*8b / 100Mb/s = 81.6 us

    Total time: 1876.8 us + 81.6 us = 1958.4 us

    NOTE: Waiting time in the output queue can be by far the largest contributor to the overall latency, so it is important to establish accurately how many independent devices could potentially be sending high priority packets to the same destination device. If, for example, there are only 10 such devices, instead of 24 as assumed in the calculation above, the waiting time is reduced to: 9*(1000B + 20B (preamble, IPG) )*8b / 100Mb/s = 734.4 us

Putting it all together we get: 81.6 us + 12.06 us + 12.06 us + 1958.4 us = 2064 us, or about 2.06 ms
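
To make the numbers easy to re-check, the worst case arithmetic can be captured in a short, self-contained calculation. The sketch below is written in Python; the constants come from the example above, while the function names (wire_us, memory_us, worst_case_us) are purely illustrative:

    LINK_BPS   = 100e6   # port link rate (100 Mbit/s)
    SWITCH_BPS = 48e9    # internal switch throughput (24 ports at line rate)
    OVERHEAD_B = 20      # preamble + inter-frame gap, in bytes
    MAX_FRAME  = 1518    # maximum frame size, in bytes
    PKT        = 1000    # latency-critical packet from Device D, in bytes

    def wire_us(frame_bytes):
        """Time to clock one frame (plus preamble/IFG) onto a 100Mb/s link, in us."""
        return (frame_bytes + OVERHEAD_B) * 8 / LINK_BPS * 1e6

    def memory_us():
        """Worst case time to store (or read) the packet: 23 pending writes plus
        24 pending reads of max size frames, then our 1000B packet."""
        return (47 * MAX_FRAME * 8 + PKT * 8) / SWITCH_BPS * 1e6

    def worst_case_us(queued_packets=23):
        ingress = wire_us(PKT)                          # Step 1: 81.6 us
        store   = memory_us()                           # Step 2: ~12.06 us
        read    = memory_us()                           # Step 4: ~12.06 us
        egress  = (queued_packets + 1) * wire_us(PKT)   # Step 5: queue wait + transmit
        return ingress + store + read + egress

    print(worst_case_us(23))   # ~2064 us, full 24-port contention
    print(worst_case_us(9))    # ~922 us, when only 10 devices send high priority traffic

Changing queued_packets makes it straightforward to explore how the dominant output-queue term scales with the number of contending senders.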
 

Conclusion
The calculations shown here give an example of how worst case latency in a Switched Ethernet network can be estimated. There are a few other things, however, that are important to note:

  1. Ethernet is a Best Effort technology. Switches will drop packets as soon as their memory and FIFOs fill up, hence it is important to plan for lower than maximum network utilization.
     
  2. TCP and UDP protocols used over Ethernet are notorious for generating bursty traffic patterns. Even when a network is underutilized, if every network device starts sending a large number of packets at once, the switch packet memory might overflow. Therefore, network design should take into account how much packet memory is available and allocate it based on packet priority, dropping low priority packets first if overflow occurs (a simplified sketch of such a drop policy follows this list).


  3. Another aspect of network design is the end nodes themselves. The operating system needs to support latency expectations. If the operating system is tied up with some task for an extended amount of time without servicing Ethernet packets, that in turn might create FIFO congestion on the device itself. In response, the Ethernet controller might invoke Ethernet flow control to throttle packet transmission from the switch, moving congestion further upstream in the network. Therefore, network design should ensure that devices are capable of handling the expected amount of traffic, and that they are able to assign packet priorities and differentiate their handling.
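
As a simple illustration of the priority-based buffering described in point 2, the sketch below models a shared packet memory that drops queued low priority frames before rejecting a high priority arrival. It is a toy model in Python (the class name SharedPacketBuffer and the 15KB capacity are made up for the example), intended only to show the policy, not how any particular switch implements it:

    from collections import deque

    class SharedPacketBuffer:
        """Toy model of a switch packet memory with a priority-aware drop policy."""

        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self.used = 0
            # Priority 0 = low (e.g. video from B and C), 1 = high (e.g. traffic from D)
            self.queues = {0: deque(), 1: deque()}

        def enqueue(self, size_bytes, priority):
            # When memory is full, free space by dropping queued low priority
            # frames before rejecting a high priority arrival.
            while self.used + size_bytes > self.capacity:
                if priority == 1 and self.queues[0]:
                    self.used -= self.queues[0].popleft()   # drop a queued low priority frame
                else:
                    return False                            # drop the arriving frame instead
            self.queues[priority].append(size_bytes)
            self.used += size_bytes
            return True

    buf = SharedPacketBuffer(15 * 1024)
    while buf.enqueue(1518, priority=0):     # saturate the buffer with low priority frames
        pass
    print(buf.enqueue(1000, priority=1))     # True: a queued low priority frame is evicted
    print(buf.enqueue(1518, priority=0))     # False: further low priority arrivals are dropped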