Wednesday, November 16, 2011

Collisions, Backoffs, and Retries



Multiple radios that are in range of each other and have data to transmit need to take turns. However, the particular flavor of 802.11 that is used in Wi-Fi devices does not provide for any collaboration between devices to ensure that two devices do take turns. Rather, a probabilistic scheme is used, to allow for radios to know nothing about each other at the most primitive level and yet be able to transmit.
This process is known as backing off, as is the basis of Carrier Sense Multiple Access with Collision Avoidance, or CSMA-CA. The process is somewhat involved, and is the subject of quite a bit of research, but the fundamentals are simple. Each radio that has something to send waits until the channel is free. If they then transmitted immediately, then if any two radios had data to transmit, they would transmit simultaneously, causing a collision, and a receiver would only pick up interference. Carrier sense before transmission helps avoid a radio transmitting only when another radio has been transmitting for a while. If two radios do decide to transmit at roughly the same time—within a few microseconds—then it would be impossible for the two to detect each other.
To partially avoid the collisions, each radio plays a particular well-scripted game. They each pick a random nonnegative integer less than a value known as the contention window (CW), a small power of 2. This value will tell the radio the number of slots, or fixed microsecond delays, that the radio must wait before they can transmit. The goal of the random selection is that, hopefully, each transmitter will pick a different value, and thus avoid collisions. When a radio is in the process of backing off, and another radio begins to transmit during a slot, the backing-off radio will stop counting slots, wait until the channel becomes free again, and then resume where it left off. That lets each radio take turns (see Figure 1).

 
Figure 1: The backoff procedure for two radios
However, nothing stops two radios from picking the same value, and thus colliding. When a collision occurs, the two transmitters find out not by being able to detect a collision as Ethernet does, but by not receiving the appropriate acknowledgments. This causes the unsuccessful transmitters to double their contention window, thus reducing the likelihood that the two colliders will pick the same backoff again. Backoffs do not grow unbounded: there is a maximum contention window. Furthermore, when a transmitter with an inflated contention window does successfully transmit a frame, or gives up trying to retransmit a frame, it resets its contention window back to the initial, minimum value. The key is to remember that the backoff mechanism applies to the retransmissions only for any one given frame. Once that frame either succeeds or exceeds its retransmission limit, the backoff state is forgotten and refreshed with the most aggressive minimums.
The slotted backoff scheme had its origin in the educational Hawaiian research network scheme known as Slotted ALOHA, an early network that addressed the problem of figuring out which of multiple devices should talk without using coordination such as that which token-based networks use. This scheme became the foundation of all contention-based network schemes, including Ethernet and Wi-Fi.
However, the way contention is implemented in 802.11 has a number of negative consequences. The denser and busier the network, the more likely that two radios will collide. For example, with a contention window of four, if five stations each have data, then a collision is assured. The idea of doubling contention windows is to exponentially grow the window, reducing the chance of collisions accordingly by making it large enough to handle the density. This would allow for the backoffs to adapt to the density and business of the network. However, once a radio either succeeds or fails miserably, it resets its contention window, forgetting all adaptation effects and increasing the chance of collisions dramatically.
Furthermore, there is a direct interplay between rate adaptation—where radios drop their data rates when there is loss, assuming that the loss is because the receiver is out of range and the transmitter's choice of data rate is too aggressive—and contention avoidance. Normally, most devices do not want to transmit data at the same time. However, the busier the channel is, the more likely that devices that get data to send at different times are forced to wait for the same opening, increasing the contention. As contention goes up, collisions go up, and rate adaptation falsely assumes that the loss is because of range issues and drops the data rate. Dropping the data rate increases the amount of time each frame stays on air—a 1Mbps data frame takes 300 times the amount of time a 300Mbps data frame of the same number of bytes takes—thus increasing the business of the channel. This becomes a vicious cycle, in a process known as congestion collapse that causes the network to spend an inordinate amount of time retransmitting old data and very little time transmitting new data. This is a major issue for voice mobility networks, because the rate of traffic does not change, no matter what the air is doing, and so a network that was provisioned with plenty of room left over can become extremely congested by passing over a very short tipping point.

No comments:

Post a Comment