ABSTRACT

Exhibit 1 lists the major components involved in the end-to-end handling of a packet carrying a digitized portion of a conversation: analog-to-digital coding, transmission, and conversion back to voice. Note that the fixed and variable delays associated with the end-to-end transmission of a packet generally range from approximately 60 to 289 ms. To put this time in perspective, a typical listener can tolerate an occasional delay of up to 250 ms before the conversation becomes annoying. While full-duplex transmission is desirable for most data applications, it is not useful, and in fact creates problems, when voice is carried over a network. This is because two people do not hold a conversation by talking at the same time. Instead, each party waits for the other to finish talking before beginning a response. If the latency begins to exceed a quarter of a second for significant portions of a conversation, the conversation will begin to resemble a CB radio conversation, with each party having to say “over” to inform the other party that it is all right to talk. Otherwise, the delay will result in one party periodically thinking the other has finished talking when they have not, so that both parties talk at once and one must stop while the other begins anew. In fact, the International Telecommunication Union (ITU) standard for one-way delay for a voice call specifies a maximum latency of 150 ms, which helps ensure that the call will not degenerate into both parties talking at once. For most organizations, a maximum latency of 200 ms and a mean latency approaching or under 150 ms should be sufficient to provide good quality of reconstructed voice.
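
The latency guideline above can be checked against measured one-way delays with a few lines of arithmetic. The following is a minimal sketch, not a definitive test procedure: the function name assess_latency, the constant names, and the sample values are hypothetical, and "approaching or under 150 ms" is interpreted here, as an assumption, to mean a mean at or below 150 ms.

```python
from statistics import mean

# Thresholds taken from the text; the constant names are illustrative.
MEAN_TARGET_MS = 150.0   # ITU one-way delay figure, used here as the mean target
MAX_LATENCY_MS = 200.0   # suggested ceiling on latency for most organizations
ANNOYANCE_MS = 250.0     # about a quarter second, tolerable only occasionally


def assess_latency(samples_ms):
    """Summarize one-way latency samples (in ms) against the guideline.

    Assumption: a mean at or below 150 ms counts as meeting the mean target.
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    avg = mean(samples_ms)
    worst = max(samples_ms)
    # Fraction of samples beyond the quarter-second annoyance threshold.
    over = sum(1 for s in samples_ms if s > ANNOYANCE_MS) / len(samples_ms)
    return {
        "mean_ms": avg,
        "max_ms": worst,
        "share_over_250_ms": over,
        "meets_guideline": avg <= MEAN_TARGET_MS and worst <= MAX_LATENCY_MS,
    }


if __name__ == "__main__":
    # Hypothetical measurements spanning the 60 to 289 ms range in the text.
    print(assess_latency([60, 120, 180]))
    print(assess_latency([60, 150, 289]))
```

In this sketch the first sample set satisfies both conditions, while the second fails because its worst-case delay exceeds 200 ms and its mean exceeds 150 ms.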