Introduction to TCLw and Echo Response
Echo in VoIP telephone calls
Echo is an undesirable characteristic of a telephone call where the user of a telephone hears a copy of the sound transmitted by their own telephone that is delayed in time. Echo is a greater concern for VoIP telephones than it was in analog telephony. This section provides an introduction to echo and how it is generated in a telephone call.
When sound is applied to the handset transmitter of a VoIP telephone, it is digitized, processed, encoded and transmitted from the network port of the telephone as an IP packet. In addition to being sent to the network, the sound appears at the receiver of the same telephone. There can exist several different paths for the transmitted sound to reach the telephone’s receiver. Some of these paths are intentional and desirable, while others are not desired and produce objectionable signal components. The sound that is heard in the receiver is the composite of the send signal being returned to the receiver by the various paths. Each path may have a different frequency response and delay characteristic.
The following figure shows a simple VoIP telephone call between two telephones. This example shows some of the possible signal paths for the send signal to be returned to the receiver of the same telephone.
Direct acoustic coupling exists through the plastic and cavities of the near-end handset itself. In addition, the near-end telephone intentionally sends some of the electrical signal from the transmitter to the receiver. These signal components make up the Sidetone of the telephone. Sidetone is desirable because the user perceives that the telephone is active. It is natural for the talker to hear their own voice when speaking. The sidetone simulates this.
The sidetone signal has only a small amount of delay because it is generated locally in the telephone. The delay of the sidetone is too short to be perceived by the user of the telephone.
For the simple VoIP connection between two telephones shown above, transmit and receive data travel in separate packets on the IP network. There is no possibility of echo being generated until the transmit packets arrive at the far-end telephone.
When the transmit packets reach the far-end telephone, they are decoded and converted to an electrical signal that is used to drive the receiver of the far-end telephone. Electrical crosstalk in the far-end telephone couples some of the receive signal back into the transmit path. In some cases, electrical crosstalk may also occur in the handset cord. These signal paths can generally be avoided by good circuit design and layout techniques.
The receiver in the handset of the far-end telephone generates an acoustic output that is coupled back to the transmitter. This coupling can occur through the air, and also through the handset itself.
The combined signal from both the electrical and acoustic paths of the far-end telephone is digitized, encoded, and transmitted back to the near-end telephone where it appears at the receiver. These components of the received signal have longer delays than those generated in the near-end telephone because they have passed through analog-to-digital and digital-to-analog converters, encoders and decoders, and the IP network.
When the signal delay exceeds about 15 ms, the user perceives the delay between the transmit signal and the signal in the receiver. These delayed components of the received signal are referred to as the echo. The presence of echo is objectionable and lowers the perceived quality of the connection.
The signal paths that cause echo exist for both analog and VoIP telephones. For the case of POTS telephones, an additional component of the receive signal is generated by imperfect matching whenever a 2-wire to 4-wire hybrid is used. Fortunately, the delay in the analog telephone system is low enough that these signals are not generally perceived as echo.
For VoIP telephony, significant delays are present in telephones and the network. A VoIP telephone will typically capture a 20 ms frame of audio from the transmitter, encode the audio using a codec, and then transmit a packet to the network. The send latency of the telephone is the time from the start of capture of an audio frame, to the time the transmit packet appears at the network port of the telephone. A telephone should have a send latency of less than 35 ms.
The send latency can be improved by capturing smaller frames of audio and by increasing the speed of encoding in the telephone, but the potential to reduce frame size is limited by efficiency concerns in the IP network, and the smallest packet size in common usage is 10 ms.
For the receive direction, latency is the time from when a packet arrives at the network port of the telephone to the time when the sound begins to play at the receiver of the telephone. A good telephone should have a receive latency of less than 65 ms (per TIA-810-A).
The receive latency is longer than the send latency because the telephone is required to maintain a buffer of packets to account for jitter in the timing of received packets, and to re-order packets that traveled to the telephone via different network paths. As with send latency, the potential to reduce the receive latency is limited.
From the typical send and receive latency delays in VoIP, it is apparent that any component of the transmit signal that is returned to the receiver via a path outside the near-end telephone will have enough delay that it is clearly perceptible to the talker as being an echo. The round trip delay time from the near-end telephone to the far-end telephone and back again is typically on the order of two hundred milliseconds.
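As a rough illustration of the latency figures above, the following Python sketch adds up the delays along the path of an echo generated at the far-end telephone. The function name, the one-way network delay parameter and its example value are assumptions made for illustration; only the 35 ms send limit, the 65 ms receive limit and the approximate 15 ms perception threshold come from the text.

```python
def round_trip_echo_delay_ms(send_ms, receive_ms, network_one_way_ms):
    """Delay of an echo that is generated at the far-end telephone.

    The talker's speech passes through the near-end send path, the network
    and the far-end receive path, is coupled back, and returns through the
    far-end send path, the network and the near-end receive path.
    Both telephones are assumed to have the same latencies.
    """
    outbound = send_ms + network_one_way_ms + receive_ms
    inbound = send_ms + network_one_way_ms + receive_ms
    return outbound + inbound


PERCEPTION_THRESHOLD_MS = 15  # approximate figure quoted in the text

# Telephones at the 35 ms / 65 ms limits; the 10 ms network delay is hypothetical.
delay = round_trip_echo_delay_ms(35, 65, 10)
print(delay, delay > PERCEPTION_THRESHOLD_MS)  # 220 True
```

Even with no network delay at all, two telephones at these limits give 200 ms, which is consistent with the round trip figure quoted above.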
For the case of more complex networks of VoIP telephones, there are other potential sources of echo. For example, a network of VoIP telephones may connect through a gateway to the PSTN. The gateway will have a 2-wire to 4-wire hybrid. Since the hybrid will not perfectly match the impedance of the PSTN line, an echo is generated, and because the VoIP telephone has delay, the echo will be perceptible to the user of the telephone.
Echo control for VoIP telephones
The Send and Receive Latencies of VoIP telephones introduce enough delay that any portion of the transmit signal that returns to the receiver of the same telephone will be perceived as echo. As a result, echo control is required.
For centralized telephone systems such as digital cellular networks, it is possible to locate echo cancellers at a central location in the network. For VoIP, calls are established directly between telephones without any central location where echo control can be implemented. As a result, VoIP telephones are required to control the amount of echo that they generate. Simply, this means that a signal applied to the network receive port of the telephone should not be transmitted back to the network on the transmit port.
Note that a telephone controls only the echo that it generates. It does not attempt to reduce echo caused by other telephones or gateways in the network; each network device is responsible for controlling its own echo.
One consequence of this strategy is illustrated in the following figure:
When a telephone call is established between two telephones, one of which has a good design that generates little echo, and another of poor design that generates a large echo signal, it will be the user of the well-designed telephone that hears echo. This occurs because the poorly designed telephone couples the signal it receives from the well-designed telephone back into the transmit path and sends it back to the well-designed telephone, where the user hears a delayed copy of their own voice. In the other direction, the well-designed telephone does not couple the signal it receives into the transmit path, and so the user of the poorly designed telephone does not hear echo.
This situation can mislead the user of the poorly designed telephone to believe that their telephone is working well, and mislead the user of the well-designed telephone to believe that their telephone has an echo problem when in fact the opposite is true.
For Handset telephones, echo control is relatively simple to achieve because the receiver is intended to be close-coupled to an ear and so its output amplitude is low. Similarly, the transmitter microphone is relatively close to the talker’s mouth, and so its sensitivity is not high. This, combined with the relative positions of the two transducers, tends to result in only a small amount of acoustic coupling of the received signal back to the transmitter.
For Handsfree telephones, echo control is much more complex. Compared to a handset telephone, the speaker of the handsfree telephone is much louder because it is not closely coupled to the user’s ear, and the microphone of the handsfree telephone is more sensitive because it is further from the user’s mouth. As a result, the acoustic coupling of the speaker to the microphone is high.
Various techniques have been developed to control echo. Handsfree telephones frequently employ more than one method to achieve echo control. Methods of echo control include the following:
- Echo Cancellers
An echo canceller is a software algorithm that filters the transmit signal of the telephone and attempts to remove the portion of the signal that is due to the transmitter picking up the receive output of the telephone. When the telephone receives a signal from the network, the echo canceller will dynamically adjust its parameters to try to minimize the amount of echo being transmitted. Typically this adjustment is only done when there is no transmit signal being applied to the telephone (single-talk mode).
The cancellation of the echo is never perfect.
- Non Linear Processing
When the user of the telephone is not speaking, it is possible to apply attenuation to the microphone path. The amplitude of the echo signal is reduced by the amount of the attenuation. This is a relatively simple method of reducing echo, but is only applicable for single-talk conditions.
- Non Duplex / Partial Duplex Operation
Handsfree telephones can also be designed to switch between distinct sending and receiving states. Software in the telephone determines which state the telephone should operate in based on the relative signal levels in the transmit and receive directions. Signals in the opposite direction to that in which the telephone is operating are attenuated. Telephones that attenuate by between 3 dB and 20 dB are said to be partial duplex, and those that attenuate by more than 20 dB are said to be non duplex. A telephone with less than 3 dB attenuation is considered full duplex.
- Location of Transducers
The placement of the speaker and microphone on the telephone affects the acoustic coupling between them. Echo performance can be improved by the careful design and placement of these transducers.
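To make the echo canceller item above more concrete, the following Python sketch uses a normalized least-mean-squares (NLMS) adaptive filter, which is one common way to build such an algorithm; the text does not say which algorithm any particular telephone uses. The function names, the four-tap echo path, the noise level and the step size are all hypothetical values chosen for illustration. The simulation is single talk only (no near-end speech), which matches the condition under which adaptation is normally allowed.

```python
import math
import random


def nlms_echo_canceller(receive, mic, taps=8, mu=0.5, eps=1e-6):
    """Return the echo-cancelled send signal using an NLMS adaptive FIR filter."""
    weights = [0.0] * taps   # running estimate of the echo path
    history = [0.0] * taps   # most recent receive samples, newest first
    out = []
    for x, d in zip(receive, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate            # residual that would be transmitted
        power = sum(h * h for h in history) + eps
        step = mu * error / power            # step size normalized by input power
        weights = [w + step * h for w, h in zip(weights, history)]
        out.append(error)
    return out


def power_db(samples):
    """Mean power of a sample sequence in dB (arbitrary reference)."""
    return 10 * math.log10(sum(s * s for s in samples) / len(samples))


random.seed(1)
receive = [random.uniform(-1, 1) for _ in range(4000)]
echo_path = [0.0, 0.3, -0.2, 0.1]  # hypothetical speaker-to-microphone response
mic = [
    sum(echo_path[k] * receive[n - k] for k in range(len(echo_path)) if n - k >= 0)
    + random.uniform(-1e-3, 1e-3)  # small microphone noise (hypothetical level)
    for n in range(len(receive))
]

residual = nlms_echo_canceller(receive, mic)
# Extra echo attenuation provided by the canceller after it has converged.
improvement_db = power_db(mic[2000:]) - power_db(residual[2000:])
print(round(improvement_db, 1))
```

The residual never reaches zero because the microphone noise cannot be predicted from the receive signal, which illustrates why the cancellation is never perfect.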
Measurements of Echo Performance
Echo Response is a measurement used to quantify the echo performance of a telephone. It is a measurement of the ratio of the transmit output signal to the applied receive signal, measured at the network port of the telephone. Echo response is measured over a range of frequencies.
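As a minimal sketch of this ratio, the following Python example converts hypothetical RMS levels measured at the network port into an echo response in dB at each test frequency. The function name and the measurement values are assumptions for illustration, and expressing the ratio as 20 log10 of an amplitude ratio is the usual decibel convention rather than something stated in the text.

```python
import math


def echo_response_db(send_rms, receive_rms):
    """Level of the returned send signal relative to the applied receive signal, in dB."""
    return 20 * math.log10(send_rms / receive_rms)


# Hypothetical data: (frequency in Hz, receive RMS, send RMS), same linear units.
measurements = [(500, 1.0, 0.01), (1000, 1.0, 0.003), (2000, 1.0, 0.001)]

response = {f: echo_response_db(tx, rx) for f, rx, tx in measurements}
for f, level in response.items():
    print(f, round(level, 1))
```

A more negative echo response means that less of the receive signal is returned; the coupling loss at a given frequency is the same number with the sign reversed.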
Weighted Terminal Coupling Loss (TCLw) is a weighted average of the echo response taken over a range of frequencies. TCLw is a single number that indicates how well the telephone attenuates its echo signal. TCLw is expressed in dB. A higher TCLw indicates more attenuation of the echo.
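The text does not give the averaging formula. The sketch below assumes the trapezoidal-rule definition commonly attributed to ITU-T G.122, in which the echo power ratio is averaged over a logarithmic frequency axis; the applicable standard should be checked for the exact frequency range and constants. The function name and the example values are hypothetical.

```python
import math


def tclw_db(freqs_hz, coupling_loss_db):
    """Weighted terminal coupling loss from per-frequency coupling loss values.

    freqs_hz must be ascending. The echo power ratio is averaged over
    log-frequency with the trapezoidal rule, then converted back to dB.
    """
    ratios = [10 ** (-loss / 10) for loss in coupling_loss_db]
    logs = [math.log10(f) for f in freqs_hz]
    area = sum(
        0.5 * (ratios[i] + ratios[i - 1]) * (logs[i] - logs[i - 1])
        for i in range(1, len(freqs_hz))
    )
    return -10 * math.log10(area / (logs[-1] - logs[0]))


# A flat 40 dB coupling loss gives a TCLw of 40 dB.
print(round(tclw_db([300, 1000, 3400], [40, 40, 40]), 1))  # 40.0
# Power averaging is dominated by the frequencies with the least loss.
print(round(tclw_db([300, 3400], [40, 50]), 1))  # 42.6
```

Because the average is taken on power ratios rather than on dB values, a narrow frequency region with poor coupling loss pulls the result down more than an arithmetic mean of the dB values would suggest.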
Telephony standards typically specify echo control requirements in terms of TCLw. Often, the TCLw measurements are to be adjusted to normalize the result based on the nominal Send Loudness Rating (SLR) and Receive Loudness Rating (RLR) of the applicable standard. For example, the TIA-810-A standard for narrowband IP telephones requires a normalized TCLw of 52 dB for handset, and a normalized TCLw of 45 dB for handsfree.
As can be seen from the above limits, the echo signal path requires a large amount of attenuation in order for the telephone to provide acceptable performance. Due to the high attenuation, the measurement of TCLw involves the measurement of low amplitude signals. The maximum value of TCLw that can be measured is limited by noise.
TCLw tests must be performed in a high quality anechoic chamber to eliminate the effects of background noise. The telephone itself also generates noise. Usually it is the send noise of the telephone that is the limiting factor on the maximum TCLw that can be measured. It is important to be aware that the maximum TCLw that can be measured may be relatively close to the required limits for TCLw.
If a telephone generates excessive noise, either high send noise, or extreme receive noise, then it may not be possible to obtain a TCLw measurement that passes the required limits. Noise problems should be corrected before attempting to improve TCLw performance unless it can be verified that the noise is not limiting the TCLw reading.
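The effect of send noise on the reading can be sketched as follows. This Python example assumes a broadband level measurement in which the echo and the uncorrelated send noise add as powers; the function name, the dBm0 levels and the 60 dB true coupling loss are hypothetical values for illustration only.

```python
import math


def measured_tclw_db(stimulus_dbm0, true_tclw_db, send_noise_dbm0):
    """TCLw reading when echo and uncorrelated send noise add as powers."""
    echo_dbm0 = stimulus_dbm0 - true_tclw_db
    total_power = 10 ** (echo_dbm0 / 10) + 10 ** (send_noise_dbm0 / 10)
    return stimulus_dbm0 - 10 * math.log10(total_power)


stimulus = -10.0    # receive test level (hypothetical)
send_noise = -65.0  # send noise of the telephone (hypothetical)

ceiling = stimulus - send_noise  # highest reading the noise floor allows: 55 dB
reading = measured_tclw_db(stimulus, 60.0, send_noise)
print(ceiling, round(reading, 1))  # 55.0 53.8
```

In this example a telephone whose echo path really provides 60 dB of loss reads below 54 dB, so the noise rather than the echo determines the result.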
When the normalized TCLw is being measured, it is important that the send and receive loudness ratings of the telephone are not significantly quieter than nominal. If the telephone is too quiet, the normalization will subtract from the TCLw reading. Since the maximum TCLw is limited by noise, it may become impossible to achieve a TCLw that passes the required limits. If a telephone has problems with send and receive loudness, these should be corrected before making measurements of normalized TCLw.
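The normalization can be sketched as a simple correction in dB. The sign convention below is inferred from the statement that a quiet telephone has the normalization subtract from the reading, together with the fact that loudness ratings are losses (a larger rating means a quieter telephone). The function name and all numeric values, including the nominal SLR of 8 dB and RLR of 2 dB, are assumptions for illustration; the applicable standard gives the actual nominal values and procedure.

```python
def normalized_tclw_db(measured_db, slr_db, rlr_db, nominal_slr_db, nominal_rlr_db):
    """Adjust a TCLw reading to the nominal send and receive loudness ratings."""
    return measured_db - (slr_db - nominal_slr_db) - (rlr_db - nominal_rlr_db)


# Hypothetical telephone that is 3 dB quiet in send and 2 dB quiet in receive.
result = normalized_tclw_db(55.0, slr_db=11.0, rlr_db=4.0,
                            nominal_slr_db=8.0, nominal_rlr_db=2.0)
print(result)  # 50.0
```

Here a raw reading of 55 dB becomes 50 dB after normalization, which would fall short of a 52 dB handset requirement even though the unadjusted figure appears to pass.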
The type of test signal used also affects the maximum TCLw that can be measured. In terms of noise immunity, a sine wave is the best test signal to use. The sine wave has the lowest peak amplitude relative to its RMS level. This allows high stimulus levels to be applied to the telephone without clipping, and all the energy of the test signal is applied at a single frequency at any given time. Measurement can be performed through a narrow bandpass filter to remove much of the noise from the measurement. As a result, a sine wave stimulus signal is least affected by noise.
White noise or pink noise can also be used as test signals for TCLw. These signals have higher peak amplitudes, which limits the maximum receive level that can be used for testing. The energy in these signals is spread across a range of frequencies. Send noise of the telephone has a greater effect on these test signals than sine wave signals.
Both the noise and sine wave signals can be modulated or pulsed to reduce the chance that the telephone will interpret the test signal as a steady state background noise. Some telephones will change gains, switch states, or attempt to mute such signals.
Voice or artificial voice can also be used as test signals. These signals are the least desirable in terms of noise immunity. They typically have peak amplitudes approximately 20 dB above the RMS level. With a telephone having high noise it may be difficult to pass TCLw limits when using these test signals. Even send noise that is within the limits permitted by standards may be too much to allow TCLw measurements using voice-like signals.
The measured value of TCLw will vary depending on the test conditions. Telephones that use Non Linear Processing will typically have better TCLw when tested in single-talk mode, that is, when sound is applied only at the network receive port. In this case, the telephone is able to apply attenuation to the transmit path that reduces the amplitude of the echo signal. When these telephones are tested in double-talk mode, where a second sound source is applied at the transmitter, these telephones will disable the transmit path attenuation, resulting in a lower TCLw.
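The peak-to-RMS comparison between test signals can be checked numerically. The following Python sketch computes the crest factor (peak level relative to RMS level, in dB) of a sine wave and of Gaussian white noise; the function name, sample rate, tone frequency and signal length are hypothetical choices, and the noise figure depends on the length of the sample.

```python
import math
import random


def crest_factor_db(samples):
    """Peak amplitude relative to RMS level, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)


fs = 48000  # sample rate in Hz (hypothetical)
sine = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]

random.seed(0)
noise = [random.gauss(0, 1) for _ in range(fs)]

print(round(crest_factor_db(sine), 2))  # 3.01
print(round(crest_factor_db(noise), 1))
```

For a fixed peak level that avoids clipping, every extra decibel of crest factor is a decibel less RMS stimulus, and therefore a decibel less margin above the send noise of the telephone.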
The ability of an echo canceller to remove the receive signal from the transmit path depends on the characteristics of the receive signal. Typically, an echo canceller will work better with sine wave test signals than with a more complex signal such as noise, voice or artificial voice. TCLw testing using sine wave signals is not recommended when an echo canceller is being used because the results obtained will likely not be representative of the performance in actual use. It is necessary to use a different test signal, which is less desirable in terms of immunity to noise, to test these telephones.
Echo Response and TCLw are not the only measurements that can be used to characterize the echo performance of a telephone. Temporally Weighted Terminal Coupling Loss (TCLT) can also be used. This method, described in IEEE 1329-1999, was designed to take into account time dependent behavior and psycho-acoustic effects of echo, and may also help to overcome the limitations on TCLw measurement due to a telephone’s sending noise level.
For information on upgrading the IP Phone Test System with the TCLw and Echo Test Kit, contact us.