Analysis of sound data streamed over the network

2013 
FEJFAR JIŘÍ, ŠŤASTNÝ JIŘÍ, POKORNÝ MARTIN, BALEJ JIŘÍ, ZACH PETR: Analysis of sound data streamed over the network. Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, 2013, LXI, No. 7, pp. 2105–2110

In this paper we inspect the difference between an original sound recording and the signal captured after streaming that recording over a network loaded with heavy traffic. Several kinds of failures occur in the captured recording as a result of network congestion. We look for a method to evaluate the correctness of streamed audio. The usual metrics are based on human perception of the signal, such as “signal is clear, without audible failures”, “signal is having some failures but it is understandable”, or “signal is inarticulate”. These approaches need to be statistically evaluated on a broad set of respondents, which is time and resource consuming. We propose metrics based on signal properties that allow us to compare the original and the captured recording. In this paper we use an algorithm called Dynamic Time Warping (Muller, 2007), commonly used for time series comparison. Other time series exploration approaches can be found in (Fejfar, 2011) and (Fejfar, 2012). The data was acquired in our network laboratory, where network traffic was simulated by downloading files and streaming audio and video simultaneously. Our former experiment inspected Quality of Service (QoS) and its impact on failures of the received audio data stream. This experiment focuses on the comparison of sound recordings rather than on network mechanisms. We focus on real-time audio streams such as a telephone call, where it is not possible to stream audio in advance into a “pool”. Instead it is necessary to achieve as small a delay as possible between recording the speaker's voice and replaying it to the listener. We use the RTP protocol for streaming audio.
Keywords: Dynamic Time Warping, sound signal processing

There are two scenarios for transmitting audio data through the network. In the first, audio is streamed from a server to the user without a need for a real-time response but with a need for good quality. In the second, audio is streamed with a need for a real-time response.

An example of the first scenario is a user listening to a favourite Internet broadcast. It is not important for the user to hear sound events (music / words) at the same time as they are recorded (in the case of live broadcasting). Such a broadcast can be heard with a delay of 10 s or more. Radio time signals might then be inaccurate, but they are no longer used in Internet radio broadcasting, as there are different methods for clock synchronization. In this case we can use a long buffer when transmitting audio, which overcomes network bandwidth bottlenecks. When we transmit audio recorded earlier, we can send it to the user in advance, so it does not suffer from dynamic changes in network traffic.

The second scenario is completely different: we transmit the voices of users talking to each other. A delay of about 150 ms (sometimes 250 ms) is disturbing in that case, and when the delay is longer the communication becomes inarticulate. That is the reason why we deploy different QoS mechanisms to give voice traffic precedence over data and video traffic.

We compare original recordings (broadcast from computer A) with their counterparts (recorded on computer B) transmitted over the network with or without QoS deployed. In both cases the network is congested with traffic generated by voice (VLC), video (RTSP) and data streams (wget and web server) (Zach, 2012).

METHODS AND RESOURCES
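
As an illustration of how Dynamic Time Warping can compare an original recording with a captured one, the following minimal Python sketch computes the classic accumulated-cost DTW distance between two short sequences. It is not the implementation used in this paper: the function name `dtw_distance`, the absolute-difference local cost and the toy sample values are assumptions made for this example, and real recordings would normally be compared on much longer sequences or on extracted features.

```python
import math


def dtw_distance(a, b):
    """Return the accumulated DTW cost between two numeric sequences.

    Hypothetical helper for illustration; the local cost |x - y| is an
    assumption, not a choice taken from the paper.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Toy data (assumed values): a short "original" and two "captured" variants.
original = [0.0, 1.0, 2.0, 1.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same shape, first sample repeated
damaged = [0.0, 1.0, 0.0, 0.0, 0.0]       # the peak has been lost

print(dtw_distance(original, delayed))
print(dtw_distance(original, damaged))
```

A sequence that is only stretched in time, as with the repeated sample, aligns with the original at zero cost, while the sequence that has lost its peak keeps a non-zero distance. This tolerance to timing shifts combined with sensitivity to missing content is what makes DTW a candidate for scoring streamed audio against its source.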