Audio Streaming and Song Form Pattern Recognition
Kevin Curran, Magee College

Robust Audio Streaming using Song Form Pattern Recognition

When receiving streaming media over a low-bandwidth wireless connection, users can experience not only packet losses but also extended service interruptions; such dropouts can last as long as 15 seconds. During this time no packets arrive and, if not addressed, the missing packets cause unacceptable interruptions in the audio stream. A long dropout of this kind can be survived by making the client-side buffer large enough. However, with fixed-bit-rate technologies such as Windows Media or Real Audio, this can only be done by buffering packets for an extended period (10 seconds or more) before starting to play the track. During that period, many users are likely to lose interest or become frustrated.
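The trade-off above can be made concrete with some simple arithmetic. The sketch below is illustrative only: the function names, the 2-second safety margin, and the example bit rates are assumptions, not figures from the text.

```python
def min_buffer_seconds(max_dropout_s: float, safety_margin_s: float = 2.0) -> float:
    """Seconds of audio the client must hold to keep playing through the
    longest expected dropout, plus a margin for jitter (margin is assumed)."""
    return max_dropout_s + safety_margin_s

def startup_delay_s(buffer_s: float, link_kbps: float, playback_kbps: float) -> float:
    """Time to fill `buffer_s` seconds of audio before playback can begin.
    Filling buffer_s seconds of audio requires buffer_s * playback_kbps
    kilobits, delivered over a link running at link_kbps."""
    return buffer_s * playback_kbps / link_kbps

# Surviving the 15-second dropouts mentioned above:
needed = min_buffer_seconds(15.0)            # 17.0 s of buffered audio
delay = startup_delay_s(needed, 256.0, 128.0)  # 8.5 s before playback starts
```

Even with a link twice as fast as the playback rate, the user waits several seconds before hearing anything, which is the frustration the semantic approach tries to avoid.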

The focus of this research is to examine new methods for streaming music over bandwidth constrained networks. One overlooked method to date, which can work alongside existing audio compression schemes, is to take account of the semantics of the music.

Songs in general exhibit standard structures that can serve as a form of forward error concealment. For instance, many songs start with a verse, proceed to a chorus, and then repeat this structure with little variation until the end. Suppose a song starts streaming with the standard 10-second buffering interval at the start. Let's also assume that this is sufficient to provide smooth delivery of the initial verse of the song. The chorus follows next with, say, 15% packet loss. In the meantime, an audio pattern-matching algorithm is classifying the song into well-known forms such as chorus and verse. The next time we approach the verse, the pattern-matching algorithm will 'pick up the trail' after a few chords and correctly predict that the streaming song is now re-entering the verse.
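The "pick up the trail" step can be sketched as a sequence match against the sections heard so far. A real classifier would work on spectral features or audio fingerprints rather than symbolic chord labels; the toy below, including its function name and the chord tokens, is an assumption made purely to illustrate the idea.

```python
def classify_section(recent_tokens, sections):
    """Return (section_name, offset) for the catalogued section whose token
    sequence contains `recent_tokens` as a contiguous run, or (None, -1).
    `sections` maps a section name to the chord tokens heard earlier.
    Tokens here stand in for whatever features a real matcher would use."""
    n = len(recent_tokens)
    for name, tokens in sections.items():
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == recent_tokens:
                return name, i
    return None, -1

# Sections catalogued from the buffered opening of the song (hypothetical):
sections = {
    "verse":  ["C", "G", "Am", "F", "C", "G", "Am", "F"],
    "chorus": ["F", "C", "G", "G", "F", "C", "G", "G"],
}

# A few chords into the repeat, the matcher identifies the verse:
name, offset = classify_section(["Am", "F", "C"], sections)  # ("verse", 2)
```

The returned offset matters: it tells the client where inside the buffered section the incoming stream currently sits, which the concealment step below relies on.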

The idea here is that whenever packets are lost in the stream, the corresponding packets from the buffered 'previous' occurrence of the verse are inserted in their place, providing a close match. What we need is a novel error concealment algorithm for repairing streaming audio over error-prone networks using the semantics of the music. Run-time pattern-matching algorithms would do most of the work of identifying portions of the audio stream, and when a dropout does occur, the relevant section of the buffered audio is inserted to stand in for the lost audio.
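Once the current section and offset are known, the substitution itself is straightforward. A minimal sketch, assuming lost packets are marked as `None` and that packet positions line up one-to-one between the two occurrences of the section (a simplification; real audio would need alignment):

```python
def conceal(received, buffered_section, offset=0):
    """Replace lost packets (None) in `received` with the packet at the
    same position in the buffered earlier occurrence of the section.
    `offset` is where `received` starts within `buffered_section`."""
    repaired = []
    for i, pkt in enumerate(received):
        if pkt is None and offset + i < len(buffered_section):
            repaired.append(buffered_section[offset + i])  # substitute
        else:
            repaired.append(pkt)  # keep what actually arrived
    return repaired

# Packets captured during the smoothly delivered first verse:
buffered = [b"v0", b"v1", b"v2", b"v3"]
# The repeated verse arrives with losses:
received = [b"v0", None, b"v2", None]
repaired = conceal(received, buffered)  # gaps filled from the first verse
```

Any packets still missing past the end of the buffered section are left as `None`, so a conventional loss-concealment scheme could act as a fallback.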

Links

JMusic - Java framework for working with MIDI files

Task

Use JMusic to implement the above scenario. Any questions?
