Packing It In: The Evolution of Online Video and Audio Tech, Part 1

The explosion in Web-delivered music and video we’re seeing today just wouldn’t be possible without sophisticated compression algorithms, or codecs, and the file storage formats in which compressed audio, images and data are saved. As Internet bandwidth and broadband access have expanded, so has the transmission of much denser digital audio and video files. Compression algorithms and file formats have had to keep pace.

Business-wise, developing proprietary compression technology has been a key to securing market “lock-in” for individual companies — the predominance of Apple’s iTunes in the digital music world, and efforts by Microsoft and a variety of other industry leaders to supplant it, being perhaps the best case in point.

It’s long been recognized that proprietary control of codecs and compression technology can hold back market development and growth, however. That has led leading players spanning the computing, telecommunications and entertainment industries to band together and coordinate the development of industry-wide audio-visual standards, such as Mpeg-4 AVC (Advanced Video Coding), also known as H.264.

Taking this reasoning a step further, proponents of open source are going to great lengths to open up the development, distribution and resolution of compression technology and standards. Those efforts have at times led to court cases and to extended, expensive patent and intellectual property rights litigation.


Developed by a Joint Video Team assembled by the ITU-T (International Telecommunication Union Telecommunication Standardization Sector) Video Coding Experts Group and the Moving Picture Experts Group, a working group of ISO/IEC (International Organization for Standardization/International Electrotechnical Commission), Mpeg-4/H.264 has become the codec of choice for just about any company involved in delivering multimedia content over the Internet. It is also used in the Blu-ray high-definition disc format.

Vendors of products that use H.264/Mpeg-4 AVC are expected to pay licensing royalties for the patented technology involved. A separate and unaffiliated organization, Mpeg LA, administers the licenses for applicable H.264/Mpeg-4 AVC patents, as well as for a variety of other patented, previous generation compression technology.

Paltalk, a vendor of personal and business-to-business multimedia Web conferencing services that encompass voice, video and data communications, was one of H.264/Mpeg-4 AVC’s early adopters.

“H.264 seems to be emerging as one of the best video codecs today,” commented Perry Scherer, Paltalk founder and chief technical officer. “These streams produce some of the lowest bit rates for equivalent video quality. Macromedia Flash has recently supported H.264, which is another bit of data showing its prominence. It is built into QuickTime 7 and is now mandatory for … Blu-ray specifications.”

Fast-Forward From Past to Present

It was only a few years ago that most people were using dial-up connections to connect to the Internet. Audio compression codecs were necessary in order “to transport audio in any sort of real-time fashion,” Scherer explained. “Many of the audio codecs available then and now compressed to bit rates significantly below the bandwidth limitations of a dial-up modem. Real-time video of any quality is constrained by upload limitations and requires efficient and high-quality codecs for transport.”
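To put rough numbers on Scherer's point, the sketch below compares an uncompressed speech stream with a compressed one against a dial-up link. All of the figures are illustrative assumptions rather than numbers from Paltalk: a nominal 56 kbit/s modem, telephone-quality mono PCM at 8 kHz and 16 bits per sample, and the 13 kbit/s rate of the GSM 06.10 full-rate speech codec.

```python
# Illustrative figures only; none of them come from the article.
MODEM_BPS = 56_000          # nominal dial-up rate, bits per second (assumed)
PCM_SAMPLE_RATE = 8_000     # samples per second, telephone-quality mono (assumed)
PCM_BITS_PER_SAMPLE = 16    # bits per sample (assumed)
CODEC_BPS = 13_000          # GSM 06.10 full-rate speech codec bit rate


def pcm_bitrate(sample_rate, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate * bits_per_sample * channels


def fits(stream_bps, link_bps):
    """True if the stream's bit rate does not exceed the link's capacity."""
    return stream_bps <= link_bps


raw_bps = pcm_bitrate(PCM_SAMPLE_RATE, PCM_BITS_PER_SAMPLE)

print("uncompressed:", raw_bps, "bit/s, fits:", fits(raw_bps, MODEM_BPS))
print("compressed:  ", CODEC_BPS, "bit/s, fits:", fits(CODEC_BPS, MODEM_BPS))
print("reduction:    about", round(raw_bps / CODEC_BPS, 1), "to 1")
```

Under these assumptions the raw stream needs 128 kbit/s, more than twice what the modem can carry, while the compressed stream leaves most of the link free for video and data.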

Around the turn of the millennium, Paltalk introduced a “one-on-one” audio/video/data Web communications service. It has since moved on to “one-to-many” and “many-to-many” distribution platforms, which have required “higher and higher bandwidth loads on the servers and clients. Without video codecs to limit stream size, this sort of product would not be possible,” he commented.

Paltalk started out using GSM 6.10 and Motion JPEG for audio and video compression, respectively, Scherer recounted. “GSM 6.10 provided us with slightly better than cellular quality, and the Motion JPEG was slowed in frame rate to accommodate any connection speed.”
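Slowing the frame rate works for Motion JPEG because each frame is compressed on its own as a separate JPEG image, so the video bit rate is simply the frame size multiplied by the frame rate. The sketch below shows that arithmetic. It is not Paltalk's implementation; the function name, the 4,000-byte frame size, the 15 frames-per-second cap and the link speeds are all hypothetical values chosen for illustration.

```python
def max_frame_rate(link_bps, audio_bps, frame_bytes, cap_fps=15.0):
    """Frames per second that fit in the bandwidth left over after audio.

    link_bps    -- total link capacity in bits per second
    audio_bps   -- bit rate reserved for the audio stream
    frame_bytes -- size of one independently compressed frame (hypothetical)
    cap_fps     -- upper limit on the frame rate (hypothetical)
    """
    video_budget = link_bps - audio_bps
    if video_budget <= 0:
        return 0.0  # no bandwidth left for video at all
    return min(cap_fps, video_budget / (frame_bytes * 8))


# Hypothetical 4,000-byte frames alongside 13 kbit/s audio.
dialup_fps = max_frame_rate(56_000, 13_000, 4_000)
broadband_fps = max_frame_rate(1_000_000, 13_000, 4_000)

print("dial-up:  ", round(dialup_fps, 2), "frames per second")
print("broadband:", round(broadband_fps, 2), "frames per second")
```

With these numbers a dial-up user would see a little over one frame per second, while a broadband user would reach the cap.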

While this served Paltalk’s user community well during the days of dial-up modem connections, things have changed a lot since then. “Now we use the state-of-the-art H.264/Mpeg-4 standard for our video streams and a variety of audio codecs tailored to the content of our rooms: GSM 6.10, Speex (for speech rooms) and Siren (for music and general sound content). Speex provides great speech quality over the range of human vocalization while Siren provides great music quality for rooms with purely music/DJ content,” he explained.

State of the Art in Audio Compression

A host of incremental improvements and new features make up what is today considered “state of the art” in audio compression, according to Shy Keidar and Meni Berman, members of the team behind Musepack. By their account, the open source audio codec is the second-most widely used non-proprietary, general purpose “lossy” audio format after Ogg Vorbis.

“Speech compression formats have progressed significantly in recent years. Modern lossless formats can achieve a compression ratio that is better than or comparable to that of older formats, but require much less processing power when decoding or encoding,” they told LinuxInsider.

Musepack is primarily the work of one Nicolas Botti, “an experienced programmer with a background in video and image format development,” Berman explained. “Musepack differs from all other lossy formats by one important aspect: Its purpose, and the reason it was created, is to provide transparent audio quality while maintaining a high compression ratio, which lossless formats don’t provide.”

“Transparent” in this sense means high-quality audio that perceptually is identical to the original source, Keidar continued. “No other ‘lossy’ audio format was created for this purpose. Musepack doesn’t suffer from problem cases as much as other ‘lossy’ formats do and can fulfill its task of providing transparent audio while using the least processing power when decoded.”

Lossy and Lossless

As an example, Berman and Keidar said that Musepack is the most battery-efficient lossy format supported by the open source Rockbox firmware. As a result, it offers more playback hours on the iPod, iAudio, iRiver and other players that can run Rockbox. It is also the first format to standardize ReplayGain information as a mandatory part of its specification, they added. “No industry standard format has official support of ReplayGain, or anything similar to ReplayGain.”
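ReplayGain works by storing a suggested volume adjustment in decibels, along with the track's peak level, so that a player can even out loudness between tracks without re-encoding them. The sketch below shows how a player might apply such values. It is a simplified illustration rather than Musepack's or Rockbox's actual code: the function name is hypothetical, samples are assumed to be floating-point values between -1.0 and 1.0, and the only parts taken as given are the standard decibel-to-linear conversion and the idea of limiting the gain so the peak does not clip.

```python
def apply_replaygain(samples, gain_db, peak=None):
    """Scale samples by a gain in dB, reducing it if the peak would clip.

    samples -- floats in the range -1.0 to 1.0 (assumed representation)
    gain_db -- stored gain adjustment in decibels
    peak    -- stored peak amplitude of the track, if known
    """
    scale = 10 ** (gain_db / 20)       # decibels to a linear factor
    if peak:
        scale = min(scale, 1.0 / peak)  # keep peak * scale at or below 1.0
    return [s * scale for s in samples]


# A loud track turned down by 6 dB ends up at roughly half amplitude.
quieter = apply_replaygain([0.5, -1.0], -6.0)

# A quiet track turned up by 6 dB, limited so its 0.8 peak lands at 1.0.
louder = apply_replaygain([0.4, 0.8], 6.0, peak=0.8)

print(quieter)
print(louder)
```

Because the gain is stored as metadata and applied at playback time, the compressed audio itself is left untouched.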

Lossy compression methods such as Musepack’s, so called because original recordings aren’t duplicated exactly bit for bit, use advanced mathematical psychoacoustic models “that mimic the way that the human ear perceives audio signals so well that the quality achieved is far higher than what was available in the past.”
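The difference between the two approaches can be seen in a toy example. In the sketch below, a short synthetic tone is compressed losslessly with Python's standard zlib module and comes back bit for bit identical, while a crude "lossy" step that throws away the low bits of each sample does not. The bit-dropping is only a stand-in: real lossy codecs such as Musepack decide what to discard using psychoacoustic models, which this example makes no attempt to reproduce. The tone, the sample rate and the number of dropped bits are arbitrary choices.

```python
import math
import struct
import zlib

# One tenth of a second of a 440 Hz tone at 8 kHz, as 16-bit samples.
samples = [int(12000 * math.sin(2 * math.pi * 440 * n / 8000))
           for n in range(800)]
raw = struct.pack(f"<{len(samples)}h", *samples)

# Lossless: decompressing returns exactly the bytes that went in.
lossless = zlib.decompress(zlib.compress(raw))


def quantize(values, drop_bits):
    """Crude stand-in for lossy coding: discard the low bits of each value."""
    step = 1 << drop_bits
    return [(v // step) * step for v in values]


# Lossy: the result is close to the original but no longer identical.
lossy = quantize(samples, 6)
max_error = max(abs(a - b) for a, b in zip(samples, lossy))

print("lossless round trip identical:", lossless == raw)
print("lossy result identical:       ", lossy == samples)
print("largest lossy error:          ", max_error)
```

The error introduced by the lossy step is bounded by the size of the quantization step. A psychoacoustic model aims to place errors of this kind where listeners are least likely to hear them.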

Advocates of open source standards development and licensing for compression technology, the Musepack team is working on “a major new version of Musepack called ‘SV8’ (Stream Version 8). Public betas have been available for a while and currently we are finalizing everything and making sure no issues are left before the final release,” Berman reported. “This new version solves all limitations of the format, improves compression ratio and performance further, and makes it what we believe is an optimal audio format for end-users as well as developers.”

Proprietary, Industry Standard or Open Source

Nothing much has changed recently when it comes to the development of industry standard audio-visual compression technology, according to the Musepack team. “The control has been and remains in the hands of companies that are directly related to the movie industry. A controlling association of companies known as the ‘Motion Picture Experts Group’ (MPEG) is the only body that decides what formats will be ‘standardized’ — formats that they will license to those who intend to use them professionally for huge amounts of money.”

Open source developers are working to change that. They are developing a variety of audio and video codecs and file formats that proponents say are at least as good as, and in various ways better than, MPEG and other industry standards. Those efforts have at times run up against patent infringement and other intellectual property rights claims.

“Feature-wise, industry standard formats lack even the most basic features,” Berman and Keidar assert. “Today, more than ever, the ability to properly catalog your digital audio files is important. APEv2 tags were developed especially for the Musepack format. Today they are the standard tagging format of the Musepack, Monkey’s Audio, WavPack and OptimFROG formats. Ogg Vorbis also has its own standard tagging format, Vorbis Comments. Other open source formats do as well,” they added.

“Open source codecs such as Speex and GSM 6.10 are used because of their convenience, price and availability of source, which are always the benefits of open source,” Paltalk’s Scherer commented. “Paltalk benefited from open source codecs in the beginning when revenues were scarce and the market was yet to be explored.

“The open source stuff seems every bit as high quality as anything we have licensed, so we have not suffered for lack of a vendor to back it up. The only possible downside is that there is no barrier to entry; i.e., anyone can download and use them. For a commercial product hoping to differentiate itself in the market, this can be an issue.”

Packing It In: The Evolution of Online Video and Audio Tech, Part 2
