IBM Makes Speech Code Open Source

IBM today announced it is contributing proprietary voice-recognition software to two different open-source software groups.

The Apache Software Foundation will receive Reusable Dialog Components (RDCs), used for handling simple words for dates, times and locations. In addition, IBM is proposing a project at the Eclipse Foundation to donate markup editors for speech standards established by the W3C, the company said in a statement.

Voice-recognition technology is used in call centers, for hands-free mobile phone use in cars, and elsewhere.

Competing with Microsoft

According to the New York Times, the software IBM is donating cost the company about $10 million to develop.

IBM’s motive is not altruistic. By making the software open source, IBM hopes that other programmers will make improvements, thus speeding the development of speech applications and allowing IBM to compete with Microsoft in this growing market.

In March, Microsoft introduced Microsoft Speech Server 2004 for speech-enabled applications. According to Microsoft, more than 100,000 programmers have downloaded its free developers' kit for building speech applications on its Windows .NET technology.

Driving Innovation

“This is the latest step in IBM’s contribution to open source and to speech technology,” said Gary Cohen of IBM Pervasive Computing in a statement.

“By giving more standards-based speech resources to the development community, IBM hopes to accelerate development and drive innovation in all areas of the speech ecosystem — from speech vendors, to ISVs, to platform providers,” Cohen added.

IBM says the move is aimed at ending the battles over competing, proprietary specifications. It will spur the availability of speech-enabled applications by making it easier for developers to build and add speech recognition capability in a standardized way, the company said in a statement.

Supporters of the IBM open-source initiative include Apptera, AT&T, Audium, Avaya, Cisco, Fluency, Genesys, Kirusa, Loquendo, Motorola, Nortel, Nuance, Openstream, ScanSoft, Siebel, Syntellect, Telisma, TuVox, V-Enable, Viecore, Vocomo, VoiceGenie, Voice Partners and VoxGeneration.

Apache and RDCs

RDCs are pre-built speech software components that handle basic functions such as date, time, currency and locations (major cities, states, zip codes). For example, they allow a caller to book a flight over the phone.

Developed by IBM Research, RDCs are JavaServer Pages (JSP) tags that enable dynamic development of voice applications and multimodal user interfaces, according to an IBM statement.

IBM says it hopes to ensure that speech components built using the RDC framework will work together, regardless of the vendor that created them. Both the framework and a set of example tags are to be contributed to the Apache Software Foundation.

Eclipse and Speech Markup Editors

IBM’s contribution of speech markup editors to Eclipse is aimed at making it easier for developers to write standards-based speech applications as well as create and utilize RDCs within those applications, the company said in a statement.

Still in proposal stage, this contribution gives speech developers a standard way of writing VoiceXML applications. It will also give Web developers tools to more easily add speech access to their Web applications.

The contribution would form the starting point of an Eclipse project for open-source voice application development tools, which several companies in the VoiceXML community are expected to develop further.
