In May 2000, the World Wide Web Consortium (W3C) Voice Browser Working Group agreed to adopt the Voice eXtensible Markup Language (VoiceXML) specification from the VoiceXML Forum (founded by AT&T, IBM, Lucent Technologies, and Motorola) as the basis for developing a voice markup language. VoiceXML is an XML-based markup language that lets users access Web content and services using an ordinary phone. The language relies on a basic voice-recognition engine that resides on a VoiceXML server. The server lets callers navigate content by speaking commands and by pressing keys on the phone's keypad.

The Voice User Interface
Although VoiceXML is a relatively new concept, it has great growth potential. Rather than using a computer and Web browser to access a Web server, a user calls a VoiceXML gateway from either a conventional or cellular phone. The gateway retrieves a designated Web page from a VoiceXML content server, translates the information on that page into speech, and reads it to the user. Programming scripts can retrieve information from the Web site's database, as they can in traditional Web applications. Market analysts estimate that by 2005, 45 million wireless phone users in North America will regularly use "voice portals" and that the worldwide voice-browser market will grow to $26 billion.

The benefits of using VoiceXML technology include:

  • Wireless voice access to the Web using cellular phones
  • Anytime, anywhere retrieval of information from Web pages, which might include stock and portfolio alerts, directions, top news stories, weather, and movie times
  • Access to intranet corporate directories, including contact databases
  • Reading and sending email using a phone
  • Easier user interface than Wireless Application Protocol (WAP) devices, which have small displays and limited input facilities
  • Ability to use existing corporate Web infrastructure

The implementation of VoiceXML is simple. First, choose a VoiceXML gateway. (Several gateways provide a platform for developers to create and test their VoiceXML scripts. Tellme and BeVocal are popular examples.) Second, develop VoiceXML scripts, which look very similar to HTML documents. For example, VoiceXML uses a <field> element to indicate an input field. The primary source for the standard is the VoiceXML Forum. Third, establish a VoiceXML content server. The content server can be any Web server that the gateway can reach over the WAN. You just need to configure MIME type mappings for the two VoiceXML file extensions, .vml and .vxml.
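
As a rough illustration of the third step, the sketch below uses Python's standard http.server module as the content server and labels files with either extension as VoiceXML. The MIME type shown, application/voicexml+xml, is an assumption: it was registered for VoiceXML after this article was written, and a particular gateway may expect a different value. The names VoiceXMLHandler and serve, and port 8000, are placeholders chosen for the example.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Assumption: this MIME type postdates the article; check what your gateway expects.
VOICEXML_MIME = "application/voicexml+xml"


class VoiceXMLHandler(SimpleHTTPRequestHandler):
    """Static file handler that serves .vxml and .vml files as VoiceXML."""

    # Copy the default extension table, then add the two VoiceXML extensions.
    extensions_map = dict(SimpleHTTPRequestHandler.extensions_map)
    extensions_map.update({".vxml": VOICEXML_MIME, ".vml": VOICEXML_MIME})


def serve(port=8000):
    """Serve the current directory until interrupted. Not called automatically."""
    HTTPServer(("", port), VoiceXMLHandler).serve_forever()
```

Calling serve() from the directory that holds the scripts would make them available to a gateway that can reach the machine. Any production Web server offers an equivalent extension-to-MIME-type setting.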

I have developed a VoiceXML script that you can download. In this script, the caller selects a contact from the corporate directory by stating the contact's name. The script then automatically calls the contact's cellular phone. The advantage of this approach is that you don't need to carry a hard copy of your contact list at all times; you can just use the phone. To hear the script, call Tellme at 877-678-TELL. Then, enter developer ID 25585 and PIN 0765.

The following is a sample interaction.

Computer: Welcome to InterKnowlogy's voice-activated company directory. Please say the name of the employee you wish to contact. For Gail Fitzmaurice, say "Gail"; for Tim Huckaby, say "Huckaby."

Human: Gail

Computer: I heard you say "Gail." How would you like to contact the person? For email, press or say "1." For cell phone, press or say "2."

Human: 2

Computer: You said "cell." <calls Gail's cell phone>
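
A dialog like the one above could be expressed as a VoiceXML form with two <field> elements, one per question. The sketch below is not the downloadable script; it is a minimal reconstruction that builds such a document with Python's standard xml.etree.ElementTree module so that the markup is guaranteed to be well formed. It assumes the gateway accepts the standard <option> element for simple choices (some platforms of the period required their own grammar syntax instead). The field names, option values, and helper functions are hypothetical, and the call-placement step is left out because how a call is transferred depends on the gateway.

```python
import xml.etree.ElementTree as ET


def add_field(form, name, prompt_text, options):
    """Append a <field> with a spoken prompt and one <option> per choice.

    options is a list of (spoken_phrase, value, dtmf_key_or_None) tuples.
    """
    field = ET.SubElement(form, "field", {"name": name})
    ET.SubElement(field, "prompt").text = prompt_text
    for phrase, value, dtmf in options:
        attrs = {"value": value}
        if dtmf is not None:
            attrs["dtmf"] = dtmf  # keypad digit that selects this option
        ET.SubElement(field, "option", attrs).text = phrase
    return field


def build_directory_vxml():
    """Return the two-question directory dialog as a VoiceXML string."""
    vxml = ET.Element("vxml", {"version": "1.0"})
    form = ET.SubElement(vxml, "form", {"id": "directory"})
    add_field(
        form,
        "employee",  # hypothetical field name
        "Please say the name of the employee you wish to contact.",
        [("Gail", "gail", None), ("Huckaby", "huckaby", None)],
    )
    add_field(
        form,
        "method",  # hypothetical field name
        "How would you like to contact the person? "
        "For email, press or say 1. For cell phone, press or say 2.",
        [("one", "email", "1"), ("two", "cell", "2")],
    )
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_directory_vxml())
```

Mapping both the spoken word and the keypad digit to a value such as "cell" is one way a script could answer a pressed "2" with the word "cell," as in the sample interaction.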

What's Next for VoiceXML
You can imagine the multitude of business opportunities that VoiceXML technology makes available, including customer care, voice-commerce (v-commerce), and access to intranet applications over the phone. The VoiceXML Forum is promoting the language's capabilities, and its membership has grown to more than 260 supporting organizations, including Cisco Systems, Ericsson, and Oracle. VoiceXML is the only anytime, anywhere technology that lets you access vital corporate data using the easiest user interface of all: the human voice.