The Voice Portal -- Gateway
To A Universe Of Data
BY TIM WALSH
With industry pundits estimating that 18 million consumers will use
some kind of speech-recognition portal to access the Web by 2005, and that
voice portal market revenues could top $12 billion by the same year
(Source: The Kelsey Group), it is no wonder that voice portals have
captured the attention of start-ups, investors, and the media. Indeed, the
future of voice portals seems promising. But how this future will play out
in terms of application development, consumer adoption, and business
models remains to be seen. While exact predictions about the future are
best left to accredited prophets, we can, nonetheless, draw on what we do
know about speech recognition technology, consumer behavior and business
logic to make some educated guesses.
Advancements in speech recognition technology have made voice portals
possible, and it is safe to say that further enhancements and refinements
in this technology will play a major role in shaping the future of the
burgeoning voice portal industry. People want fast and easy access to
information, and voice portals, like the Internet, the newspaper, and the
radio, provide a means for accessing this information. The key
differentiator is that voice portals allow people to use the most natural
interface -- the human voice -- to access information when and where they
need it.
THE POWER BEHIND THE PORTAL
While the voice portal concept is relatively straightforward, delivering
on this concept poses a significant technological challenge. Cheaper, more
powerful computers, the wide proliferation of wireless devices, and vastly
improved speech recognition software have spurred the voice portal
industry. Moving forward, success will increasingly hinge on the ability
of a voice portal's speech recognition engine to create positive caller
experiences by performing the following complex functions:
- Support both open and closed grammars, industry terms that mean
developers and integrators can build applications that either
recognize any and all ways a request or statement can be phrased, or
that only recognize specific sentence and question constructions. With
open grammars, the application developer only needs to specify the
categories of information that are required to proceed. Callers can
speak "in their own words," a feature known as natural
language understanding. Unknown words are allowed and calls are not
rejected. By supporting both techniques, application developers can
choose whichever method suits their needs or mix the techniques in a
single application.
- Correctly interpret the meaning of a caller's request. Called
"attribute semantics," this capability focuses on
associating a meaning with a request or statement, rather than simply
recognizing the individual words. For example, for the statement
"I would like the score of last night's Knicks game," the
recognizer understands the meaning of "last night" and
applies the appropriate date.
- Expedite the dialogue process by "accumulating
confidences." Through this technique, words that have the same
meaning can be assigned the same attribute, and then the system's
confidences for each attribute can be accumulated. The result is that
an application is very certain when it recognizes a statement and
interprets its meaning. The system is more likely to proceed
intelligently, with less need to verify with the caller and thereby
prolong the dialogue. In addition to the benefits to the caller, this
feature also offers various timesaving benefits to the speech
application developer.
- Process complex statements by employing "mixed models" --
whole-word and phonetic recognition methods employed simultaneously to
understand complex statements better.
- Facilitate "mixed initiative" dialogues, which eliminate
complex menu structures and annoying prompts by enabling the caller to
take control of the dialogue.
- Support multiple languages and accommodate regional dialects.
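Two of the functions listed above, attribute semantics and accumulating confidences, can be illustrated with a rough sketch. It is not the API of any particular recognition engine: the phrase table, the synonym table, the team codes, the confidence values, and the 0.6 threshold are all assumptions made up for illustration, and "accumulating" is read here simply as summing the confidences of hypotheses that share a meaning.

```python
from datetime import date, timedelta

# Hypothetical phrase table: spoken phrase -> day offset from "today".
RELATIVE_DAYS = {"last night": -1, "yesterday": -1, "tonight": 0, "tomorrow": 1}

# Hypothetical synonym table: different wordings share one attribute value.
TEAM_ATTRIBUTE = {"knicks": "NYK", "new york": "NYK", "nets": "NJN"}


def resolve_date(utterance, today):
    """Attribute semantics: turn a relative phrase into a concrete date."""
    text = utterance.lower()
    for phrase, offset in RELATIVE_DAYS.items():
        if phrase in text:
            return today + timedelta(days=offset)
    return None  # no date phrase found; the dialogue would have to ask


def accumulate_confidences(hypotheses):
    """Add up the confidences of all hypotheses that share an attribute value."""
    totals = {}
    for words, confidence in hypotheses:
        value = TEAM_ATTRIBUTE.get(words.lower())
        if value is not None:
            totals[value] = totals.get(value, 0.0) + confidence
    return totals


def needs_confirmation(totals, threshold=0.6):
    """Ask the caller to confirm only if no attribute value is confident enough."""
    return max(totals.values(), default=0.0) < threshold


if __name__ == "__main__":
    request = "I would like the score of last night's Knicks game"
    print(resolve_date(request, today=date(2001, 3, 15)))

    # Made-up recognizer output: (recognized words, confidence).
    n_best = [("Knicks", 0.40), ("New York", 0.35), ("Nets", 0.25)]
    totals = accumulate_confidences(n_best)
    print(totals, needs_confirmation(totals))
```

In this made-up example no single hypothesis reaches the threshold on its own, but the two wordings that mean the same team do so together, so the application could proceed without asking the caller to repeat or confirm the request.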
DEFINE MARKET NEEDS
While the scope of voice portal applications is seemingly infinite,
developing unique services for targeted markets is paramount. With voice
portals, offering a few frequently used services may be more feasible than
hitting the market with a big-bang service offering. For example, the
largest and longest-running voice portal (Italy's Omnitel 2000 with 3,000
ports) offers 100 services, but the vast majority of the calls Omnitel
receives go to a relatively small number of services. (Horoscopes are
number one and lotto numbers are in the top five.) In this instance, a
large-scale service offering turned out not to be as important as the
project's planners had envisioned.
Another emerging voice portal play is in enhancing the functionality of
enterprise intranets. More and more, companies rely on intranets to
disseminate information among employees. With the ubiquity of mobile
devices and an ever-increasing mobile workforce, businesses can leverage
enterprise voice portals to streamline communications, increase
productivity and drive cost-saving efficiency gains.
As with consumer-facing voice portals, the key will be to build
applications around specific, need-driven services. Applications will no
doubt vary from company to company, but some possible services may include
benefits information, the company stock price, help desk tickets, purchase
requests, or just voice dialing from the company directory.
A common denominator in the success of both consumer- and
business-facing voice portals is ease of use. To this end, technological
advances that "humanize" voice portal applications are vital.
The beginning and end of voice portals lie in the power of the spoken
word, with all of its idiosyncrasies, dialects, mispronunciations and
phrasings.
Eliminating complex menu structures and annoying prompts; allowing
callers to speak as they would in natural conversation, in their real
voices, as if they were talking to another person; and supporting
multiple languages and regional dialects are the essential
prerequisites to the widespread adoption of voice portals.
THE TELCO ADVANTAGE
Voice portal providers have varying business models. Most Internet-based
voice portal companies rely on advertising dollars, subjecting callers to
anything from a brief sponsorship mention or a five-second ad to as much as
a twenty-second commercial. But ads could pose problems. According to a
recent survey of 1,000 consumers, nearly half are very likely to use voice
portals, but fewer than a third are willing to use them if forced to listen
to ads (Source: Cahners In-Stat Group).
Moreover, Internet-based voice portal companies employing an
advertising-supported business model must learn to compete in the
advertising sales business -- a core competency few start-ups possess.
Finally, Internet-based voice portal companies must devote significant
resources to promoting their names and toll-free numbers, and to educating
consumers on a brand new way to search for information. The high
expenditures associated with these activities present a formidable
obstacle along the path to profitability.
On the other hand, telcos and other experienced service providers
already have relationships with their customer base and a mechanism for
generating revenue -- by minute, by call, by monthly access rate, and so
on. Some telcos may even offer the voice portal for free, as a
differentiating service providing a "sticky" relationship with
the customer.
We have ample evidence that consumers will accept all the free services
they can get.
As for the voice portals behind the current business surge, they must now
endure the test of time. Perhaps no factor will be more critical to their
sustained popularity than exposing more and more people to the
power of the human voice.
Tim Walsh is vice president of sales and marketing, Americas, Philips
Speech Processing. Visit www.speech.philips.com
for additional information.
[ Return
To The March 2001 Table Of Contents ]
Speak
To The Web
BY STEVEN DUNCAN
Although the technology is still in its toddlerhood, the term
"voice portal" typically elicits images of mobile consumer
services providing everything from horoscopes to stock quotes to traffic
reports. Voice portals' ease of use and entertaining nature are ensuring
their place in the spotlight right now. The true value of voice portals,
however, lies not in the general consumer application but in the directed
or enterprise application. These targeted voice portals allow staff,
partners, clients, or other specific audiences to use speech recognition
technologies to gain access to personalized information, while maintaining
privacy.
The beauty of speech recognition technology lies in its familiarity to
the end user, its speed (since users can skip menus and cut to the heart
of their request), and its broad applicability. But in the business world
the largest benefit is in access to information and transaction capability
for a mobile audience. In enterprise or business-to-business (B2B)
applications, suppliers and buyers, clients and partners can access
information from each other's Web sites without needing a PC.
Functions such as scheduling, parts ordering and status, inventory
tracking, and the like can be completed over the phone, 24/7, without the
aid of a human agent. These technical capabilities, powered by voice, will
help to further cement existing partnerships and client relationships.
We are on the brink of developing true natural dialog applications,
which will soon make Web surfing via voice and unconstrained dialogs with
an application a reality. Combined with the industry's present high-quality
voice verification technology for user authentication and security, this
will only add to the growing number of voice sites being implemented. With
market projections for mobile commerce in the next five years ranging from
$12 billion to $200 billion, depending upon the analyst firm, this presents
an attractive opportunity for the growth of speech.
From an end-user perspective, B2B voice portals are made for speech
recognition. Business portals have an audience that is attuned to the
language used in a specific enterprise. As such, targeted voice portals
can deliver business information of intrinsic value, focused on the needs
of a specific audience, because they can operate with a more limited
grammar or vocabulary. This greatly increases the accuracy of the speech
recognition application, which in turn creates greater end-user
satisfaction. Finally, since the delivery is in voice, business voice
portals can be crafted with a personality, and with the audio feel and
persona that a business would like to present to its customers. This
personality is consistent, without the variation found among customer
service representatives (CSRs), and can reinforce a brand or image for an
enterprise.
As we move further on with the technology and industry standards
efforts, we will begin to see greater use of natural language recognition
in applications. Much as we have shopping bots and virtual assistants on
the WWW today, we will soon begin to see enterprise voice assistants to
help us conduct business so that users can more readily surf this growing
"voice Web" as well.
Steven Duncan is head of marketing for Applications and Messaging,
Mitel Corp. For more information, visit www.mitel.com.