Is speech recognition dictation software just for techno-geeks who are too
lazy to type? Do you know of anyone who actually uses speech rec dictation
software? As a test, stand up and look over into the adjoining cubicles.
Is anyone talking to himself? Probably not. Maybe shyness is why speech
rec dictation just hasn’t seemed to take off. Perhaps people are
self-conscious about talking to themselves, or they think that speech rec
is inaccurate. Maybe they don’t want confidential information dictated
aloud, or maybe they don’t want to spend the time and effort training
the speech rec software to recognize their voice patterns. Although speech
rec dictation technology has had its problems, we were eager to examine
L&H’s Dragon NaturallySpeaking Professional v5.0 software. We used
the product extensively to determine whether the knocks against speech rec
dictation were justified.
We tested the Professional version, but there are lower-end versions
(Essentials, Standard, and Preferred) that cost less and have fewer
features. The products are largely the same, with more bells and
whistles added at each higher price point. After installing the
software and running the audio and microphone optimization wizards, we
performed the first training session, which consisted of reading some text
into a Plantronics DSP-500 headset/microphone (which is also reviewed in this
issue). We liked how the text changed color as we read each individual
word. This color change acted as a placeholder. If we misspoke, or paused
for too long, an arrow pointed to the current word the software was trying
to train.
After just a 20-minute training session, we were ready to test-drive
the software. The accuracy was pretty good, but we decided to improve it
by performing some more training, which included passages from 2001:
A Space Odyssey and Charlie and the Chocolate Factory. Other
interesting literary passages are also included, even some humorous ones
to help liven up the task of reading aloud for 20 minutes.
After our training, we decided to test the accuracy and performance of
the software by reading a few pages from a book we randomly selected from
a bookshelf. We read two pages, with 377 words on the first page and 389
words on the second, for a grand total of 766 words. We counted every
instance of incorrect punctuation, incorrect capitalization, and
incorrectly recognized words as an error. We encountered 20 errors on the first
page dictated and 21 errors on the second page, for a total of 41 errors,
resulting in an accuracy rating of 95 percent. The total time without
correcting any mistakes was 9 minutes 32 seconds, for an amazing average
of 80.3 words per minute. Touch typists, eat your heart out!
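For readers who want to check our arithmetic, here is a minimal Python sketch that recomputes the accuracy and speed from the page counts reported above. The variable names are ours and purely illustrative. The unrounded accuracy works out to about 94.6 percent, which we rounded to 95.

```python
# Figures reported in the dictation test above.
words_per_page = [377, 389]
errors_per_page = [20, 21]
dictation_seconds = 9 * 60 + 32  # 9 minutes 32 seconds

total_words = sum(words_per_page)    # 766 words
total_errors = sum(errors_per_page)  # 41 errors

# Accuracy is the share of words that were NOT counted as errors.
accuracy = 1 - total_errors / total_words

# Speed is total words divided by elapsed minutes.
words_per_minute = total_words / (dictation_seconds / 60)

print(f"Accuracy: {accuracy:.1%}")
print(f"Speed: {words_per_minute:.1f} words per minute")
```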
Next, we decided to play with some homophones such as “bear and bare,”
“they’re, their, and there,” “steak and stake,” and the similar-sounding
“we’re and where.” In the case of “bear and bare,” the
software always seemed to choose “bear” regardless of context. This
resulted in the incorrect sentence, “The cupboard was bear.” “Steak”
also seemed to be the preferred choice, regardless of the context. On the
other hand, the software always chose the proper spelling of “they’re,
their, or there” depending on the context of the sentence. Likewise, “we’re
and where” were correctly identified.
The NaturallySpeaking product line has something called the DragonBar,
which you can set to float independently on top of other applications, “cling”
to the top of the current window, dock to the top or bottom of the screen,
or simply dock in the system tray as an icon (our preferred mode). We could
simply click on the icon in the system tray to quickly turn the microphone
on or off, a very useful feature.
With the DragonBar application running, we could not only dictate into
DragonPad (the built-in word processing program), but we could also
dictate into almost any other Windows application, including Outlook and
Microsoft Word. We were even able to dictate large amounts of text into
MSN Messenger, which made the person at the remote end think we were
typing speed demons! You can also launch popular programs simply by
saying, “Start Internet Explorer” or “Start Word.”
The software features the ability to surf the Web without touching the
keyboard. You can say “Go to address” and then say the URL. You can
click text links and buttons on a Web page simply by stating the text link
or button name. If the text link or button name is long, you don’t need
to say all of it, just enough to distinguish it from other links on the
page. If more than one link matches, the software numbers each matching link
on the Web page and marks it with an easy-to-see arrow. Then you can simply
say “Choose 1” or “Choose 2” to load the page you want. Other navigation
features include the ability to say “Go back,” “Go forward,” “Refresh,”
“Go to favorite [favorite name],” “Go home,” and more. We were
impressed with the software’s ability to quickly process the hyperlinks
that we spoke, but we noticed that our Pentium 1GHz PC took a slight
performance hit. Also, we couldn’t break the habit of using the mouse
and keyboard to surf the Web, no matter how hard we tried. But we do
commend L&H for adding this feature, which can help make the Internet
more accessible for the disabled. We should mention that e-mail navigation
is also available, allowing you to open e-mail as well as dictate e-mail
to be sent. Those who send large numbers of e-mails per day will enjoy the
integration into several popular e-mail clients. You can even listen to
e-mail messages that your computer reads aloud via text-to-speech
technology.
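The partial-name link matching described in this section can be pictured with a short Python sketch. This is only our illustration of the behavior we observed, not L&H's implementation: we assume simple case-insensitive prefix matching, and the function names `match_links` and `resolve_spoken_link` are hypothetical.

```python
def match_links(spoken, link_texts):
    """Return the links whose text starts with the spoken phrase.

    Assumption: case-insensitive prefix matching. The review does not
    say how the product actually compares speech to link text.
    """
    phrase = spoken.strip().lower()
    return [text for text in link_texts if text.lower().startswith(phrase)]


def resolve_spoken_link(spoken, link_texts):
    """Decide what to do with a spoken link name."""
    matches = match_links(spoken, link_texts)
    if len(matches) == 1:
        # Exactly one candidate: follow it right away.
        return ("open", matches[0])
    if len(matches) > 1:
        # Several candidates: number them so the user can say "Choose 1".
        return ("choose", dict(enumerate(matches, start=1)))
    return ("none", None)


links = ["Products and Services", "Product Reviews", "Contact Us"]
print(resolve_spoken_link("contact", links))
print(resolve_spoken_link("product", links))
```

In this sketch, saying "contact" is enough to open "Contact Us," while "product" matches two links and so produces a numbered list to choose from.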
The more you use the program, the more accurate it becomes at
recognizing your voice. We like that. You can also customize the
250,000-word vocabulary with words you use every day. In addition, you can
quickly add to the vocabulary by scanning existing documents for words you
use that are not in NaturallySpeaking’s vocabulary. Another neat feature
is that the software supports several mobile recorders, so recordings made
away from the PC can be transcribed automatically later.
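The idea of scanning existing documents for words missing from the vocabulary can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not a description of how NaturallySpeaking does it: the function name `find_unknown_words` is hypothetical, and we treat the vocabulary as a plain set of lowercase words.

```python
import re


def find_unknown_words(document_text, vocabulary):
    """List words in the text that are not in the vocabulary.

    Each unknown word is reported once, in order of first appearance.
    """
    known = {word.lower() for word in vocabulary}
    # A word starts with a letter and may contain apostrophes or hyphens.
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", document_text)

    unknown = []
    seen = set()
    for word in words:
        key = word.lower()
        if key not in known and key not in seen:
            seen.add(key)
            unknown.append(word)
    return unknown


vocabulary = {"the", "docks", "in", "tray"}
print(find_unknown_words("The DragonBar docks in the tray.", vocabulary))
```

The words returned by such a scan would then be candidates for adding to the custom vocabulary.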
Although you can dictate in Microsoft Word, we found ourselves
preferring to use the DragonPad word processor because it actually records
your voice and associates it with the text displayed on screen. Within
DragonPad, you can use the mouse to highlight some text, right-click, and
then choose “play that back,” and you will actually hear the
highlighted text played back in your own voice. This is useful for
determining why the software did not recognize something. Further, someone
who is dictating can simply speak continuously without stopping to correct
recognition mistakes. This is an especially important feature since the
person dictating may lose his train of thought if he has to stop and fix
recognition errors that he sees on screen. Also, the document with the
associated voice recording can even be given to a personal assistant for
double-checking and correcting any recognition errors.
The Professional version of NaturallySpeaking includes time-saving
macro shortcuts to insert boilerplate text, fill out forms, and more, all
by speaking a few simple commands. It also includes a guide for creating
voice commands and using the scripting language. Essentially, the
Professional version comes with two development tools: L&H SpeechLinks
and L&H SpeechDocs Filler. L&H SpeechLinks is a Microsoft Visual
Basic for Applications (VBA)-compatible program that you can use to record
mouse movements and keyboard entries, build formatted text macros, and
customize and integrate data among applications.
Overall, we were very pleased with the L&H NaturallySpeaking product
line. Multiple training sessions may be necessary to reach the 95
percent accuracy that we achieved, but with a little patience, it’s
worth the effort. We felt that the $695 price tag for the Professional
version was a bit steep, especially since the Standard ($99.99) and
Preferred ($149.99) packages cost much less with virtually the same
functionality. Considering that most people won’t need the advanced
scripting capability of the Professional version, or its couple of other
bells and whistles, we recommend going with one of the low-end
versions. Price notwithstanding, TMC Labs was quite impressed with
NaturallySpeaking Professional’s ease of use, tight integration, 
support of third-party applications, and of course its excellent
recognition accuracy.
[ Return To The December 2001 Table Of Contents ]