Posted By: mstefan | Feb 9th, 2007 @ 8:42 PM
page 1 of 1
Comments: 8 | Views: 5340
Well, I decided to finally replace my aged webcam and microphone with a new setup from Logitech. Remembering the post here a while ago asking why Microsoft didn't heavily promote speech recognition in Vista, I decided to give it a shot.

I used the boom microphone that was part of the Logitech headset, not the webcam, since I figured that would give me the best results. I ran through the initial tutorial (which was done very nicely, by the way) and then through three training sessions. At that point, I figured that Vista would be all set and began my adventure.

Ouch. Well, I found the answer to my question. It simply doesn't work reliably. It would get seriously confused when I would say "Close That" or "Cancel", it didn't know how to scroll down or select items in a number of applications I tested, and dictation... not a chance. Even something as simple as starting a program was dicey. Alas, that wonderful tutorial has no connection whatsoever to reality. When using dictation, simple phrases like "this is a test" came out as garbled nonsense in Word.

And to top it all off, I have the CPU gadget on my desktop. Normally, it's sitting around 1-3%, 10-15% if I'm doing a few things (that aren't compute-bound, obviously). With speech recognition, my poor system was constantly red-lined, pegging both cores at near 100%.

I can appreciate Microsoft wanting to push the envelope, but I'm sorry, this is just broken behavior. The "wow" is definitely nowhere to be found in their speech recognition code. Fortunately for their poor PSS folks, I doubt most users will even be aware that it's part of the operating system.

I love Vista, but this is a feature that I think would have been better left on the shop floor until it's retooled and working.

Something is wrong.

I have Vista Enterprise on a 3-year-old GW M200 Tablet (1.2 GHz, 512 MB RAM) and SR works REALLY well.

I have never really watched the CPU but I have never noticed SR having any effect on the performance of Outlook or IE.

I often lie in bed browsing websites without touching the mouse or keyboard, just using the built-in mic, and I would guess that I am getting 98% to 99% accuracy. (Controlling windows, following links, mouse-grid, very little dictation.)

And to top that off, I have not done any training beyond the intro tutorial.

Jorgie

I have not experienced many problems when using speech. Sure, sometimes it will get it wrong, but normally only with dictation.

I have actually found it very reliable when using commands, reliable enough that I regularly use it.

I did sit down for an hour training it though.
I live in a building where the electricity is not grounded, and as a result I get a nasty buzz on the microphone input.

And I still get better results than what you describe. Have you tried recording yourself as you speak (using Sound Recorder or something) to see how you sound to the computer?
Sven Groot wrote:
I live in a building where the electricity is not grounded, and as a result I get a nasty buzz on the microphone input.

And I still get better results than what you describe. Have you tried recording yourself as you speak (using Sound Recorder or something) to see how you sound to the computer?


Yup, it sounds normal to me. No static, hissing or strange audio artifacts (that I can hear, anyway). I'm using a mid-range USB boom microphone/headset.

For basic commands, I would say that it was about 70% accurate. For dictation, utterly useless. I could improve accuracy if I spoke like a robot in a monotone voice, pronouncing - each - word - very - slowly. Of course, that completely eliminates the usefulness of voice dictation. I'm a fairly good typist (I'm not one of those "hunt and peck" coders), and in the amount of time it would take me to dictate a paragraph and make the corrections, I could probably have two pages typed out.

The thing that was most surprising to me was the sheer amount of CPU it was using. I realize that analyzing voice data is compute intensive, but when just the voice recognition is chewing up at least 90%, that makes it functionally useless for anything but the most trivial tasks. If your goal is to just open IE and browse websites, I guess that's okay. But for real world, practical business use? We ain't there yet.
'Start Listening'

I have found Vista Speech Recognition very reliable. Sure, it doesn't always get things right, but it's still very powerful.

I recently did some grok talks, at DDD and WebDD in the UK, and have now uploaded the first demo video, so here is the link if anybody is interested.

http://www.nxtgenug.net/Article.aspx?ArticleID=145

'Stop listening'
'Mousegrid'

//sorry, couldn't resist

Ok... I have very mixed feelings about Vista's Speech Recognition. When it is working properly it is amazing. My biggest issue with it is my inability to find resources when I have a problem. My problems should be simple. In some sessions, when I say 'stop listening' it will turn the SR mode to 'Off: Don't listen to anything I say'. Since it is not listening to anything I say, it does nothing when I say 'Start Listening'. Other times (e.g. after a restart), it will go to 'Sleep: Listen only for "start listening"'.

I would be a happy man if I knew the answer to any of these questions:

-Why does the same command produce different results at different times?

-Is there a setting similar to "For the 'stop listening' command, always put SR in Sleep mode"?

-Which setting governs this: when using Microsoft Office (Professional) Word 2007, why will it in one session dictate what I say without question, and in another bring up the Correction Dialog after every phrase?

I tried going to the Vista discussion board with these questions and I was told to go back to IBM OS/2 for good SR. That wasn't exactly helpful.

mjmetzger@minnair.com if you'd like to reply directly. I just created my Channel 9 account and haven't had a chance to fill in my profile.
