This is…
…the story of my life.
And I don’t even have an accent, so what’s my excuse?
I’ve noticed that, in general, voice recognition systems don’t like me. Back when my arm injuries were very bad, I tried to use Dragon Dictate, and it had a dreadful time understanding my voice. And yet actual people seem to have no problem comprehending me, and when I hear a recording of myself speaking I don’t hear any lack of ability to articulate properly. What gives?
It’s different when you have two people with dramatically different accents. I recall a Reader’s Digest story about a WWII ship with a NE captain and a southern crew–the guy telling the tale was the intermediary between them: what the captain wanted them to do, and what they wanted the captain to know.
I recall a southerner asking me “where’s the piston rings?” I thought, that’s not right. Ah, he wants to know where is the pistol range!
The software may be designed to understand a neutral accent; that is not a characteristic of New England spoken language. If you live somewhere long enough you pick up the local accent; your neighbors understand you but your old friends from elsewhere might not so easily do so.
There’s a reason why newscasters are taught to speak in a generalized flat midwestern tone; could you imagine trying to understand someone with a thick Boston, Cajun, or Texan accent…?
New England does have an accident, Neo.
It’s those Valley Girl intonations and pacing. Also your habit of ending every sentence with a question mark.
I was a GI brat, then spent a career in the Army. I’ve been told I have a “Hi, I’m from Nowhere!” accent. Nonetheless, I cannot be understood in telephone menus (say Yes or No. Please say Yes or No. I’m sorry, I can’t make out what you said. Please say yes or no.)
I was well understood over radio in the military. Still, I have trouble being understood over the phone (like when ordering Chinese take-out). Even drive-through window service can be a problem.
Now I just hand the phone to my wife. I know she’ll be understood — she was an operator for Ma Bell for several years. (at least SHE understands me!)
I have an extremely generic Midwestern-type accent. I think I explained why in some post or other, but don’t have the time to search for it now. The gist of it is that in college I was teased for my very slight NY accent, and I trained myself to speak generically. So the problem is not my accent.
Size. I have the same problem, and suspect it has to do with physical size. Real humans can read lips along with what we are saying, even if they don’t think they know how. But machines can’t. And, big or little, beyond a certain range there is just something atypical about how we speak. I have even worked hard to not have that nasally sound so many giants have, perhaps overdoing it, but… At least no one cringes when I speak, and most can understand. Even if you aren’t exactly a pixie-sized female, if you have very fine detail and are rather lightweight, your frame and internals may be petite as well. For me it is pure size.
Just my two cents. Just a suspicion, as well, albeit. But…
Voice recognition software should be able to comprehend midwestern accents, as we in the midwest speak in a bland tone/accent. Just be glad you do not have a Scottish accent.
Voice recognition is nice for those whose hands don’t work very well, but for the rest of us, it’s Kewl-Factor Overengineering. Like the analog dials in an airplane cockpit, they’re not improved by going digital.
Also, a performative of British Accents.
h/t David Thompson, whom everyone should read
From what I recall the last time I heard Neo’s voice recording, some odd years ago, she was difficult to understand in the beginning, as if her voice did not get recorded. It took a while but natural heuristics let me figure out the words.
I think there’s something with the audio software that just doesn’t pick up your audio, Neo.
It might be the microphone. I experimented with a couple or three voice recognition systems a few years ago. They were extremely sensitive about microphone quality.
I used to have that Army Brat “Man from Nowhere” accent (yes, I too am an Army brat), but I suspect after the last ten years I’ve picked up something of a South Philly accent. It’s jarring on the ears when you hear other people speaking with that accent, but you don’t notice it when you listen to your own speech unless you record yourself.
I have an ear for certain types of accents and can mimic them when I want to, and as I formulate the words in my mind I hear them in that accent, but I also know what they would sound like with a neutral accent. Your computer software would need to be able to do that in reverse, much like Ginger Rogers dancing with Fred Astaire. Unfortunately your computer software was programmed by Sheldon from The Big Bang Theory…. 🙁
OMG I am so sorry to read about your pain. I deliberately did not read the link carefully because it requires more concentration than I can give it now. I will read tomorrow.
What doctors diagnose as nerve pain is often severe muscle pain caused by trigger points. Those are relatively simple to resolve (with dramatic results) but few doctors understand that.
You slip under the radar. You cannot be identified.
The Ghost in the Machine.
Your magical talents are interfering with the electronics. That is a common effect well known to the writers of urban fantasy.
I just go through about ten minutes of punching numbers to get a real person on the phone and hope they speak English well enough to help me. I have a West Texas plains accent that sounds like the bad extras in old, very old cowboy movies when they used real cowboys. Even though I still call oil wells, all wills and I get the all changed in my cur I can be understood by people most of the time but not ever by machines.
Sounds like language discrimination. They should add another language on top of Spanish then.
You are not alone, neo. As with others above, I cannot get the dadgummed machines to understand “yes” or “no.” Much less, “My computer doesn’t work.” Then when I get a human, they don’t do a good job of understanding me either because English (and western slang especially) is not their first language. What was supposed to increase efficiency has not. Except it costs the companies less. (No medical insurance, days off, vacation, FICA, etc.)
I don’t know if this will help, but I found out that microphones, especially those on headsets, vary significantly in coherence. Even the change to a stand-alone microphone may make a significant difference.
You might ask around for any recommendations on the best types. Just as with copiers and printers, and with OCR, the quality of the output varies. You might just need a bit more voice differentiation, and the headset microphone you used may not have had the ability to deal more finely with your voice.
For some people–such as you–the ability to take advantage of a program like Dragon can make a huge difference in their workload, so you might look further into it.
neo:
Please, please do not misunderstand my post. I am not whatsoever suggesting that your medical attention was mistaken because it is impossible for me to know.
I post only because I have some perspective on nerve pain and nerve diagnosis, which may be absolutely irrelevant.
And I totally realize that folks with pain have had their fill of advice (and I also understand your surgery was successful).
People blab at you without knowing.
I want to be careful in what I say, so I am not understood to say too much.
But from my own situation I now understand that what very good doctors diagnose as nerve pain, even with the damage you describe (swelling etc) is not nerve related (or originated, to be more accurate) but muscle related.
Trauma happens to the muscles (and related anatomy) which short-circuits the usual functions in a devastating way.
Again, I don’t want to sound like a babbling idiot, which I may very well be – – I am just a well meaning dolt.
I have had two very specific and horrific conditions which were misdiagnosed as nerve damage, something I definitively found out after many years, after which they were simply corrected.
It was never a consideration to sue because human life is complicated and it is a breach of peace on Earth to pursue those who make honest mistakes.
This may be, and probably is, a statement which does not apply to you, but I make it just as an avenue of exploration:
Pain diagnosed as “nerve” pain may be muscle pain caused by trauma to the functioning of the muscles and related anatomy. And it can be addressed (sometimes) by a very simple but highly effective, non-invasive procedure.
But it worked for me, after 25 years of suffering, and so I have a bias.
Yes, Neo, you do have an accent. It’s pretty soft and quite pleasant, but everyone has some kind of accent. I already knew you were from New York, so it was no trick to say, “‘At girl’s not from Boston.” Someone mentioned a “thick Texas Accent”. Which one? There’s East and West, and I can hear the difference among Austin, San Antonio, Houston and DFW.
The damned machines ought to be able to understand all of them, though. There are little vowel differences, but mutually intelligible.
And also, I have always had a problem with people (and, a fortiori, mechanical systems) understanding my voice.
It has been a constant difficulty.
There are about two dozen explanations I consider for this reality, but it is a strange reality.
So what, that is life.
Tonawanda:
I know you’re trying to be helpful, and I’m not taking offense.
In my case though it was always, unequivocally nerve pain. For example: all of my pain was in the distribution of the ulnar nerves. My pain was clearly neuropathic, a type of pain which is very very different from muscle pain or other pains. It involves disturbed sensations primarily, of tingling, burning (especially burning, as though scalded, all along the path of the nerves), tremors and weakness in the fingers innervated by those nerves, and a host of other things too tedious to mention but very clear. What’s more, when they did surgery on me and exposed the nerve at the elbow, they discovered that the nerve itself (even though this was 9 years after the original injury) was very red and swollen, and had been pinned down by an unusual amount of scar tissue that they had to dissect away.
And I am about 85% to 90% better now than I was before the surgery.
There’s really little doubt it was a nerve injury, and no doctor I went to ever doubted that. The question was whether surgery would help, and if so which one, because the nerve was thought to be compressed at at least two levels.
From my experience, Neo’s experiences correspond to a damaged or pressured nerve. In martial arts, impacts can deaden nerves, especially on the arm or near the ulna elbow spot.
One time I slammed my elbow down on a desk and the nerve compressed, producing extremely strong signals.
A lot of natural stretching also tends to pull on the nerves, and people experience a weaker form of it when they hit the limits. Stretching doesn’t make nerves regrow or grow longer, but it can rearrange the internal muscles and tissues so that they stop pulling on the nerves.
http://www.clubbell.tv/
Something like that, and the education therein, should improve joint health for Neo. But softer and slower movements, like Taiji Chuan, would have a lot less impact on the nerves and muscles. Which one would be better rehabilitation depends on fitness and energy levels.
I have to swing a steel sword around without poking myself with it, so shoulder flexibility and strength matters, a lot, to me. If joints get damaged, bad things start happening.
Voice recognition, like handwriting recognition, is due Real Soon Now.
It can work fairly well for a wide array of languages and dialects for a small number of words, especially if you take the time to “train” it to your own tongue/dialect.
But generic voice recognition is still a hard task not well accomplished.
When I AM trying to speak to one, I attempt to place as little inflection and speak in as “robotic” a manner as possible. “Yes”. “No.” “One. One. Five. Three. Seven. Bee. Cee. Ef.” with the periods representing full and clear stops for a substantial fraction of a second.
Realize that much of what makes humans humans — “common sense” — is not available to computers at this point. When we hear someone, a lot of time we’re actually figuring out what they said as much as hearing it. We hear something that sounds like “x y z” and only as we process it do we realize they said “Ecks: Why ‘Z’?” because we place it in context. We do more than hear, we THINK about what we hear, too.
So reliable, general-purpose voice recognition is still a good ways off.
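To make the "figuring out as much as hearing" point concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the tiny `CORPUS` and the helper names `bigram_counts`, `score`, and `pick` are made up for this example, and real recognizers use far larger statistical models. It shows the idea of choosing between two sound-alike transcriptions by asking which one fits previously seen word pairs better.

```python
from collections import Counter

# Hypothetical, hand-made "experience": phrases the listener has heard before.
CORPUS = [
    "where is the pistol range",
    "the pistol range is closed",
    "replace the piston rings",
    "the engine needs piston rings",
]

def bigram_counts(sentences):
    """Count how often each adjacent word pair appears in the corpus."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts

def score(candidate, counts):
    """Sum the corpus counts of every adjacent word pair in the candidate."""
    words = candidate.split()
    return sum(counts[pair] for pair in zip(words, words[1:]))

def pick(candidates, counts):
    """Return the candidate transcription that best matches known word pairs."""
    return max(candidates, key=lambda c: score(c, counts))

counts = bigram_counts(CORPUS)
heard = ["where is the piston range", "where is the pistol range"]
print(pick(heard, counts))
```

With this made-up corpus the sketch prefers "where is the pistol range", because "pistol range" has been seen before and "piston range" has not. A human does this with a lifetime of context; the toy only has four sentences.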
}}} The damned machines ought to be able to understand all of them, though. There are little vowel differences, but mutually intelligible.
No, you’re wrong here, because you’re presuming a skill that computers don’t yet have, particularly not to the level that humans do:
pattern recognition
Y’all should understand — humans are literally AMAZING pattern recognizers. It’s what we’re spectacularly good at. We’re so good, we see patterns that AREN’T there. That’s what “Numerology” and “Astrology” and the like are all about.
Computers can do SOME things well in this regard — they’re better at pulling THOUSANDS of data objects up and finding SIMPLE patterns in them.
But complex patterns, like “if ‘t’, then ‘a’, if ‘y’, then ‘x’, if ‘t’ then NOT ‘a’, but ‘z’, then ‘f’”: the more complex the alternatives get to be, the harder it is for the computer to do it right.
And it will NEVER infer from what has been coded, the way a human would.
Example: I have a cat. It has fleas. Tell me how you would go about getting rid of the fleas.
Well, a computer program might realize — heat kills fleas, so throw the cat in a furnace.
You and I would know automatically that was not an acceptable answer, but the computer does not know enough to INFER that, since the cat would also die, that’s not an acceptable solution at all.
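The furnace example can be sketched in a few lines of Python. Everything here is invented for illustration: the `ACTIONS` table, its numbers, and the `best_action` and `cat_survives` helpers are assumptions, not any real system. The point it tries to show is that a program optimizing only for "kill fleas" picks the furnace, and it avoids that answer only when someone explicitly codes the constraint a human would take for granted.

```python
# Hypothetical actions with made-up effectiveness numbers.
ACTIONS = {
    "flea shampoo": {"fleas_killed": 0.8, "cat_survives": True},
    "flea comb": {"fleas_killed": 0.5, "cat_survives": True},
    "furnace": {"fleas_killed": 1.0, "cat_survives": False},
}

def cat_survives(effects):
    """The 'obvious' constraint that a human infers without being told."""
    return effects["cat_survives"]

def best_action(actions, constraints=()):
    """Pick the action that kills the most fleas among those passing every constraint."""
    allowed = {
        name: effects
        for name, effects in actions.items()
        if all(check(effects) for check in constraints)
    }
    return max(allowed, key=lambda name: allowed[name]["fleas_killed"])

print(best_action(ACTIONS))                  # no constraint coded in
print(best_action(ACTIONS, [cat_survives]))  # constraint stated explicitly
```

Without the constraint the sketch answers "furnace"; with it, "flea shampoo". The program never works out the constraint on its own, which is the commenter's complaint.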
Don’t mistake beating a grand master at chess, or even winning on Jeopardy, for a sign that computers understand things. These are well-defined problems, and even so, with the latter, there were some very oddly incorrect choices and answers arrived at. Hunt through YouTube for post-win analysis of Watson v. the Jeopardy champs; they speak of how it came up with some very very wrong answers in one or two cases.
A computer would need a DNA biological parallel processor or a quantum parallel processor.
Equivalent to … what was it, 15 billion semiconducting parallel processing circuits in the palm of one hand, I think it was.
That would pretty much exceed one person’s brain, but perhaps not 100 genius level brains put together.
I work for a Scotsman and with a lot of Scottish expats. This video is as good as it gets describing my work environment, and the relative difficulty a newcomer has understanding the dialect and sense of humor of my buddies. funny as hell
https://www.youtube.com/watch?v=SGxKhUuZ0Rc
@IGotBupkis
Correct.
The human mind is clearly not a Turing machine (see Penrose, http://en.wikipedia.org/wiki/Shadows_of_the_Mind).
I (50 years in the computer business) strongly doubt that we will ever see a general speech recognition machine that approaches human capabilities.
My husband has a voice that simply does not carry. It’s as if he’s inside a thick glass globe – he can yell as loud as he’s able, and his voice will die out within maybe fifty feet. My voice carries basically as far as I need it to, within human limits. I’ve been a singer my whole life, which probably has something to do with it, but I know a lot of people who (a) agree that my husband’s voice has an unusually short range, and (b) can project as far as I can with no professional training, so I think it’s just something about the frequency or timbre of his voice. Maybe something like this?