Vocal Joystick for accessibility

University of Washington researchers are developing a new "Vocal Joystick" interface to make software more accessible for people who don't have use of their hands or arms. The software converts simple vowel sounds and other intonations into cursor movement. The louder the sound, the faster the cursor moves. Saying "K-Ch" represents a mouse click and release. Follow the link for a video demonstration. From the University of Washington Office of News:
"A lot of people ask: 'Why don't you just use speech recognition?'" (electrical engineering professor Jeffrey) Bilmes said. "It would be very slow to move a cursor using discrete commands like 'move right' or 'go faster.' The voice, however, is able to do continuous commands quickly and easily." Early tests suggest that an experienced user of Vocal Joystick would have as much control as someone using a handheld device...

"While people use their voices to communicate with just words and phrases," Bilmes said, "the human voice is an incredibly flexible instrument, and can do so much more."
Link
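
For the curious, here is a rough sketch of the idea described above: a vowel picks the direction, and loudness scales the speed. It is only an illustration, not the researchers' code. The vowel-to-direction table, the `MAX_SPEED` value, and the function names are all assumptions made up for this example; the article does not say which vowel maps to which direction.

```python
# Hypothetical vowel-to-direction table. The real system's mapping is not
# given in the article; these four entries are placeholders.
VOWEL_DIRECTIONS = {
    "a": (0, -1),  # up (screen y grows downward)
    "u": (0, 1),   # down
    "i": (-1, 0),  # left
    "e": (1, 0),   # right
}

MAX_SPEED = 400.0  # pixels per second at full loudness (assumed value)


def cursor_velocity(vowel, loudness):
    """Return (vx, vy) in pixels per second.

    `loudness` is expected in the range 0.0 to 1.0 and is clamped to it,
    so a louder sound gives a faster cursor, up to MAX_SPEED.
    """
    dx, dy = VOWEL_DIRECTIONS[vowel]
    level = min(max(loudness, 0.0), 1.0)
    return (dx * level * MAX_SPEED, dy * level * MAX_SPEED)


def step(position, vowel, loudness, dt):
    """Move the cursor for `dt` seconds and return the new (x, y)."""
    vx, vy = cursor_velocity(vowel, loudness)
    x, y = position
    return (x + vx * dt, y + vy * dt)
```

Holding a vowel keeps the cursor moving, and changing the volume changes the speed without issuing a new command, which is the contrast Bilmes draws with discrete spoken commands like "move right."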

Discussion

Take a look at this

This reminds me of the story about aiming a helicopter's guns by tracking the pilot's eye movements. A certain type of squint was the "fire" command. For some reason, nobody ever heard any more about it. Maybe some pilot got a twitch and wiped out a command post or some dumb thing like that.

I'm very suspicious of tricks like this. Even regular software doesn't work right yet.

Take a look at this

"...as much control as someone using a handheld device..."

Pwned at a LAN party by a kid with no arms, how embarrassing would that be?

Take a look at this

I can only hope for the sake of users' coworkers and caretakers that future products let users control their computers with a sing-song hum instead of the meditative focusing noises in the video. It sounds like the test user has his mouth open, which would lead to a dry mouth after less than an hour of casual web surfing.

Consider also the ability to tell what sort of task someone is up to by the complexity, tone, and change frequency of their pointing. A manager with disabled employees could close his eyes and pick out which are making diagrams, which are navigating through documents, which are doing spreadsheets, and which are slacking off to play Tower Defense with just a few seconds of careful listening.

Take a look at this

Straight out of Snow Crash :)

Take a look at this

Re closed-mouth control: it is using vowels to tell which direction. That does make mouth dryness an issue. Surf the Internet with a bottle of water? Tower Defense with a Camelbak? It's a whole new use for your favorite hydration beverage.
