Speech Dasher is a novel interface for the input of text using a combination of speech and navigation via a pointing device (such as a mouse). A speech recognizer provides the initial guess of the user's desired text while a navigation-based interface allows the user to confirm and correct the recognizer's output.
It is hoped that Speech Dasher will provide a text input interface that is both fast and accurate.
Entering text with Speech Dasher begins with the user speaking their desired sentence to a continuous speech recognition engine (currently Microsoft's Speech Recognizer v5.1 or Dragon NaturallySpeaking 7, 8, or 9). A word lattice is generated from the recognizer's results and then expanded to cover likely recognition errors. The expanded lattice is used to estimate the probability of each letter the user might enter next, based on both the recognition results and the text already entered. The interface also seamlessly integrates a fallback language model, allowing the user to efficiently enter words missed completely by the recognizer. These probability estimates are then used in the continuous navigation-based interface Dasher, which allows the user to confirm and correct their dictated sentence.
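To make the idea concrete, here is a minimal sketch (not the actual Speech Dasher implementation) of how per-letter probabilities might be estimated from a set of recognizer word hypotheses, interpolated with a fallback distribution standing in for the language model. The function name, the interpolation weight, and the flat hypothesis list are all illustrative assumptions; the real system works over an expanded word lattice rather than an n-best list.

```python
from collections import defaultdict

def next_letter_probs(hypotheses, prefix, fallback, lam=0.9):
    """Estimate P(next letter) from recognizer word hypotheses that are
    consistent with the letters confirmed so far, interpolated with a
    fallback distribution (a stand-in for the default language model).

    hypotheses: list of (word, probability) pairs from the recognizer.
    prefix:     letters of the current word confirmed so far.
    fallback:   dict mapping letter -> probability.
    lam:        interpolation weight on the recognizer-derived estimate.
    """
    counts = defaultdict(float)
    total = 0.0
    for word, p in hypotheses:
        # Only hypotheses that extend the confirmed prefix contribute.
        if word.startswith(prefix) and len(word) > len(prefix):
            counts[word[len(prefix)]] += p
            total += p
    probs = {}
    for ch in set(counts) | set(fallback):
        lattice_p = counts[ch] / total if total > 0 else 0.0
        probs[ch] = lam * lattice_p + (1 - lam) * fallback.get(ch, 0.0)
    return probs

# Hypothetical example: the recognizer heard "hello" or "help", and the
# user has confirmed "hel" so far.
probs = next_letter_probs(
    [("hello", 0.7), ("help", 0.3)],
    "hel",
    {"l": 1 / 3, "o": 1 / 3, "p": 1 / 3},
)
```

Because the recognizer mass dominates (here 90%), letters continuing a likely hypothesis ("l" from "hello") get most of the probability, while the fallback keeps every letter reachable so the user can still steer toward words the recognizer never proposed.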
An early research prototype is available for download below. So try it out and send me your feedback. I'm especially interested in how it could be improved for people with disabilities.
Currently I'm working on a new version that uses the recognition lattice obtained from the PocketSphinx recognizer. The first set of videos below shows the latest version in action.
Speech Dasher: Fast Writing using Speech and Gaze
CHI '10: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2010.
Efficient Computer Interfaces Using Continuous Gestures, Language Models, and Speech
M.Phil thesis, University of Cambridge, 2004.
YouTube video showing mouse and eye tracker control