On March 2, 1992, Casper was introduced by Dr. Kai-Fu Lee of Apple on Good Morning America as the first speaker-independent continuous speech recognizer that could interact with humans. Apple was too far ahead of its time; the technology was not quite there.
In 2004, Dr. Lee tried again at Microsoft, advocating “NUI” (pronounced /n uw iy/), or Natural User Interface, but the effort was shut down once more.
It wasn’t until 2010, after the iPhone took over the world, that a personal assistant like Siri finally saw the light of day. The advancement and ubiquity of mobile devices made NUI compelling, as typing on small devices is painful. The deep neural network (DNN) breakthrough in 2012 brought speech and image recognition accuracy into new territory and rallied both research and industry. Since then, artificial intelligence (AI) has finally come alive at scale. Established companies and startups alike are placing their bets, hoping to be at the forefront of this evolution. Mobvoi, an established startup, is one of the firms determined to play an important role in this historic evolution.
I will talk about what our company does and what our vision is, and present some of the technical work we have built along the way.
Mei-Yuh Hwang obtained her Ph.D. in computer science in 1993 from Carnegie Mellon University, where she learned the core of speech recognition from Dr. Kai-Fu Lee and Dr. Raj Reddy. After graduation, she worked at Microsoft Research and later on Microsoft speech products, Bing Translation, and Chinese Cortana, while also serving as a researcher at the University of Washington from 2004 to 2008. Her publications span speech recognition, machine translation, and language understanding, appearing in major conferences, IEEE journals, and U.S. patents. She joined Mobvoi in 2016 and leads its R&D division in Redmond, Washington.