Last week, Microsoft presented its new translation system called Monolingual TTS at its TechFest 2012.
Photo: a 3D talking head, a snapshot from Microsoft’s video
Monolingual TTS uses a digital recreation of the user’s voice and a realistic rendering of their face to translate speech into any of 26 languages. How does it work? As a user speaks English, the system translates the speech in real time into, for example, Mandarin. It can also listen to responses and translate them for the user. In addition, Microsoft has added a mixed-language response option to the system.
“In a foreign country, it would be convenient if a user of car-navigation system who is not fluent in that particular foreign language can hear instructions in mixed-codes, i.e., entities like street names synthesized in the local language and routing directions in the user’s native language,” says Microsoft on its Research web site.
Special algorithms that render speech sentences of different languages turn TTS into a multilingual system, making it possible to build mixed-code, bilingual TTS systems. As a result, the system can now synthesize any mixed-language pair drawn from the 26 languages.
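The mixed-code idea from the car-navigation example can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the `Segment` type, the `tag_mixed_code` function, the `{street}` placeholder syntax, and the language codes are all hypothetical names chosen for illustration. The sketch only shows how a routing instruction might be split into parts to be spoken in the user's native language and parts (such as street names) to be spoken in the local language, before any speech synthesis takes place.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    text: str  # text to be spoken
    lang: str  # language the segment should be synthesized in (hypothetical codes)


def tag_mixed_code(template: str, entities: dict[str, str],
                   native_lang: str, local_lang: str) -> list[Segment]:
    """Split a routing template into native- and local-language segments.

    Placeholders such as {street} are replaced by the entity text and tagged
    with the local language; everything else keeps the native language.
    """
    segments = []
    # re.split with a capturing group returns plain text at even indices
    # and placeholder names at odd indices.
    for i, part in enumerate(re.split(r"\{(\w+)\}", template)):
        if i % 2 == 1:
            segments.append(Segment(entities[part], local_lang))
        elif part:
            segments.append(Segment(part, native_lang))
    return segments


if __name__ == "__main__":
    for seg in tag_mixed_code("Turn left onto {street} in 200 meters.",
                              {"street": "南京东路"}, "en", "zh"):
        print(seg.lang, repr(seg.text))
```

A synthesizer built along these lines would then voice each segment in its tagged language, ideally with the same speaker's voice so that the switch between languages sounds natural.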
And that’s not all. Microsoft has also added a non-verbal component of communication to the system. TTS animates a 3D-scanned avatar of the user in time with the output speech. It syncs head, eye, and lip movements to the output sounds, so it looks as if the user is actually speaking the chosen language, even though he or she doesn’t know it. The video demonstrating how it works can be viewed here.
According to The Next Web, Microsoft’s Bing search engine is developing a new program for startups and businesses. The details have not been disclosed yet, but it is probably part of the company’s effort to promote Bing.