Tuesday, October 11, 2011

How speech recognition endangers languages

Let me start this article the way a lot of articles started this month: Apple has introduced the new iPhone 4S. The main features of this phone are a faster processor, a higher-resolution camera, and a speech-recognition assistant called Siri. Siri is available in English, German, and French and acts as an intelligent assistant for daily life. It can not only dictate text for SMS, but also answer questions like "What will the weather be like tomorrow?" or "How do I get from A to B?" For this, Siri needs a constant connection to the internet: the recognition happens on a server, and the information is collected from various databases, e.g. Wolfram Alpha. The service works best in English, because the databases do not always have information available for other languages, but they are to be filled soon. More languages will be added in the near future.

So far it sounds very promising. Instead of searching for information across various websites, I can simply ask a question and, with a bit of luck, get the correct answer, probably even a spoken one. So far so good, as long as I communicate with my phone in English. With German or French, life becomes a bit harder: not all information is available. And now the real question is: which languages will be supported in the future? Probably the top 10, maybe the top 25. But as we have just seen, having Siri recognize a language is not enough; the information-containing databases must also have that information available, which is certainly not a simple task. Now think a few years ahead: what will happen when the technology matures and devices come to market whose only input is their owner's voice? Will these devices support all languages, or will they concentrate on the couple of markets where the most money can be earned? Is it too much effort to teach a device a language like Estonian, with about 800,000 native speakers?

Let's take a look at Wikipedia. More than 100,000 articles are available in 39 languages (2.5 million articles in English, more than 1 million each in German and French); more than 10,000 articles are available in 64 languages; more than 1,000 articles in 109 languages; and more than 100 articles in 95 languages. That's impressive, but ethnologists estimate that about 6,500 languages are spoken today, and about two thirds of them are endangered.

Look at what happened with OCR software. Abbyy, a company with one of the best products on the market, supports 189 languages, 45 of them with dictionary support. I could not find statistics on how many languages have a written form, but it is quite clear that there are many more than 189. So documents written in those languages cannot be OCRed, which is at least annoying. The situation with Wikipedia is better than with OCR software, probably because it is quite simple to start writing articles in a new language: no technical knowledge is required (the operating system has to support the characters used by the language, but with Unicode support, which covers some 65,000 symbols, this is rarely an issue).
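To illustrate why plain text support is the easy part: thanks to Unicode, storing and encoding text in a smaller language works out of the box in any modern environment, with no language-specific engineering at all. A minimal sketch in Python, using Estonian as the example (the words here are just illustrative):

```python
# Unicode assigns every character a code point, and UTF-8 encodes
# any of them as bytes - no special support for Estonian is needed.
word = "jõgi"  # Estonian for "river"; contains õ (code point U+00F5)

encoded = word.encode("utf-8")

print(len(word))       # 4 characters
print(len(encoded))    # 5 bytes: õ needs two bytes in UTF-8
print(hex(ord("õ")))   # 0xf5, its Unicode code point
print(encoded.decode("utf-8") == word)  # round-trips losslessly
```

Recognizing the *speech* of that same language, by contrast, requires acoustic models, pronunciation dictionaries, and large training corpora, which is exactly the gap this article is about.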

Speech recognition requires much more linguistic and technical knowledge than OCR. Feeding databases with the information required for a meaningful and useful conversation with a machine is even harder. It is therefore quite certain that only a few languages will be supported with the full range of information. What does that mean for people with a different mother tongue? If they want to use the new devices, they will have to use a language other than their mother tongue. So, dear Estonian and dear Swiss German speakers, be prepared to learn English, German, or Russian if you want one of these shiny new gadgets.