Universal Text Input: Microsoft Translator Labs’ web-based input method editor

For large parts of the world, not only is English not the primary language but the localized text is not even composed of Latin/Roman characters. Besides the fact that most IBM PC-standard keyboards only have Latin characters, it’s physically impractical to design one for languages such as Chinese with thousands of unique characters.

That’s where Input Method Editors, or IMEs, come in. Now an experimental Microsoft Translator Labs project, “Universal Text Input”, is beginning to port an IME to the web, independent of the operating system. Currently only a small set of languages is available: Arabic, Chinese, English, French, Greek, Japanese and Russian.

It is worth noting that Microsoft is not the first to do this: Google already has a transliteration IME available and integrated into many of its services. Both systems use JavaScript to send the user’s characters to a JSON-powered web service that attempts to convert the Latin characters into the desired language’s script. Both also offer a bookmarklet to enable this functionality on any website.
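To make that round trip concrete, here is a minimal TypeScript sketch of the general pattern: typed Latin characters are serialized to JSON, a service returns ranked candidates, and the client picks from them. Everything in it is an assumption for illustration. The request and response shapes, the function names and the tiny lookup table are hypothetical, and the "service" is an in-memory stand-in, not Microsoft's or Google's actual API.

```typescript
// Hypothetical wire format: neither vendor's real JSON schema is known here.
interface TransliterationRequest {
  text: string;       // Latin characters typed so far
  targetLang: string; // language code, e.g. "ja"
}

interface TransliterationResponse {
  candidates: string[]; // ranked conversions, best first
}

// Serialize the user's keystrokes into the JSON body a client would send.
function buildRequestBody(text: string, targetLang: string): string {
  const request: TransliterationRequest = { text, targetLang };
  return JSON.stringify(request);
}

// Demo data only: a real service would use dictionaries and language models.
const DEMO_TABLE: Record<string, Record<string, string[]>> = {
  ja: { konnichiha: ["こんにちは", "今日は"] },
  ru: { privet: ["привет"] },
};

// Stand-in for the remote web service: JSON in, JSON out.
function fakeService(requestBody: string): string {
  const request = JSON.parse(requestBody) as TransliterationRequest;
  const candidates =
    DEMO_TABLE[request.targetLang]?.[request.text.toLowerCase()] ?? [];
  const response: TransliterationResponse = { candidates };
  return JSON.stringify(response);
}

// Parse the reply; if nothing matched, keep what the user typed.
function pickCandidates(responseBody: string, fallback: string): string[] {
  const response = JSON.parse(responseBody) as TransliterationResponse;
  return response.candidates.length > 0 ? response.candidates : [fallback];
}

// Full round trip: build the request, "send" it, and read the candidates.
function transliterate(text: string, targetLang: string): string[] {
  const reply = fakeService(buildRequestBody(text, targetLang));
  return pickCandidates(reply, text);
}
```

On a real page, `fakeService` would be replaced by an asynchronous network call, and the candidate list would be shown in a drop-down next to the text field rather than returned directly.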

However, one difference in Microsoft’s system is that it also offers as-you-type corrections for English. Unfortunately, the latency to the web service makes it quite difficult to use fluidly.

Although desktop-based IMEs are already more advanced than both of these solutions with prediction algorithms, I anticipate web-based IMEs will have the upper hand in the near future when it comes to crowd-sourcing telemetry and updating its algorithms to become smarter the more people use it.

8 insightful thoughts

  1. Neat, but it looks like they still have some work to do. こんにちは doesn’t even appear in the list when you type ‘konnichiha’, even though that’s the example they give for Japanese…

    > I anticipate web-based IMEs will have the upper hand in the near future when it comes to crowd-sourcing telemetry and updating its algorithms to become smarter the more people use it.

    Some desktop IMEs already send data back about mis-conversions if you give them permission.

  2. “web-based IMEs will have the upper hand in the near future when it comes to crowd-sourcing telemetry and updating its algorithms to become smarter the more people use it.”

    Couldn’t the same thing be done with the IMEs built into the OS or web browser?

    This sounds useful as a crutch for people browsing the web via OS/browsers/devices that lack good IMEs, but it still seems preferable to have a good native IME, and also to have text input behave consistently across the different web sites and applications you might use.

  3. New technology is making it very easy to learn new languages, and modern tools can help us overcome language barriers. Learning another language is fun, especially with modern tools like speechtrans.

    Thanks,
    Jack (www.speechtrans.com)

  4. Desktop-based IMEs are far more powerful because you can enter text absolutely anywhere. Hopefully, Windows 8 will include a transliteration IME for all possible scripts. Transliteration is really an innovative idea. Microsoft’s Indic IME is fantastic.
