Last week I stumbled upon this amazing service called Google Transliteration that can be accessed through a bookmarklet (jargon explained at the bottom). You can use it to type in one of the Indic languages in any text input box on the internet! (Whether the text really gets saved depends on the website.) Languages currently supported: Arabic, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Persian, Punjabi, Tamil, Telugu & Urdu.
Update (21-Feb-10):
After reading this post, one of my valued readers questioned the utility of this service! And this is what I wrote back:
A few years back, acquiring Indic fonts and learning to use an Indic keyboard layout was a challenge. Google eased that with a web service that takes away the reluctance to reply in local languages.
With such a service, an application developer need not provide transliteration as a feature (it's a feature in Gmail). Creating a database with Unicode-capable (double-byte) storage is enough to record input in any language.
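To make the storage point concrete, here is a minimal sketch in Python. It assumes SQLite (which stores TEXT as Unicode, UTF-8 by default); the `comments` table and `body` column are names I made up for illustration, and a real application would use whatever database it already has.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")

# Input in different scripts goes into the same TEXT column.
samples = ["नमस्ते", "வணக்கம்", "hello"]
conn.executemany("INSERT INTO comments (body) VALUES (?)", [(s,) for s in samples])
conn.commit()

# Read the rows back in insertion order and compare with what was typed.
stored = [row[0] for row in conn.execute("SELECT body FROM comments ORDER BY id")]
print("round trip intact:", stored == samples)
```

The application itself does nothing language-specific here: the text arrives already in the target script, and the database only has to keep it as Unicode.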
Also, transliteration can help people understand how words are pronounced when they are familiar with a different script. However, this may not work when the same word is spelled in multiple ways, e.g. Mohammed. [Read more]
With CJV languages, transliteration will often yield only an approximate result.
You can get bookmarklets for several Indic & other languages at the official Google project page. As instructed on the usage page, simply drag a bookmarklet onto the bookmarks toolbar in your browser. It should look like this!
Once you have the bookmarklet, you just need to click it once before using the service, after which a ‘Loading Google Transliteration’ message will appear.
Activation of the service is confirmed by the little (अ) icon in the text-box. Start typing the nearest English equivalent to get the best results.
Clicking the bookmarklet again will toggle the service into disabled mode, and the (अ) will disappear.
A further improvement would be to transliterate character-by-character instead of word-by-word. Although a costly affair, this would help provide the user a type-ahead list with options. For example, when typing in Marathi, every consonant has 2 velantis; the type-ahead list can show which English characters are needed to produce them: e for the first one (short velanti) and ee for the second (long velanti). Read more about the anatomy of Devanagari script here.
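The type-ahead idea can be sketched in a few lines of Python. This is only a toy under my own assumptions: the key scheme (e and ee) follows the example above and is not Google's actual mapping, and `VOWEL_SIGNS` and `suggest` are names invented for this sketch.

```python
# Toy key scheme following the example in this post; NOT Google's actual mapping.
VOWEL_SIGNS = {
    "e": "\u093f",   # short velanti (DEVANAGARI VOWEL SIGN I)
    "ee": "\u0940",  # long velanti (DEVANAGARI VOWEL SIGN II)
}


def suggest(consonant: str, typed: str) -> list[tuple[str, str]]:
    """Return (keys to type, resulting syllable) pairs that match the keys typed so far."""
    return [
        (keys, consonant + sign)  # Unicode stores the vowel sign after the consonant
        for keys, sign in sorted(VOWEL_SIGNS.items())
        if keys.startswith(typed)
    ]


# After typing the consonant "ka" and then "e", list what each option would produce.
for keys, syllable in suggest("\u0915", "e"):
    print(keys, "->", syllable)
```

Typing e after a consonant leaves both velantis on the list, while typing ee narrows it to the long one. A real service would need a lookup like this on every keystroke, which is where the cost comes from.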
Bookmarklets have not yet progressed to a level wherein users can provide inputs to them before running them. That would allow a single bookmarklet for all languages, with the language selected before use. This could be easily achieved by adding a Language drop-down to the yellow div that displays ‘Loading Google Transliteration’. But for now, you will need one bookmarklet per language.
Jargon:
Transliteration (t13n, similar to i18n & g11n) is not translation! Translation involves changing the language while keeping the meaning the same. Transliteration, on the other hand, only changes the script (writing system) of the input text; it does not bother to retain the meaning!
Bookmarks, similar to the ones used when reading, are pointers you keep to your favorite pages. Several browsers now let you keep bookmarks in the ‘bookmarks toolbar’ for easy access.
Bookmarklets are bookmarks that carry scripts, so they not only call a page but also perform some function and/or behaviour. For example, the Google Bookmarks (to store bookmarks online) bookmarklet not only opens the ‘Add new bookmark’ page, but does so in a pop-up which disappears on saving the bookmark.
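Under the hood, a bookmarklet is just a bookmark whose address starts with javascript: instead of http:. Here is a small Python sketch that builds one from a snippet of JavaScript. The snippet (`js_source`) and the helper `make_bookmarklet` are my own illustration and have nothing to do with the actual Google Transliteration script.

```python
from urllib.parse import quote, unquote

# Illustrative JavaScript only; this is not the Google Transliteration script.
js_source = "alert('Page title: ' + document.title);"


def make_bookmarklet(source: str) -> str:
    """Wrap JavaScript in an immediately-invoked function and encode it as a javascript: URL."""
    wrapped = "(function(){" + source + "})();"
    # Percent-encode spaces, quotes and other characters so the result is one valid URL.
    return "javascript:" + quote(wrapped, safe="")


bookmarklet = make_bookmarklet(js_source)
print(bookmarklet)

# Decoding the URL body gives back the wrapped script unchanged.
print(unquote(bookmarklet[len("javascript:"):]))
```

Saving the printed javascript: address as a bookmark should make the browser run the script on the current page when the bookmark is clicked. The function wrapper is a common habit that keeps the script's variables from leaking into the page.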