| To Standardize or Not?
The lack of standardization in software codes, fonts and keyboards is
one of the main reasons why Indian language software has not taken off in a big
way. When Indian language software development started in the early ‘80s and
gained momentum through the ‘90s, data re-usability and inter-operability were
not seen as important issues. Font-based solutions ran on top of existing
English-oriented applications and threw up problems of their own, which were
handled with work-arounds. However, these applications were not designed to
handle Indian-language text, so the solutions were severely limited in
processing the data, especially in tasks like searching and sorting.
None of the Indian languages has an internationally recognized ‘character
set’. Unlike English, with its universally accepted encoding, Indian language
software users cannot view documents created by other users unless both use the
same software package and fonts. As for Indian languages on the Internet, some
websites store information as bitmap images, thereby carrying absolutely no
linguistic information directly in electronic form. The majority of Indian
language websites, however, store the text in the form of font glyphs. (A glyph
is a graphic symbol that provides the appearance or form for a character. A
glyph can be a letter, a numeral or some other symbol that pictures an
encoded character.)
Content on these sites can only be viewed if the same fonts are installed on
the local machine. Using dynamic fonts could solve this problem to a certain
extent, but it adds to the cost of transmission. None of these problems
exists for English, since English sticks to the ASCII standard.
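The difference between storing encoded characters and storing font glyphs can be illustrated with a short Python sketch. It assumes the text is held as Unicode code points, which is not how the glyph-based sites described above store it, and the helper name `describe` is invented for this example. The Devanagari syllable "ki" is stored as the letter KA followed by the vowel sign I, although the vowel sign is drawn to the left of the letter when rendered.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Devanagari "ki": letter KA followed by vowel sign I (stored in logical order).
syllable = "\u0915\u093f"

for code_point, name in describe(syllable):
    print(code_point, name)

# Because the letter is stored as a character rather than as a font-specific
# glyph, a plain substring search finds it regardless of the installed fonts.
print("\u0915" in syllable)
```

Text stored this way carries linguistic information that searching and sorting can work on, whereas glyph codes only make sense together with the one font that defines them.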
In 1991, the Bureau of Indian Standards adopted the Indian Standard Code for
Information Interchange (ISCII), a standard developed in 1986-88 by the
Department of Electronics and a standardization committee comprising CDAC and a
few other vendors. ISCII uses 8-bit coding, which is not directly compatible
with the 16-bit coding of Unicode that is in use globally.
Unicode compliance is increasingly being positioned as the answer to the
standardization problem. However, with Unicode, there is an issue of
transmission efficiency. In the UTF-8 encoding, the transmission cost for Indian
languages is three times that of English. Each Indian language needs fewer than
128 character codes, so what could have been transmitted in one byte with an
8-bit code like ISCII is instead transmitted as three bytes per character in
UTF-8, or two in UTF-16.
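The byte counts behind this comparison can be checked with a minimal Python sketch. The sample words and the helper name `encoded_sizes` are chosen here for illustration only; the encodings are the standard UTF-8, UTF-16 and UTF-32 codecs that ship with Python.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes of `text` under three Unicode encodings."""
    encodings = ("utf-8", "utf-16-le", "utf-32-le")
    return {enc: len(text.encode(enc)) for enc in encodings}

english = "India"                   # 5 ASCII characters
hindi = "\u092d\u093e\u0930\u0924"  # "Bharat" in Devanagari, 4 code points

for label, word in (("English", english), ("Hindi", hindi)):
    sizes = encoded_sizes(word)
    per_char = sizes["utf-8"] / len(word)
    print(label, sizes, f"UTF-8 bytes per character: {per_char:.0f}")
```

For these samples, each English letter takes one byte in UTF-8 and each Devanagari code point takes three, which is where the factor of three comes from. In UTF-16 both scripts take two bytes per character, so the penalty depends on which Unicode encoding is used for transmission.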
And though there is a significant effort across companies to ensure Unicode
compliance, companies like Summit Infotech say that the real problem lies in the
catching up to be done by other players. "The Indian language Oracle 9i is
Unicode compliant, but the QuarkXPress legacy system it needs to integrate
with is not. That means we go a step backward and provide the client what he
needs," explains Summit Infotech MD Rakesh Kapoor. Even as software
companies and users continue to grapple with the problem of standardization,
there are stray attempts to move away from proprietary to open source platforms.
"Practically every government agency goes through a fresh evaluation of
the encoding standards and font designs introducing endless delays in the
process of adopting this technology. The industry on its part has contributed
to the problem by fueling this debate. Therefore, Linux as a low-cost platform
offers a big opportunity for growth of this price-sensitive market segment, and
Mithi is taking definite steps in this direction by adopting Linux and other
open source platforms," says Tarun Malaviya, CEO of Mithi Software
Technologies.
|