Localization of Language in Digital Text on the Internet
Introduction
We have been considering the changing spaces of text and text technologies, and the impact these changes have had on teaching, learning and communication. Bolter (2009) discusses “the breakout of the visual” (p. 47), and this paper was originally going to examine the possible gains in accessibility that a heavier reliance on visual elements over text might bring. The problem is that visual elements carry a culturally constrained language just as much as text does.
At a very basic level, some elementary features of web pages, such as navigation buttons, are based on Western symbols and assume left-to-right text, even though many scripts run right-to-left or top-to-bottom. For example, help buttons are often circles containing a question mark. If your script uses a different mark to indicate a question, you have to learn the convention that “?” indicates the help menu.
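The point about question marks and text direction can be made concrete with a minimal Python sketch using the standard `unicodedata` module. The selection of scripts and the helper name `base_direction` are illustrative assumptions, not part of the original argument, and the direction check is a simplification (first strongly directional character) rather than the full Unicode bidirectional algorithm. It also says nothing about top-to-bottom layout, which is handled by page layout rather than by character properties.

```python
import unicodedata

# Question marks from a few scripts (an illustrative selection, not exhaustive).
QUESTION_MARKS = {
    "Latin": "\u003F",     # ?
    "Arabic": "\u061F",    # mirrored form used in Arabic script
    "Greek": "\u037E",     # looks like a Latin semicolon
    "Armenian": "\u055E",  # written above the stressed vowel
}

def base_direction(text):
    """Guess reading direction from the first strongly directional character.

    Hypothetical helper: a simplification of how the Unicode bidirectional
    algorithm picks a paragraph direction, not a full implementation.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "neutral"

for script, mark in QUESTION_MARKS.items():
    print(script, hex(ord(mark)), unicodedata.name(mark))

print(base_direction("Help"))                          # English
print(base_direction("\u0645\u0631\u062D\u0628\u0627"))  # Arabic greeting
```

A help icon that shows only “?” therefore encodes one of several marks that Unicode distinguishes, and a layout that assumes left-to-right reading order fits only some of the scripts a page may need to display.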
So if graphics and visuals are not going to be the tools which localize content and access to the Internet, is there an equality of text on the WWW? Is this important? Is English really the dominant language of online text, and if it is, is this a major barrier to access for the large percentage of the population who are not literate in it? How is the English and Roman-alphabet history of computing languages influencing the form and function of the Internet today? Is it possible that without invasion, or even physical contact, English-language communities will come to re-colonize societies that are not literate in English by replacing local languages and scripts (Djite, 2008)? Can the legacy of the Internet’s English-dominated origins be overcome?
In the concept map that accompanies this paper, we examine two main points. First, that the development of text on the Internet, and of digital text in general, follows the same path as the development of movable type by Johannes Gutenberg. Both technologies began by utilising Western and European languages, but movable type is now able to represent almost any script used by humans. Is it now possible to represent non-English and non-Roman scripts in digital text? Second, if there is evidence that the dominant language of the Internet, originally English, is being challenged by other scripts and languages, and in particular by non-Roman scripts, then we may tentatively conclude that the language path of the Internet will follow the same route as movable type and become more accessible internationally.
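As a small illustration of the question about representing non-Roman scripts in digital text, the following Python sketch compares ASCII, the early English-oriented character encoding, with UTF-8, one encoding of Unicode. The sample words and the helper name `encodable` are assumptions chosen for this example.

```python
# Sample words in four scripts (chosen for illustration only).
SAMPLES = {
    "English": "search",
    "Arabic": "\u0639\u0631\u0628\u064A",
    "Chinese": "\u641C\u7D22",
    "Russian": "\u042F\u043D\u0434\u0435\u043A\u0441",
}

def encodable(text, encoding):
    """Return True if every character of text can be stored in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

for language, word in SAMPLES.items():
    utf8_bytes = len(word.encode("utf-8"))
    print(language, "| ASCII:", encodable(word, "ascii"),
          "| UTF-8:", encodable(word, "utf-8"),
          "| UTF-8 bytes:", utf8_bytes)
```

Only the English sample fits in ASCII, while all four can be stored as UTF-8, though the non-Roman samples need two or three bytes per character where the English one needs a single byte.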
Figure 1 Click for Cmap Tools Main Page
The main body of this paper has been represented in a concept map, using Cmap Tools software from the Institute for Human and Machine Cognition. Cmap Tools allows for the building of collaborative concept maps and a spatial representation of ideas and thoughts. Please look at the concept map for an elaboration of the main argument (a link to a web-based version of the map is below). When you first look at the map, it is best to follow the blue arrows, which indicate the most linear path through the information. As English is my first language, the map was created with a distinct left-to-right, top-to-bottom bias.
The main summaries of the argument are in purple nodes joined by purple arrows. The diagram below gives an overview of the main sections of the map; the web version will be full sized. Nodes which have an annotation or a web link have a small icon in the lower margin of the node (see diagram below). Clicking on the icon will bring up the title of the resource and clicking on the title will open the web page, image or file.
Figure 2 Node with web link
WWW vs. Gutenberg? Concept Map Overview
Link to the Cmap concept map:
http://cmapspublic.ihmc.us/rid=1HY8M1WPT-119PY9T-1MWN/WWW%20vs%20Gutenberg.cmap
References
The African Language Academy (ACALAN) (2010). African Languages And Cyberspace. Retrieved November 16, 2010, from ACALAN.org: http://www.acalan.org/eng/projets/cyberespace.php
African Network for Localization. (2010). About ANloc. Retrieved November 30, 2010, from africanlocalization.net: http://www.africanlocalisation.net/
Anderson, D. (2007, January 4). A Field Linguist’s Guide to Unicode. Retrieved November 15, 2010, from Script Encoding Initiative: http://webcache.googleusercontent.com/search?q=cache:AuiEj5LSZN4J:www.ailla.utexas.org/site/lsa_olac07/Anderson_Unicode.ppt+script+encoding+initiative&cd=7&hl=en&ct=clnk
Askew, M. K., & Wilk, R. R. (2002). The Anthropology of Media: A Reader. Retrieved December 1, 2010, from Google Books: http://books.google.com/books?hl=en&lr=&id=L2h-pKb2tVAC&oi=fnd&pg=PR8&dq=chinese+movable+type&ots=BELncPoVk5&sig=lsMyaRDHiOfz75c4kQCANLGYf5U#v=onepage&q=chinese%20movable%20type&f=false
Ayna Corporation. (2010). مرحبا, تسجيل دخول | إنشاء حساب. Retrieved November 3, 2010, from Ayna: http://www.ayna.com/
Baidu Inc. (2010). Baidu Search Engine. Retrieved November 17, 2010, from 搜索设置 | 登录: http://www.baidu.com/
Bharath, A., & Madhvanath, S. (2008). Online Handwriting Recognition for Indic Scripts. Retrieved November 30, 2010, from Hewlett Packard: http://www.hpl.hp.com/india/documents/papers/HPL-2008-45.pdf
Bolter, D. J. (2009). Writing Space: Computers, Hypertext and the Remediation of Print (2nd Ed). New York: Routledge.
Connor, A. (2001). The Digital Divide. Retrieved November 3, 2010, from SIL International: http://scripts.sil.org/digitaldivide
Crystal, D. (2009). English Professor says Arabic may overtake English in the future. Retrieved November 30, 2010, from YouTube.com: https://www.youtube.com/watch?v=_IJk5Tzh8jM&feature=related
Diamond, J. (2005). Writing. Retrieved November 17, 2010, from Guns, Germs and Steel: http://www.pbs.org/gunsgermssteel/index.html
Djite, P. (2008). From liturgy to technology: Modernizing the languages of Africa. Retrieved November 15, 2010, from ingentaconnect.com: http://www.ingentaconnect.com/content/jbp/lplp/2008/00000032/00000002/art00002
Encyclopedia Britannica Online. (2010a). ASCII. Retrieved November 30, 2010, from eb.com: http://www.britannica.com/EBchecked/topic/37933/ASCII
Encyclopedia Britannica Online. (2010b). Unicode. Retrieved November 3, 2010, from eb.com: http://www.britannica.com/EBchecked/topic/1387633/Unicode
Ferrick, The Red Hat (2010). The British Empire. Retrieved November 15, 2010, from Wikimedia Commons: http://upload.wikimedia.org/wikipedia/commons/2/26/The_British_Empire.png
Hammo, B. H. (2008). Towards enhancing retrieval effectiveness of search engines for diacritisized Arabic documents. Retrieved November 3, 2010, from Springerlink: http://www.springerlink.com/content/5p6227t3j6128344/fulltext.html
Hauben, M. (2006). History of ARPANET. Retrieved November 12, 2010, from Instituto Superior de Engenharia do Porto: http://www.dei.isep.ipp.pt/~acc/docs/arpa-Introduc.html
Howeidy, A. (2010). مصر. Retrieved November 20, 2010, from Al-Ahram Weekly Online: http://weekly.ahram.org.eg/2010/998/eg2.htm
Keene, D. (1999). World within walls: Japanese literature of the pre-modern era, 1600-1867. Retrieved November 30, 2010, from Google Books: http://books.google.ca/books?hl=en&lr=&id=gwQTF-9axqoC&oi=fnd&pg=PR11&dq=movable+type&ots=QiiVbx7xP-&sig=kFJClxzLDPEE0HQy3WRST0uDMt8#v=onepage&q=movable%20type&f=false
Krek, M. (1979). The Enigma of the First Arabic Book Printed from Movable Type. Retrieved November 3, 2010, from Journal of Near Eastern Studies, University of Chicago: http://www.ghazali.org/articles/jnes-38-3-mk.pdf
Language Observatory. (2006). The First Pan African Cultural Congress, Nov 13 – 15, Ethiopia. Retrieved November 15, 2010, from Language Observatory: http://gii2.nagaokaut.ac.jp/gii/blog/lopdiary.php?catid=154&blogid=8
Market Wire. (2009). Ayna.com, the Most Visited Arabic Search Engine, Selects Basis Technology for Multi-Language Text Analysis. Retrieved November 15, 2010, from Market Wire: http://www.marketwire.com/press-release/Aynacom-Most-Visited-Arabic-Search-Engine-Selects-Basis-Technology-Multi-Language-Text-1084831.htm
Moukdad, H., & Cui, H. (2005). How Do Search Engines Handle Chinese Queries? Retrieved November 15, 2010, from Webology: http://www.webology.ir/2005/v2n3/a17.html
Mozilla Corporation. (2010). Firefox Dictionaries and Language Packs. Retrieved November 15, 2010, from Firefox Add-Ins: https://addons.mozilla.org/en-US/firefox/language-tools/
Nakamura, L. (2002). Cybertypes: Race, ethnicity, and identity on the Internet. Retrieved November 15, 2010, from Google Books: http://books.google.com/books?hl=en&lr=&id=pw0PK97lbrkC&oi=fnd&pg=PR7&dq=recolonization+english+internet&ots=Gc-WvAD4eG&sig=mF2HlydPBiJHS5SmFBZjppS56Ro#v=onepage&q=english&f=false
OpenOffice.org. (2010). Native Language Confederation Projects of OpenOffice.org. Retrieved November 3, 2010, from OpenOffice.org: http://projects.openoffice.org/native-lang.html
PanAfrican Localisation Project. (2010). PanAfriL10n. Retrieved November 10, 2010, from African localisation wiki – wiki pour la localisation en Afrique: http://www.panafril10n.org/index.php/PanAfrLoc/HomePage
Pimienta, D., Prado, D., & Blanco, A. (UNESCO) (2009). Twelve years of measuring linguistic diversity in the Internet: Balance and perspectives. (C. a. Edited by the Information Society Division, Ed.) Retrieved November 3, 2010, from Unesdoc.unesco.org: http://unesdoc.unesco.org/images/0018/001870/187016e.pdf
Pym, A. (2008). 03. How did English become so important? Retrieved November 15, 2010, from YouTube.com: https://www.youtube.com/watch?v=15D2kQcmZ8I&feature=related
Richards, J. F. (1997). Early Modern India and World History. Retrieved November 23, 2010, from Journal of World History, University of Hawai’i Press: http://muse.jhu.edu/journals/journal_of_world_history/v008/8.2richards.html
Script Encoding Initiative. (2009). What is the Script Encoding Initiative? Retrieved November 3, 2010, from Department of Linguistics, University of California, Berkeley: http://linguistics.berkeley.edu/sei/
SIL International. (2006). Welcome to SIL’s Non-Roman Script Initiative. Retrieved November 7, 2010, from SIL International: http://scripts.sil.org/Welcome
Tang, F. Y., & Stribley, K. (2005). Project SILA: SIL.ORG Graphite and Mozilla Integration Project. Retrieved November 27, 2010, from mozdev.org: http://sila.mozdev.org/index.html
Texin, T. (2010). International Domain Name Examples. Retrieved November 15, 2010, from Internationalization (I18n), Localization (L10n), Standards, and Amusements: http://www.i18nguy.com/markup/idna-examples.html
Texin, T. (2005). Internationalizing Web Addresses. Retrieved November 15, 2010, from 27th Internationalization and Unicode Conference, Berlin: http://www.i18nguy.com/markup/Internationalizing%20Web%20Addresses-iuc27.pdf
Translate.org.za. (2007). Home. Retrieved November 15, 2010, from Translate.org.za: http://translate.org.za/
UNESCO. (2003). Recommendation Concerning the Promotion and Use of Multilingualism and Universal Access to Cyberspace. Retrieved November 3, 2010, from Legal Instruments: http://portal.unesco.org/en/ev.php-URL_ID=17717&URL_DO=DO_TOPIC&URL_SECTION=201.html
Unicode Consortium. (2010a). Unicode 6.0 Character Code Charts. Retrieved November 3, 2010, from The Unicode Consortium: http://www.unicode.org/charts/#scripts
Unicode Consortium. (2010b). Emoji and Dingbats. Retrieved November 3, 2010, from http://www.unicode.org/faq/emoji_dingbats.html
Universal Declaration of Linguistic Rights. (1996). Universal Declaration of Linguistic Rights. Retrieved November 13, 2010, from http://www.linguistic-declaration.org/index-gb.htm
Watson, K. (2007). Language, education and ethnicity: Whose rights will prevail in an age of globalisation? Retrieved November 3, 2010, from Science Direct, International Journal of Educational Development: http://www.sciencedirect.com/science?_ob=MImg&_imagekey=B6VD7-4MW95HF-3-1&_cdi=5975&_user=1022551&_pii=S0738059306001441&_origin=search&_zone=rslt_list_item&_coverDate=05%2F31%2F2007&_sk=999729996&wchp=dGLzVlz-zSkzk&md5=a7e2bda5f04e4fea29a61d8d58cf77ec&ie
Yahoo!Maktoob. (2010). عربي. Retrieved November 15, 2010, from Maktoob: http://en-maktoob.yahoo.com/?p=xa
Яндекс. (2010). Яндекс.Видео. Retrieved November 15, 2010, from Yandex Search Engine: http://www.yandex.ru/