
ABCs of transliterations

Original Message
Name: Phil Perry
Date: July 9, 2006 at 19:06:25 Pacific
Subject: ABCs of transliterations
OS: Ubuntu Linux
CPU/Ram: P4/256
Model/Manufacturer: Dell
Comment:
Is there any place on the Web that lists accepted, widely used, or even official transliteration systems for writing non-Latin-alphabet languages in ASCII? The transliteration should be lossless (someone can convert back to the original text unchanged). Multicharacter equivalents are OK (almost unavoidable), and the Latin-1 alphabet with standard (Western European) diacritics (accents) might be tolerable. I can convert into either precomposed Unicode (base letter + diacritic in a single code point) or a Unicode base letter followed by separate codes for the various accents, tonal marks, and whatnot.
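The two Unicode forms mentioned here (one code point for letter plus accent, versus a base letter followed by separate combining marks) correspond to the standard normalization forms NFC and NFD. A minimal sketch using Python's standard `unicodedata` module shows that text can be moved between the two; the sample character is only an illustration:

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points.
decomposed = "e\u0301"

# NFC composes base letter + combining mark into a single code point
# when Unicode defines a precomposed character for that pair.
composed = unicodedata.normalize("NFC", decomposed)

# NFD splits a precomposed character back into base letter + mark(s).
round_trip = unicodedata.normalize("NFD", composed)

print(len(decomposed), len(composed))   # number of code points in each form
print(composed == "\u00e9")             # U+00E9 is the precomposed e-acute
print(round_trip == decomposed)
```

Not every base-plus-mark combination has a precomposed form, so a decoder that emits base letter plus combining marks and then normalizes is probably the more general route.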

I'm curious how it's done for something like LaTeX with Babel, or OzTeX, or the like. Do users switch between Latin-1 for the commands and their native keyboard for the text itself? What do people do when all they have is a standard English keyboard and they want to write Greek, Russian and other Cyrillic languages, Hebrew, Arabic, various Indic languages, various Chinese languages, etc.? I'd prefer not to have to invent my own transliteration systems -- better to use those already accepted by lots of computer users.

I have some ideas for Web page software that I'm thinking of commercializing, and it would be nice for people to be able to create Web pages in non-Latin alphabets. I can use TeX-style accents (e.g., \' for an acute accent on the following letter) for Latin alphabet-based languages. The source for a page would need to be a mixture of ASCII commands and ASCII or Latin-1 transliterated text. How do various systems handle this? Needless to say, there will probably be multiple transliteration systems in use for any language! My software would have to decode the transliteration and output the Unicode symbols in the HTML (or even output binary 8- or 16-bit characters).
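As a rough illustration of the decoding step described above, here is a minimal sketch that turns a few TeX-style accent commands into Unicode and then into ASCII-only HTML. The table `TEX_ACCENTS` and the function names `decode_tex_accents` and `to_html_ascii` are hypothetical, and the table covers only a handful of accents; real TeX input has many more cases (dotless i, nested braces, and so on) that this does not handle.

```python
import re
import unicodedata

# Hypothetical, deliberately small table: TeX accent command -> combining mark.
TEX_ACCENTS = {
    "'": "\u0301",   # \'  acute
    "`": "\u0300",   # \`  grave
    "^": "\u0302",   # \^  circumflex
    '"': "\u0308",   # \"  diaeresis / umlaut
    "~": "\u0303",   # \~  tilde
}

# Matches \'e as well as the braced form \'{e}.
_ACCENT_RE = re.compile(r"""\\(['`^"~])(?:\{([A-Za-z])\}|([A-Za-z]))""")

def decode_tex_accents(text: str) -> str:
    """Replace TeX-style accent commands with the accented Unicode letter."""
    def repl(m):
        letter = m.group(2) or m.group(3)
        # Base letter + combining mark, composed where a single code point exists.
        return unicodedata.normalize("NFC", letter + TEX_ACCENTS[m.group(1)])
    return _ACCENT_RE.sub(repl, text)

def to_html_ascii(text: str) -> str:
    """Emit non-ASCII characters as numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

decoded = decode_tex_accents(r"caf\'e")
print(to_html_ascii(decoded))
```

Emitting numeric character references keeps the generated HTML pure ASCII; writing the page as UTF-8 and declaring that encoding would be the other obvious choice.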

Thanks, Phil




Response Number 1
Name: anonproxy
Date: July 12, 2006 at 00:11:55 Pacific
Subject: ABCs of transliterations
Reply:
"Is there any place on the Web listing accepted, widely-used, and even official transliteration systems to write non-Latin alphabet languages using ASCII?"

The two places I would look are with Apple and Microsoft. They will be on top of any working standards.

"Do users switch between Latin-1 for the commands and their native keyboard for the text itself?"

Depends on the language. On the Internet, it is conventional to use ASCII programming languages, APIs, etc. The character data is treated like any byte stream in most cases.

"What do people do when all they have is a standard English keyboard and they want to write Greek, Russian and other Cyrillic languages, Hebrew, Arabic, various Indic languages, various Chinese languages, etc.?"

In all the cases mentioned, they have a modified keyboard layout. In a pinch, many are trained to use a Latin QWERTY keyboard with software mappings. Some mappings are more settled than others; there are many Chinese variants, for example.

"I'd prefer not to have to invent my own transliteration systems -- better to use those already accepted by lots of computer users."

Honestly, I never thought about it. I'd prefer in almost every case to use the language directly. And decoding transliteration sounds like a nightmare.




Response Number 2
Name: Phil Perry
Date: July 17, 2006 at 09:22:12 Pacific
Subject: ABCs of transliterations
Reply:
"anonproxy", thanks for the information. I know there are some transliteration schemes around (e.g., multiletter for Cyrillic). So, do all authors have ASCII available to enter commands, and some way to directly enter their own alphabet? That sounds like your preferred method. It would be easiest for me, too, so long as there is no overlap between byte(s) used for non-ASCII text and bytes used for ASCII command sequences. My ASCII commands will be buried within their non-ASCII language text. Without knowing much about their alphabet, I need to be able to pick out where ASCII commands start. This isn't quite like a regular programming language, where non-ASCII text can be set off cleanly in strings. I suppose I'll need to be told how many bytes per character, any escape codes, and byte ranges used, so I can figure out where their character ends and an ASCII command might begin? HTML, (La)TeX, and others must have solved this kind of problem already!
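One encoding that has the no-overlap property asked about here is UTF-8: every byte of a multibyte character is in the range 0x80-0xFF, so any byte below 0x80 is always a real ASCII character. The same is not true of UTF-16 or of some legacy multibyte encodings such as Shift-JIS, where an ASCII-range byte can be part of a larger character. A minimal sketch, assuming the source is UTF-8 and assuming a hypothetical command syntax of a backslash followed by ASCII letters (`COMMAND_RE` and `find_commands` are illustrative names):

```python
import re

# Hypothetical command syntax: a backslash followed by ASCII letters.
COMMAND_RE = re.compile(rb"\\[A-Za-z]+")

def find_commands(raw: bytes):
    """Return (byte_offset, command) pairs found in UTF-8 encoded bytes."""
    return [(m.start(), m.group().decode("ascii"))
            for m in COMMAND_RE.finditer(raw)]

def multibyte_bytes_are_non_ascii(text: str) -> bool:
    """Check that every byte of every non-ASCII character is >= 0x80."""
    return all(b >= 0x80
               for ch in text if ord(ch) > 0x7F
               for b in ch.encode("utf-8"))

# Mixed Cyrillic, Hebrew and Chinese text with ASCII commands buried inside.
source = "Привет \\bold мир, שלום \\italic 你好"
raw = source.encode("utf-8")

commands = find_commands(raw)
print(commands)
print(multibyte_bytes_are_non_ascii(source))
```

Because the scan runs on raw bytes, it needs no knowledge of the alphabets involved: the command bytes cannot be confused with any part of a non-ASCII character.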

Phil



All content ©1996-2007 Computing.Net, LLC