[pmwiki-users] Translation [was pmwiki-users] i18n and iso-8859-13

Algis Kabaila akabaila at pcug.org.au
Thu Apr 7 03:53:12 CDT 2005


On Sunday 03 April 2005 02:01, Patrick R. Michaud, in answer to Old Al (who was 
"up the creek without a paddle"), wrote:

> However, I strongly recommend going with utf-8 if at all possible,
> and would prefer the PmWikiLt.* pages on pmwiki.org be done in 
> utf-8 instead of iso-8859-13.  
>

That one email from Pm had a wealth of information and I should save it for 
posterity - and I remain deeply in debt to Patrick.  Everything that Pm said 
was spot on!

The PmWikiLt pages are in utf-8 and have never been in anything other than 
utf-8.  I have reinstalled my home wiki and specified utf-8 for all languages.  
That was done with the help of J. Durchholz's suggestion and the 
PmWiki/Internationalizations pages.  If there are no objections, I would like 
to add simple instructions (in Lithuanian) for making a site accept the 
Lithuanian characters and operate with utf-8 encoding.

I was under the misconception that the keyboard (Linux, kde 3.3) issued 
one-byte codes for the "Lithuanian" ("high ascii") characters.  In kde I 
selected a keyboard layout that has the diacriticals on the top numeric row, 
so when I press 1 the displayed character is ą if the keyboard flag is set to 
lt.  Interestingly, if I write into a simple editor such as vi, each 
character is actually encoded in utf-8 (I wrote a little Python script to 
see the "ords" of all characters).  So it seems to me that the OS maps all 
keypresses to utf-8 first, and that writing in iso-8859-13 requires a second 
mapping.
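
A check of this kind can be sketched in a few lines of Python.  This is not 
the original script, only a minimal illustration written for Python 3; the 
helper name byte_ords and the sample letters are assumptions chosen for the 
example.

```python
# Minimal sketch: list the byte values ("ords") that a character occupies
# in a given encoding.  byte_ords is a hypothetical helper name.

def byte_ords(text, encoding):
    """Return the byte values of text when encoded with the given codec."""
    return list(text.encode(encoding))


if __name__ == "__main__":
    # A few Lithuanian letters; each should take two bytes in utf-8
    # and a single byte in iso-8859-13.
    for ch in "ąčę":
        print(ch,
              "utf-8:", byte_ords(ch, "utf-8"),
              "iso-8859-13:", byte_ords(ch, "iso-8859-13"))
```

If a letter shows two byte values under utf-8 and one under iso-8859-13, the 
text in the file was stored as utf-8 whenever the two-byte form is what the 
editor actually wrote.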

Curiously, very few "English" sites use utf-8 encoding, even though low ascii 
is the easiest of all to map to utf-8 -- no mapping is required!  In 
languages that use high-ascii characters, at least some characters need to 
be translated, but American English, English English and even down-under 
English need no mapping at all to be in utf-8.
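
The "no mapping required" point can be illustrated with a short Python 3 
sketch.  The function name utf8_matches_ascii and the sample strings are 
assumptions made for the example only.

```python
# Minimal sketch: plain (low) ascii text has exactly the same bytes in
# utf-8, while text with Lithuanian letters does not fit in ascii at all.
# utf8_matches_ascii is a hypothetical helper name.

def utf8_matches_ascii(text):
    """Return True if text is pure ascii, so its utf-8 bytes are identical."""
    try:
        ascii_bytes = text.encode("ascii")
    except UnicodeEncodeError:
        # At least one character lies outside the 7-bit ascii range.
        return False
    return ascii_bytes == text.encode("utf-8")


if __name__ == "__main__":
    for sample in ("hydraulic hammer", "ačiū"):
        print(repr(sample), utf8_matches_ascii(sample))
```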

Translation of some words, e.g. "by", is difficult because the meaning 
depends on context.  I thought it best not to attempt a translation of "by" 
at all, in order to avoid the "hydraulic hammer" becoming a "water sheep".  
I see that a note on the XLPage says to delete items that are not 
translated.  Can we just comment them out with # instead, or will that not 
work?

Also, would you mind if I ask a 'non-wiki' question: how does one correctly 
specify utf-8 encoding in a web page?  I have used meta tags under the 
apparently wrong impression that this was a standard way of doing it.  
Currently my **wrong** header looks like this:
 <meta content="text/html; charset=utf-8" http-equiv="content-type">

Kind regards,
Al.
-- 
Algis Kabaila
http://www.pcug.org.au/~akabaila



