[Pmwiki-users] Re: cyrillization

Konstantin Zadorozhny kzadorozhny
Sat May 22 00:26:08 CDT 2004


Hello, Anton!
You wrote  on Mon, 17 May 2004 23:25:40 -0500:

 AB> I'm new to this list, and i have a question that maybe was already
 AB> answered somewhere else, but at least i couldn't find anything on
 AB> Patrick's site. Is it possible to have the English version of pmwiki
 AB> display correctly the pages that have some text in cyrillic (for
 AB> example Russian) and some in latin script?

    You could take a look here:
http://www.pmwiki.org/wiki/PmWiki/Internationalizations. There is a file for
the iso-8859-5 encoding. Personally, I am trying to get UTF-8 working right now.
Generally everything is fine except for some problems with output.



PmWiki fails to match the FmtWikiLink pattern correctly if there are certain
characters inside a word.

For example, take the word “Русский” (binary ‘d0a0d183d181d181d0bad0b8d0b9’).

In index.php this is matched against the pattern
"/(\\b[[:upper:]][[:alnum:]]*(?:[[:upper:]][[:lower:]0-9]|[[:lower:]0-9][[:upper:]])[[:alnum:]]*(#([A-Za-z][-.:\\w]*))?)/"
here:



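    foreach($linkpats as $pat=>$rep) {
      $re = "/($pat)/";
      // Handle one match at a time until the pattern no longer matches.
      while(preg_match($re,$x,$match)) {
        // Swap the first match for a numbered placeholder.
        $x=preg_replace($re,"$lp$lpcount$lp",$x,1);
        // $rep is either a function name or a replacement string.
        if (function_exists($rep))
          $txt = $rep($pat,$match[1],NULL);
        else $txt = preg_replace("/^$pat\$/",$rep,$match[1]);
        $lpv[$lpcount++] = $txt;
      }
    }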
The problem is that the output is “Ðó?/span>??ñêì/span>??é”. Only three
characters of the word (the 2nd through 4th) get matched.
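
To illustrate the byte-versus-character distinction that may be involved
here, below is a minimal standalone sketch. It is not PmWiki code and not a
tested fix. It assumes that, without the /u modifier, PCRE looks at the
subject one byte at a time, so a class such as [[:upper:]] can match a single
byte in the middle of a two-byte UTF-8 character, depending on the locale. The
names $wikiWord, $twoWords and the other variables are hypothetical, and the
pattern is only a rough Unicode-aware analogue of the one quoted above
(without the #anchor part).

```php
<?php
// "Русский" as UTF-8, built from the hex dump quoted in this message.
$word = hex2bin('d0a0d183d181d181d0bad0b8d0b9');

// Hypothetical Unicode-aware WikiWord pattern (an assumption, not PmWiki's own).
// \p{Lu}, \p{Ll}, \p{L}, \p{N} are Unicode property classes; the /u modifier
// makes PCRE read the subject as UTF-8 characters instead of single bytes.
$wikiWord = '/(?<![\p{L}\p{N}])\p{Lu}[\p{L}\p{N}]*'
          . '(?:\p{Lu}[\p{Ll}\p{N}]|[\p{Ll}\p{N}]\p{Lu})[\p{L}\p{N}]*/u';

// The word is 14 bytes long but only 7 characters long.
$byteLength = strlen($word);
$charCount  = preg_match_all('/./su', $word);

// A single capitalized word has no inner capital, so it is not a WikiWord.
$plainWordIsLink = preg_match($wikiWord, $word);

// "РусскийЯзык" has an inner capital, so the whole word is matched.
$twoWords = $word . hex2bin('d0afd0b7d18bd0ba');
$wikiWordIsLink = preg_match($wikiWord, $twoWords, $m);
```

With character-level matching, a match can never begin or end between the two
bytes of a Cyrillic letter, so markup could not be inserted there. Whether the
/u modifier and \p{...} classes are available depends on how PCRE was built
for the PHP installation in use.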



I don’t know how to fix that. Any suggestions from developers are welcome.
With best regards, Konstantin Zadorozhny.





