[pmwiki-users] Re: Make xlpage-utf-8.php free from mbstring module?

Elias Soong songjl at 163.net
Fri Sep 16 03:02:34 CDT 2005


Sorry, maybe I didn't understand your meaning.. Below is what I have test:

* For old xlpage-utf-8.php with empty $MakePageNameFunction, if I input 
[[¨¦dition]], the rendered page give out a link to Group.Dition.

* When use your new xlpage-utf-8.php, it still the case, give the link 
to Group.Dition.

Is this correct?

Patrick R. Michaud wrote:
> On Fri, Sep 16, 2005 at 12:32:08AM -0500, Patrick R. Michaud wrote:
> 
>>I'm working on an updated xlpage-utf-8.php module now -- I'll send
>>it along shortly.
> 
> 
> Attached is a new version of xlpage-utf-8.php that doesn't require
> the mbstring module.  Try it out and let me know if there are any
> errors.
> 
> One nice thing about this version of xlpage-utf-8 is that it can 
> perform upper/lowercase conversions on utf-8 page names (although 
> it still cannot detect utf-8 wikiwords).
> 
> Again, let me know if it works.
> 
> Pm
> 
> 
> ------------------------------------------------------------------------
> 
> <?php if (!defined('PmWiki')) exit();
> /*  Copyright 2004 Patrick R. Michaud (pmichaud at pobox.com)
>     This file is part of PmWiki; you can redistribute it and/or modify
>     it under the terms of the GNU General Public License as published
>     by the Free Software Foundation; either version 2 of the License, or
>     (at your option) any later version.  See pmwiki.php for full details.
> 
>     This script configures PmWiki to use utf-8 in page content and
>     pagenames.  There are some unfortunate side effects about PHP's
>     utf-8 implementation, however.  First, since PHP doesn't have a
>     way to do pattern matching on upper/lowercase UTF-8 characters,
>     WikiWords are limited to the ASCII-7 set, and all links to page
>     names with UTF-8 characters have to be in double brackets.
>     Second, we have to assume that all non-ASCII characters are valid
>     in pagenames, since there's no way to determine which UTF-8
>     characters are "letters" and which are punctuation.
> */
> 
> global $HTTPHeaders, $KeepToken, $pagename,
>   $GroupPattern, $NamePattern, $WikiWordPattern, $Author,
>   $CaseConversions;
> 
> $HTTPHeaders[] = 'Content-type: text/html; charset=utf-8';
> 
> $KeepToken = "\263\263\263";
> $pagename = $_REQUEST['n'];
> if (!$pagename) $pagename = $_REQUEST['pagename'];
> if (!$pagename &&
>       preg_match('!^'.preg_quote($_SERVER['SCRIPT_NAME'],'!').'/?([^?]*)!',
>           $_SERVER['REQUEST_URI'],$match))
>     $pagename = urldecode($match[1]);
> $pagename = preg_replace('!/+$!','',$pagename);
> 
> $GroupPattern = '[\\w\\x80-\\xfe]+(?:-[[\\w\\x80-\\xfe]+)*';
> $NamePattern = '[\\w\\x80-\\xfe]+(?:-[[\\w\\x80-\\xfe]+)*';
> $WikiWordPattern = 
>   '[A-Z][A-Za-z0-9]*(?:[A-Z][a-z0-9]|[a-z0-9][A-Z])[A-Za-z0-9]*';
> 
> if (!isset($Author)) {
>   if (isset($_POST['author'])) {
>     $Author = htmlspecialchars(stripmagic($_POST['author']),ENT_QUOTES);
>     setcookie('author',$Author,$AuthorCookieExpires,$AuthorCookieDir);
>   } else {
>     $Author = htmlspecialchars(stripmagic(@$_COOKIE['author']),ENT_QUOTES);
>   }
>   $Author = preg_replace('/(^[^[:alpha:]\\x80-\\xfe]+)|[^-\\w\\x80-\\xfe ]/',
>     '', $Author);
> }
> 
> SDV($PageNameChars, '-[:alnum:]\\x80-\\xfe');
> SDV($MakePageNamePatterns, array(
>     "/'/" => '',                           # strip single-quotes
>     "/[^$PageNameChars]+/" => ' ',         # convert everything else to space
>     "/(?<=^| )(.)/eu" => "utf8toupper('$1')", 
>     "/ /" => ''));
> 
> function utf8toupper($x) {
>   global $CaseConversions;
>   static $lower, $upper;
>   if (function_exists('mb_strtoupper')) return mb_strtoupper($x, 'UTF-8');
>   if (!$lower) {
>     foreach($CaseConversions as $k => $v) {
>       if (!preg_match('/^([0-9a-f]+)(-([0-9a-f]+))?(\\/(\\d+))?$/', $k, $m))
>         continue;
>       $cp0 = hexdec($m[1]); $cp1 = hexdec(@$m[3]); $step = @$m[5];
>       if ($cp1 < $cp0) $cp1 = $cp0;
>       if ($step < 1) $step = 1;
>       $s = $cp0; $t = hexdec($v);
>       while ($s <= $cp1) {
>         if ($s < 128) $lower[] = chr($s);
>         else $lower[] = chr(0xc0+(($s >>6 ) & 0x1f)) . chr(0x80+($s & 0x3f));
>         if ($t < 128) $upper[] = chr($t);
>         else $upper[] = chr(192+(($t>>6) & 0x1f)) . chr(128+($t & 0x3f));
>         $s+=$step; $t+=$step;
>       }
>     }
>     print "<pre>";
>     print_r($lower);
>     print_r($upper);
>     print "</pre>\n";
>   }
>   return str_replace($lower, $upper, $x);
> }
> 
> SDV($CaseConversions, array(
>   # ASCII
>   '61-7a'   => '41',
>   # Latin-1
>   'e0-f6'   => 'c0',                        
>   'f8-fe'   => 'd8',
>   # Cyrillic
>   '450-45f' => '400',
>   '430-44f' => '410',
>   '48b-4bf/2' => '48a',
>   '4c2-4ce/2' => '4c1',
>   '4d1-4ff/2' => '4d0',
> ));
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> pmwiki-users mailing list
> pmwiki-users at pmichaud.com
> http://host.pmichaud.com/mailman/listinfo/pmwiki-users





More information about the pmwiki-users mailing list