[pmwiki-devel] Possible bug in xlpage-utf-8.php and RFC:AsSpacedUTF

Petko Yotov 5ko at free.fr
Sat Dec 9 15:38:28 CST 2006


Hello!

I am using the xlpage-utf-8.php script to run a wiki in UTF-8, and the 
AsSpaced function doesn't work as expected, either for extended Latin 
(French) letters or for Cyrillic. So I wrote an AsSpaced replacement, and 
while testing it I believe I found a bug.

The $CaseConversions array in xlpage-utf-8.php contains two values, defined 
as "\x49\x00" and "\x53\x00" (lines 92 and 111), and I believe the trailing 
"\x00" character shouldn't be there. When I convert the corresponding 
lowercase characters in my text editor (Kate), the uppercase characters are 
just \x49 and \x53 respectively (I and S). Once I removed the "\x00", my 
function worked fine.
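
To make the problem concrete, here is a minimal sketch that only inspects the 
strings and does not touch PmWiki. The two keys (dotless i and long s) are my 
assumption about which entries sit on those lines, and find_nul_values() is a 
hypothetical helper name; only the two values are taken from the file.

```php
<?php
// Hypothetical two-entry excerpt of $CaseConversions. The keys are assumed
// (dotless i, long s); only the "\x49\x00" and "\x53\x00" values are real.
$SuspectEntries = array(
    "\xc4\xb1" => "\x49\x00",
    "\xc5\xbf" => "\x53\x00",
);

// Return the entries whose uppercase value contains a NUL byte,
// with key and value shown as hex so the stray byte is visible.
function find_nul_values($map) {
    $found = array();
    foreach ($map as $lower => $upper) {
        if (strpos($upper, "\x00") !== false) {
            $found[bin2hex($lower)] = bin2hex($upper);
        }
    }
    return $found;
}

var_dump(strlen("\x49\x00"));   // int(2): two bytes, not the single byte "I"
var_dump("\x49\x00" === "I");   // bool(false)
print_r(find_nul_values($SuspectEntries));
```

Since AsSpacedUTF joins these values into a regular expression, a stray NUL 
byte ends up inside the pattern itself, which would explain why removing it 
made the difference for me.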

Here is the new function AsSpacedUTF -- please take a look and tell me if I am 
doing something wrong. I'll then post it in the Documentation.


function AsSpacedUTF($text)
{
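	global $CaseConversions;
	// Built once and reused on later calls; without "static" the
	// alternations would be rebuilt on every call.
	static $lower, $upper;
	// No conversion table loaded: fall back to the stock function.
	if(!@$CaseConversions) return AsSpaced($text);
	if (!@$lower) {
		$lower = implode('|', array_keys($CaseConversions));
		$upper = implode('|', array_values($CaseConversions));
	}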
	
	$text = preg_replace("/($lower|\\d)($upper)/", '$1 $2', $text);
	$text = preg_replace('/(?<![-\\d])(\\d+( |$))/',' $1',$text);
	return preg_replace("/($upper)(($upper)($lower|\\d))/",
		'$1 $2', $text);
}
$AsSpacedFunction = 'AsSpacedUTF';
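
As a quick check outside PmWiki, here is a minimal sketch that applies the 
same three substitutions to a tiny table. $SampleConversions and 
as_spaced_sample() are hypothetical stand-ins for illustration; the real 
$CaseConversions table is much larger, and the file has to be saved as UTF-8.

```php
<?php
// Hypothetical six-entry stand-in for $CaseConversions (lowercase => uppercase).
$SampleConversions = array(
    'a' => 'A', 'b' => 'B', 'é' => 'É',
    'ж' => 'Ж', 'а' => 'А', 'б' => 'Б',   // the last two are Cyrillic а and б
);

// Same three substitutions as AsSpacedUTF, with the table passed as an
// argument so that the sketch runs without PmWiki.
function as_spaced_sample($text, $map) {
    $lower = implode('|', array_keys($map));
    $upper = implode('|', array_values($map));
    // lowercase letter or digit followed by an uppercase letter
    $text = preg_replace("/($lower|\\d)($upper)/", '$1 $2', $text);
    // a run of digits that ends a word
    $text = preg_replace('/(?<![-\\d])(\\d+( |$))/', ' $1', $text);
    // two uppercase letters followed by a lowercase letter or digit
    return preg_replace("/($upper)(($upper)($lower|\\d))/", '$1 $2', $text);
}

echo as_spaced_sample('bébéAbbé', $SampleConversions), "\n";   // bébé Abbé
echo as_spaced_sample('жабаЖаба', $SampleConversions), "\n";   // жаба Жаба
```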

Thanks,
Petko


