[pmwiki-users] Defaulting PmWiki to utf8

ml ml at simple-groupware.de
Fri Nov 16 16:59:05 CST 2007


I would add a BOM (byte order mark) to the page files. That way a file's
encoding can be detected when it is read, and pages can be converted on
the fly (see http://en.wikipedia.org/wiki/Byte_Order_Mark).
For filenames, urlencode() / urldecode() may be used. urldecode() should
work for both urlencoded and unencoded filenames.
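
A minimal sketch of both ideas, written in Python rather than PmWiki's PHP
and not taken from PmWiki itself. The function names (read_page, write_page,
decode_filename) and the iso-8859-1 fallback are assumptions made for
illustration. A page that starts with the UTF-8 BOM is read as UTF-8, anything
else is treated as the legacy charset, and every page is written back with a
BOM so it is converted the first time it is saved.

```python
import codecs
from urllib.parse import quote, unquote

UTF8_BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def read_page(raw: bytes, legacy_encoding: str = "iso-8859-1") -> str:
    """Decode page bytes: UTF-8 if BOM-marked, otherwise the legacy charset.

    legacy_encoding is an assumed default, not a PmWiki setting.
    """
    if raw.startswith(UTF8_BOM):
        return raw[len(UTF8_BOM):].decode("utf-8")
    return raw.decode(legacy_encoding)


def write_page(text: str) -> bytes:
    """Always write UTF-8 with a BOM, so old pages convert on save."""
    return UTF8_BOM + text.encode("utf-8")


def decode_filename(name: str) -> str:
    """Percent-decode a filename; names without escapes pass through."""
    return unquote(name)


if __name__ == "__main__":
    old = "Fran\u00e7ais".encode("iso-8859-1")  # legacy page, no BOM
    print(read_page(old))                       # decoded via the fallback
    print(read_page(write_page("Fran\u00e7ais")))  # decoded via the BOM
    print(quote("\u00e7"), decode_filename("Main.HomePage"))
```

One caveat: decoding an unencoded name is only safe when the name contains
no literal "%" followed by two hex digits (a file really named "100%41"
would come back as "100A"). PHP's urldecode() additionally turns "+" into a
space, which Python's unquote() does not.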


sti at pooq.com wrote:
> Patrick R. Michaud wrote:
>> On Thu, Nov 15, 2007 at 12:35:27AM -0500, sti at pooq.com wrote:
>> So, perhaps the correct baby step is to switch PmWiki to using utf8
>> by default via its present mechanisms (i.e., without name mappings),
>> and then add name mapping features as a post-2.2.0 improvement.
>> Folks who prefer the somewhat nicer encodings for pagenames (i.e.,
>> %e7 instead of %c3%a7) will still have the option of selecting
>> iso-8859-1 for their systems.
> That is certainly doable on its own.
>>> Of course, in cases like Chinese where EVERY name is mangled,
>>> that file may grow very big, very fast.
>> Yes, but for the moment I'm principally concerned only with mapping
>> of iso-8859-1 names.  People who are using PmWiki in Chinese are
>> already using utf-8 and I don't feel as pressing a need to solve
>> url mapping issues there yet.  Nor am I familiar enough with Chinese to
>> know the character mappings... but if someone can provide it we can
>> give it a try.
> I'm not familiar with Chinese, but a couple of programmer friends of mine are,
> and I think I can get one of them to help with the character mapping. One of
> them mentioned the 25,000-line pinyin table I talked about earlier.
> I have no idea how one would tell PHP to load that into memory once, so that
> it doesn't have to be reread on each page load, though.
