[pmwiki-users] More template queries and UTF-8 display

Patrick R. Michaud pmichaud at pobox.com
Thu Jun 16 11:36:22 CDT 2005


On Thu, Jun 16, 2005 at 11:20:28PM +0930, Clytie Siddall wrote:
> 
> Can pmwiki.org not use UTF-8? As Hans was saying, the recent changes  
> result is affected too.

Not easily, or at least not throughout the entire site.  While it
seems like it would be nice to say "well, let's just use utf-8
everywhere", doing so poses some very real drawbacks:

  - Not all PHP installations include support for utf-8.  Thus,
    if PmWiki were to rely on the availability of PHP's utf-8 
    functions, it wouldn't work on many sites.

  - Even when it's included, PHP's support for utf-8 is still
    limited.  To give one particularly distressing example, PHP 
    can't easily distinguish between uppercase and lowercase utf-8 
    characters or convert between them, making the handling of 
    WikiWords and other similar patterns very difficult.

  - The choice of utf-8 versus other character encodings has
    real impacts on the internal storage of files and filenames in 
    PmWiki.  If a site chooses to use utf-8, then all of its pages
    will be stored using utf-8 encoded text, and its pagenames will
    have utf-8 encoded filenames.  While this works great on some
    platforms, there are many of us in the western world who still
    use editors and operating environments that work better with Latin
    encodings rather than utf-8 ones. 
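
To make the last two points concrete, here is a rough sketch.  It is
written in Python rather than PHP, purely as an illustration, and it is
not PmWiki code: the sample words and variable names are made up, and
Python's bytes.upper() (which only touches ASCII letters) stands in for
any case routine that works on bytes instead of characters.

```python
# A word with one accented letter, encoded two ways.
word = "école"
utf8 = word.encode("utf-8")      # 'é' becomes two bytes: 0xC3 0xA9
latin1 = word.encode("latin-1")  # 'é' becomes one byte:  0xE9

# bytes.upper() only converts ASCII a-z, so the two-byte 'é' is left alone.
byte_level = utf8.upper().decode("utf-8")   # "éCOLE": initial letter still lowercase

# Case conversion that understands characters gets the expected result.
char_level = word.upper()                   # "ÉCOLE"

# The same page name is a different byte sequence in each encoding, so the
# stored text and the filename differ depending on the site's choice.
page = "Café"
name_utf8 = page.encode("utf-8")      # b'Caf\xc3\xa9' (5 bytes)
name_latin1 = page.encode("latin-1")  # b'Caf\xe9'     (4 bytes)

# A tool that assumes Latin-1 shows the utf-8 bytes as two wrong characters.
misread = name_utf8.decode("latin-1")  # "CafÃ©"
```

A pattern that wants an uppercase first letter, as a WikiWord does,
would reject the byte-level result, and an editor expecting a Latin
encoding would show the misread form when opening the utf-8 file.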

Most sites don't have to worry about multiple character encodings --
they just select one encoding appropriate to the language and it gets
used throughout the site.  However, since pmwiki.org is used to host
multiple languages in different character encodings, I have to do a
little behind-the-curtain magic there to try to get each language's 
files to be stored and distributed in the encoding that is generally
most appropriate for that region/language.  On occasion the curtain
falls, and everyone gets to see the parts where pmwiki.org's magic 
isn't 100% successful.  :-)

Pm


