[pmwiki-users] UTF-8 in page names

Hans Bracker design at softflow.uk
Sat Feb 11 05:45:54 PST 2023


Hi Petko,
on the subject of UTF-8 in page names I would like to go further.
I am trying to write code, hopefully for a recipe, so that with UTF-8 enabled one can easily create very readable page names by switching from CamelCase naming to "-" dashes (hyphens) as separators between words, without insisting that each word starts with an upper-case letter.

I have this as $MakePageNamePatterns, which works quite well:

  $MakePageNamePatterns = array(
    '/[?#].*$/' => '',                # strip everything after ? or #
    "/'/" => '',                      # strip single-quotes
    "/[^$PageNameChars]+/" => '-',    # convert everything else to hyphen
    '/ /' => '',                      # strip spaces
    '/-+/' => '-',                    # collapse repeated hyphens
  );

This lets me create nice URLs, with dashes between words, and with words in lower case or with upper-case letters, just as they are written in the link. Basically I can throw nearly anything between [[ and ]] and it comes out as a very readable URL.

But when I click such a link to open the edit form for the new page, PmWiki seems to insist on capitalizing the first letter of the name, and of the group if it is a new group. I wonder if this could be prevented, and how? I cannot find the code in PmWiki where this happens (FmtPageName??).

So from a link like "[[car wash/rinsing the car]]"
I would like to create a page "car-wash.rinsing-the-car",
but I get a page "Car-wash.Rinsing-the-car".
UTF-8 characters work fine in such links with UTF-8 enabled.
Writing links with different combinations of upper and lower case letters also works well, perhaps because of $StrFoldFunction = 'UnaccentUTF8'; or for some other reason I do not yet understand.
[[Rinsing the Car]], [[rinsing the car]], [[rinsing-the-car]] will all open the page if it exists, which is fine.
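
As a rough illustration of how the patterns above turn a link target into a page name, here is a standalone sketch that applies them in order with preg_replace. It is not PmWiki's own code: the helper names (dash_patterns, apply_dash_patterns, dash_page_name) are made up for this example, the value of PAGE_NAME_CHARS is assumed to be PmWiki's default for $PageNameChars (UTF-8 setups use an extended class), and the group/name split is a simplification.

```php
<?php
// Assumption: PmWiki's default $PageNameChars; UTF-8 setups extend this class.
const PAGE_NAME_CHARS = '-[:alnum:]';

// The replacement patterns from the post, in the order they are applied.
function dash_patterns() {
  return array(
    '/[?#].*$/' => '',                       // strip everything after ? or #
    "/'/" => '',                             // strip single-quotes
    '/[^' . PAGE_NAME_CHARS . ']+/' => '-',  // convert everything else to hyphen
    '/ /' => '',                             // strip spaces
    '/-+/' => '-',                           // collapse repeated hyphens
  );
}

// Hypothetical helper: run one group or name part through the patterns.
function apply_dash_patterns($text) {
  foreach (dash_patterns() as $pattern => $replacement) {
    $text = preg_replace($pattern, $replacement, $text);
  }
  return $text;
}

// Hypothetical helper: split "group/name" (or "group.name") first,
// then convert each part separately and join them with a dot.
function dash_page_name($target) {
  $parts = preg_split('/[.\\/]/', $target, 2);
  return implode('.', array_map('apply_dash_patterns', $parts));
}

echo dash_page_name('car wash/rinsing the car'), "\n";  // car-wash.rinsing-the-car
```

Note that nothing in these patterns changes letter case, so the sketch returns the name exactly as typed apart from the hyphens.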

I know one problem with using "-" dashes instead of CamelCase is links to the PmWiki, Site and SiteAdmin groups, and links inside those groups. But this can be overcome by using the original $MakePageNamePatterns for those groups, and either being careful with links from other groups to pages in those groups, or using an alternative function for $MakePageNameFunction. I have written one in which $MakePageNamePatterns is set according to the group the link points to.
That way all works quite smoothly.
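
The idea of choosing the pattern set by target group could look roughly like the sketch below. This is not the function mentioned above, only a standalone illustration: all names (NAME_CHARS, dash_style_patterns, camel_style_patterns, apply_pattern_set, make_page_name_by_group) are hypothetical, the CamelCase patterns are modelled on what I assume PmWiki's defaults to be, and the wiring into $MakePageNameFunction is left out.

```php
<?php
// Assumption: PmWiki's default $PageNameChars.
const NAME_CHARS = '-[:alnum:]';

// Dash-style patterns, as in the post.
function dash_style_patterns() {
  return array(
    '/[?#].*$/' => '',
    "/'/" => '',
    '/[^' . NAME_CHARS . ']+/' => '-',
    '/ /' => '',
    '/-+/' => '-',
  );
}

// CamelCase-style patterns (assumed to resemble the PmWiki defaults):
// separators become spaces, each word start is upper-cased, spaces are removed.
function camel_style_patterns() {
  return array(
    '/[?#].*$/' => '',
    "/'/" => '',
    '/[^' . NAME_CHARS . ']+/' => ' ',
    '/((^|[^-\\w])\\w)/' => function ($m) { return strtoupper($m[1]); },
    '/ /' => '',
  );
}

// Apply a pattern set in order; closures are used as replacement callbacks.
function apply_pattern_set($patterns, $text) {
  foreach ($patterns as $pattern => $replacement) {
    $text = ($replacement instanceof Closure)
      ? preg_replace_callback($pattern, $replacement, $text)
      : preg_replace($pattern, $replacement, $text);
  }
  return $text;
}

// Pick the pattern set according to the group the link points to.
function make_page_name_by_group($group, $name) {
  $camelGroups = array('pmwiki', 'site', 'siteadmin');
  $key = strtolower(apply_pattern_set(camel_style_patterns(), $group));
  $patterns = in_array($key, $camelGroups, true)
    ? camel_style_patterns()
    : dash_style_patterns();
  return apply_pattern_set($patterns, $group) . '.' . apply_pattern_set($patterns, $name);
}

echo make_page_name_by_group('PmWiki', 'basic editing'), "\n";      // PmWiki.BasicEditing
echo make_page_name_by_group('car wash', 'rinsing the car'), "\n";  // car-wash.rinsing-the-car
```

The list of CamelCase groups is hard-coded here only to keep the example short; in a real recipe it would more likely be a configurable variable.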

But I still get those capitalizations of the first letter of Group and Name, which I would love to cancel, if possible.


~Hans



