[pmwiki-users] Search for terms with ss and ß
Petko Yotov
5ko at 5ko.fr
Thu Feb 2 06:32:08 PST 2023
On 02/02/2023 13:29, Hans wrote:
> I noticed when searching and the query contains a 'ss' or a 'ß' , that
> PmWiki will search for both and deliver the right results, seemingly
> treating 'ss' and 'ß' as equivalent. This is great, but I wonder how
> it is done, as it may be a useful behaviour for TextExtract too.
On pmwiki.org we have enabled UTF-8 and this conversion is done in the
function utf8fold() in scripts/xlpage-utf-8.php.
This function "folds" letters, that is, it converts to lowercase those
letters which have a lowercase form. This is done before saving the page
terms in wiki.d/.pageindex.
When someone searches, the search terms are folded in the same way.
utf8fold() uses the $StringFolding array which defines "ß" ("\xc3\x9f")
as "ss".
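
To illustrate the idea (this is not the PmWiki code, which is PHP), here is a minimal Python sketch of table-driven folding. The names STRING_FOLDING and fold are hypothetical; only the "ß" to "ss" entry comes from the description above, and the real $StringFolding array may contain many more entries.

```python
# Hypothetical folding table, loosely modelled on the idea of $StringFolding.
# Only the "ß" -> "ss" entry is taken from the message; nothing else is assumed.
STRING_FOLDING = {
    "ß": "ss",  # "ß" is the byte sequence \xc3\x9f in UTF-8
}


def fold(text: str) -> str:
    """Lowercase the text, then apply the explicit replacements in the table."""
    folded = text.lower()
    for source, target in STRING_FOLDING.items():
        folded = folded.replace(source, target)
    return folded


# Both spellings fold to the same string, so they match each other.
print(fold("Straße") == fold("STRASSE"))
```

Because both the stored terms and the query pass through the same folding, neither side needs to know which spelling the other used.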
Normally you should use the global $StrFoldFunction(terms) to fold your
search terms - this ensures you use the same function as the one that is
used when storing the page index data.
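
The reason for going through one configurable function can be sketched as follows, again in Python rather than PmWiki's PHP. Everything here is an assumption for illustration: str_fold_function stands in for $StrFoldFunction, and build_index and search are hypothetical helpers, not the way PmWiki builds or reads wiki.d/.pageindex.

```python
from typing import Callable, Dict, List, Set


def simple_fold(text: str) -> str:
    """Assumed default folding: lowercase, and treat "ß" as "ss"."""
    return text.lower().replace("ß", "ss")


# Hypothetical stand-in for $StrFoldFunction: one name that both the
# indexing code and the search code look up when they need to fold.
str_fold_function: Callable[[str], str] = simple_fold


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Store the folded terms of each page (a toy page index)."""
    return {name: set(str_fold_function(text).split())
            for name, text in pages.items()}


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Fold the query the same way, then require every term to be present."""
    terms = str_fold_function(query).split()
    return sorted(name for name, words in index.items()
                  if all(term in words for term in terms))


pages = {"Main.Street": "Die Straße ist lang", "Main.Other": "Kein Treffer"}
index = build_index(pages)
print(search(index, "STRASSE"))
print(search(index, "straße"))
```

If the query were folded with a different function than the index, a term could be stored in one form and looked up in another, and the search would silently miss it.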
I recently wrote a function that replaces accented letters with plain
ones, so you can search for "voilà" or "voila" and it will find both
(also in Cyrillic).
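
The function mentioned here is not published, so the following is only a guess at one possible technique, shown in Python: decompose each character with Unicode normalization and drop the combining marks. The name fold_accents is hypothetical, and the actual function may well use a lookup table instead.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Lowercase, decompose (NFD), and drop combining marks such as accents."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Latin and Cyrillic letters with diacritics reduce to their base letters.
print(fold_accents("Voilà") == fold_accents("voila"))
print(fold_accents("ёлка") == fold_accents("елка"))
```

Note that this approach alone does not turn "ß" into "ss", since "ß" has no Unicode decomposition; a replacement table would still be needed for such letters.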
When I find the time to publish it and document it as a cookbook recipe,
it will work by redefining the $StrFoldFunction variable to the name of
the custom function. So again you would use $StrFoldFunction(terms), as
will PmWiki.
Petko