[pmwiki-users] special characters revealing anchors

adam overton a at plus1plus1plus.org
Sun Nov 16 22:49:51 CST 2008


cool, so this was from a little while ago, but i finally figured out
what was meant by "redefine the anchor rule so that it recognizes
non-ASCII characters in anchors" (duh - took me a minute to realize what
was meant) - so i found the rule and changed it (in my config.php) to:

## [[#anchor]]
Markup('[[#','<[[','/(?>\\[\\[#([A-Za-z0-9\w][-.:\\w]*))\\]\\]/e',
   "Keep(TrackAnchors('$1') ? '' : \"<a name='$1' id='$1'></a>\", 'L')");


while fixing this, i also discovered that the stock anchor markup won't
allow anchors that begin with a digit, so the rule above accepts those too.
anyhow, this works for my needs, as it doesn't seem to break any code
(holler if you think otherwise), and i'm happy with an anchor or two
failing (now silently) here and there for my purpose. for the
curious, i've noticed that anchors beginning with either a special
character or a digit seem to work fine in firefox, whereas in safari
only the ones that start with a digit seem to work.
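
for illustration, here is a minimal sketch in python (not php, and not
pmwiki's own code) comparing the strict first-character rule with the
loosened one. the names STRICT, LOOSE and accepted are made up for this
example. note the assumption: python's \w is unicode-aware for text
patterns, while what \w matches in php's PCRE may depend on locale and
encoding, so this only shows the idea of the two character classes.

```python
import re

# Strict rule, as quoted from the HTML 4 spec in this thread: a leading
# ASCII letter, then letters, digits, hyphens, underscores, colons, periods.
# re.ASCII restricts \w to [A-Za-z0-9_].
STRICT = re.compile(r"[A-Za-z][-.:\w]*", re.ASCII)

# Loosened rule, mirroring the character classes in the modified Markup()
# call: any "word" character (including a digit) may lead.
# Assumption: Python's Unicode-aware \w stands in for PCRE's \w here.
LOOSE = re.compile(r"[A-Za-z0-9\w][-.:\w]*")


def accepted(pattern, anchor):
    """Return True if the whole anchor name matches the given rule."""
    return pattern.fullmatch(anchor) is not None


if __name__ == "__main__":
    for name in ["adroite", "\u00e0droite", "1st", "#x"]:
        print(name, accepted(STRICT, name), accepted(LOOSE, name))
```

under these assumptions the strict rule rejects both the accented and the
digit-led names, and the loosened rule accepts them.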

thanks for the leads, Patrick & JF.
adam


On 28 Oct 2008, at 8:03 AM, Patrick R. Michaud wrote:

> On Tue, Oct 28, 2008 at 09:42:38AM +0100, Jean-Fabrice [gmail] wrote:
>> 2008/10/28 adam overton <a at plus1plus1plus.org>:
>>> i recently discovered some broken/visible anchors on a user's page.
>>> his use of a special character at the beginning of the anchor seems
>>> to be the culprit, as it causes the anchor to become visible (special
>>> characters within a word don't seem to cause problems). here is an
>>> example;
>>>
>>>     [[#àdroite]]
>>
>> afaik, pmwiki respects w3c standards and recommendations while this
>> syntax ([[#àdroite]]) does not.
>> Take a look at http://www.w3.org/TR/REC-html40/types.html#type-name :
>> ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
>> followed by any number of letters, digits ([0-9]), hyphens ("-"),
>> underscores ("_"), colons (":"), and periods (".").
>
> This is correct -- PmWiki follows the w3c standards here, and only
> recognizes A-Za-z, digits, hyphens, underscores, colons, and periods
> in anchors.  Anything else causes PmWiki to not recognize the [[#...]]
> as an anchor.
>
> There has been some discussion of getting PmWiki to automatically fold
> non-ASCII characters into the ASCII set, so that [[#àdroite]] would
> generate "adroite" in the anchor tag, and thus be valid HTML.  But
> this would undoubtedly confuse people because a url ending with
> ...#àdroite would not find the anchor.
>
> It's also possible to redefine the anchor rule so that it
> recognizes non-ASCII characters in anchors and uses them in the
> output; this of course results in invalid HTML if used (at least
> according to the current spec).
>
> Pm
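
the "fold non-ASCII characters into the ASCII set" idea mentioned above
could look roughly like the following python sketch. this is only an
illustration of the technique (unicode decomposition, then dropping
combining marks), not anything pmwiki does; fold_accents is a hypothetical
name. it only handles characters that decompose into a base letter plus
accents, and leaves everything else untouched.

```python
import unicodedata


def fold_accents(anchor):
    """Strip accents by decomposing characters and dropping combining marks.

    Hypothetical helper for illustration; characters with no decomposition
    are returned unchanged.
    """
    decomposed = unicodedata.normalize("NFKD", anchor)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    requested = "\u00e0droite"          # the anchor as the user typed it
    generated = fold_accents(requested)  # the name that would go in the tag
    print(requested, generated, requested == generated)
```

the last comparison is the catch Pm describes: the generated name no
longer equals what the user typed, so a url ending in the original
fragment would not find the anchor.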



