[Pmwiki-users] Re: null characters or pattern breaking characters
Christian Ridderström
chr
Wed Jan 14 05:56:23 CST 2004
I forgot to send this for a few days, so I hope it's still useful :-)
/Christian
On Mon, 12 Jan 2004, Patrick R. Michaud wrote:
> I've done a bit more research and several comments come to mind:
>
> 1. It'd be really handy if the "null character" sequence began
> with a character that's already not considered to be part of a valid
> URI. In PmWiki that set is currently
> space < > [ ] " ' ( )
Why would it be handy?
(In my original suggestion, the null-token would be substituted with a
'null-token-character' before any other substitutions are done. Then the
usual substitutions etc are performed and finally, during the actual
output of HTML-code the 'null-token-character' is just removed.)
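
To make the order of steps above concrete, here is a minimal sketch in Python. It is not PmWiki code: the token `&NULL;`, the placeholder character, the `render` function, and the simplified URL rule are all assumptions chosen only for illustration.

```python
import re

NULL_TOKEN = "&NULL;"   # hypothetical markup typed by the author
NULL_CHAR = "\x00"      # hypothetical internal placeholder character

# Simplified stand-in for a wiki's URL rule: a URL ends at whitespace,
# at the placeholder, or at one of a few delimiter characters.
URL_PATTERN = re.compile(r"""https?://[^\s\x00<>\[\]"'()]+""")


def render(text):
    # Step 1: swap the null token for the placeholder before any other rule runs.
    text = text.replace(NULL_TOKEN, NULL_CHAR)
    # Step 2: the usual substitutions (only URL linking in this sketch).
    text = URL_PATTERN.sub(
        lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)), text
    )
    # Step 3: drop the placeholder while producing the final HTML.
    return text.replace(NULL_CHAR, "")


print(render("http://example.org/page&NULL;s"))
print(render("http://example.org/pages"))
```

In the first call the link stops before the trailing "s" because the placeholder breaks the pattern; in the second call the whole string is linked.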
> 2. After re-reading RFC 2396 and RFC 2732, it's apparent that there are
> characters that are not allowed in URIs that PmWiki currently
> allows. In particular, the following characters are not allowed
> in the path component of URIs:
> space < > " { } | \ ^ `
> Of course, this doesn't mean that there aren't people and systems that
> build URIs using these characters (e.g., the vertical bar)--it just
> means that those URIs aren't technically valid. So, there's a reasonable
> argument to be made that PmWiki should add each of the above to the
> URI delimiter
I agree that pmwiki definitely shouldn't include the characters above in
URIs (the user will just have to write them using %xx).
>, which would likely eliminate much of the need for the
> null character sequence in the first place (unless I'm missing a case).
If you mean situations where it's *necessary* to use a null-token, that
might be true, but there are still cases where using a null token is very
convenient. For instance, I just saw that this is how I
should quote directives:
[=[[=]include:...]]
which looks a bit confusing to me, and you have to be careful how you
nest the brackets... OTOH, with a null token, any of these alternative
forms could be used:
token=  &NULL;                 &;                 ``
        [[&NULL;include:...]]  [[&;include:...]]  [[``include:...]]
        [&NULL;[include:...]]  [&;[include:...]]  [``[include:...]]
        [[include:...]&NULL;]  [[include:...]&;]  [[include:...]``]
where they have the advantage that you can put the null token anywhere.
----
A parallel idea: Maybe backticks could escape directives? E.g.
`[[include:...]]
I'm not sure about the implementation though... what would this produce:
`[[http://www.bla.org [[http://www.bla.org]]]]
----
> 3. On the other hand, PmWiki sometimes departs from rigorously following
> a standard in order to be consistent with common practice or meet other
> goals. For example, parentheses and single quotes *are* valid characters in
> a URI, but PmWiki excludes them from the URI sequence because they're
> more commonly used in PmWiki as delimiters than as components of other
> URIs. So, just as PmWiki disallows some characters that the URI spec
> allows, there may be practical reasons that PmWiki should continue to
> allow characters that the URI spec disallows.
I think it's perfectly fine for PmWiki's URI patterns to be *overly*
restrictive and only swallow a subset of the allowed characters,
because the user can always write the unusual characters using %xx.
(We should add some documentation of this though, and a small table of
the codes for some of the unusual (but valid) characters.)
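
As a starting point for such a table, the codes can be generated with Python's standard `urllib.parse.quote`. This is only a sketch: the helper name `percent_codes` and the selection of characters (the ones mentioned in this thread) are my own choices, not anything PmWiki defines.

```python
from urllib.parse import quote


def percent_codes(chars):
    # Map each character to its %xx escape; safe="" forces every
    # character outside letters, digits and "_.-~" to be encoded.
    return {ch: quote(ch, safe="") for ch in chars}


# Characters discussed in this thread: the ones not allowed in URI paths,
# plus the parentheses and single quote that PmWiki treats as delimiters.
table = percent_codes(" <>\"{}|\\^`()'")

for ch, code in table.items():
    print(repr(ch), code)
```

For example, a space comes out as %20, the vertical bar as %7C, and a single quote as %27.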
> 4. Finally, after writing #3 above it occurs to me that we already have
> a null character sequence that would work: '''' (four single quotes).
> PmWiki already excludes single quotes from the URI pattern, and four
> single quotes becomes an empty italics sequence. In fact, this is the
> common "null character" sequence in many existing wikis, which use it
> for pluralization and alternate endings of WikiWord''''s.
> http://www.pmichaud.com/wiki/Test/InterLinkPattern demonstrates that
> this works as desired.
It works sometimes, but sometimes it fails... see this page
for examples:
http://www.pmichaud.com/wiki/Test/NullTokenTests
/Christian
--
Dr. Christian Ridderström, +46-8-768 39 44 http://www.md.kth.se/~chr