[pmwiki-users] Re: Modified (:markup:)
Joachim Durchholz
jo at durchholz.org
Sun Mar 20 16:38:41 CST 2005
chr at home.se wrote:
> Could this be made to nest properly?
Nesting is a *very* hairy issue if parsing is regex-based.
With standard regexes, it's impossible. (The CS slogan is "regexes can't
count". To do proper nesting, they'd have to count opening and closing
parentheses and suspend matching until the count reaches zero.)
Perl-style regexes aren't standard regexes. However, even these can't
count well enough to do nesting.
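To make the "counting" point concrete, here is a minimal sketch in Python (not PmWiki code; the function name and its parameters are made up for illustration). It finds the closing delimiter that matches a given opening one by tracking the nesting depth, which is exactly the bookkeeping a standard regex cannot do:

```python
def find_matching_close(text, start, open_ch="(", close_ch=")"):
    """Return the index of the delimiter that closes text[start], or -1.

    Hypothetical helper: it keeps an explicit depth counter and stops
    when the counter drops back to zero.
    """
    if text[start] != open_ch:
        raise ValueError("text[start] must be an opening delimiter")
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == open_ch:
            depth += 1          # one more level to close
        elif ch == close_ch:
            depth -= 1          # one level closed
            if depth == 0:
                return i        # this one closes text[start]
    return -1                   # ran off the end: unbalanced input
```

A non-greedy regex would stop at the first closing delimiter and a greedy one at the last; the counter is what picks the right one in between.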
There's an experimental "recursive pattern" feature, but the
descriptions on
http://www.php.net/manual/de/reference.pcre.pattern.syntax.php sounded
quite unattractive to me:
* needs PHP >= 5.0 (Debian woody currently is at 4.1.3)
* seems to require merging all delimiter pairs into a single regex
* inner structures need to be reparsed (substructures aren't capturable in $1, $2, ... variables)
* needs once-only subpattern hackery to avoid inefficiencies
Alternatively, we could roll our own parsing machinery. The ruleset
machinery would remain largely unchanged; only the nestable rules would
need their replacement deferred until the parser has had a chance to
establish the nesting structure. ("Parser" is one of those scare words,
but parsing parenthesis-style constructs isn't really difficult.)
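A rough sketch of what I mean, in Python rather than PHP and not tied to PmWiki's actual rule engine. The "{{" / "}}" delimiters and the names parse_nested and render are assumptions for illustration only. The first pass just establishes the nesting structure; the substitution is applied afterwards, innermost groups first:

```python
def parse_nested(text, open_tok="{{", close_tok="}}"):
    """Build a nesting tree: a list whose items are str or nested lists."""
    root = []
    stack = [root]              # stack[-1] is the node being filled
    i = 0
    while i < len(text):
        if text.startswith(open_tok, i):
            child = []
            stack[-1].append(child)
            stack.append(child)
            i += len(open_tok)
        elif text.startswith(close_tok, i) and len(stack) > 1:
            stack.pop()
            i += len(close_tok)
        else:
            # Plain text (including a stray close token at top level):
            # extend the previous string piece if there is one.
            if stack[-1] and isinstance(stack[-1][-1], str):
                stack[-1][-1] += text[i]
            else:
                stack[-1].append(text[i])
            i += 1
    if len(stack) > 1:
        raise ValueError("unclosed " + open_tok)
    return root


def render(node, substitute):
    """Apply `substitute` to each nested group, innermost groups first."""
    out = []
    for item in node:
        if isinstance(item, str):
            out.append(item)
        else:
            # The replacement is deferred until the inner text is final.
            out.append(substitute(render(item, substitute)))
    return "".join(out)


tree = parse_nested("a{{b{{c}}d}}e")
print(render(tree, lambda s: "<" + s + ">"))   # prints a<b<c>d>e
```

The point of the split is that the substitution function never sees delimiters at all, so it doesn't need to know anything about nesting.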
Nothing in this is exactly rocket science, but getting the details right
requires some careful design, and actually implementing it would also
take some serious effort.
Personally, I'd go for it despite the workload.
But that's just my personal preference; others might find other things
more relevant.
Just my 5c.
Regards,
Jo