[Pmwiki-users] stdmarkup.php quotes (was: (:addcomment:) markup)

Patrick R. Michaud pmichaud
Thu Oct 28 14:13:41 CDT 2004


On Thu, Oct 28, 2004 at 09:07:29PM +0200, Knut Alboldt wrote:
> 
> BTW one question to stdmarkup.php:  often, you used double quotes in the 3. 
> and 4. argument of markup() which makes the regex more complicated (using 
> more escaping \) and more difficult to read.
> 
> Markup('if','fulltext',"/\\(:(if[^\n]*?):\\)(.*?)(?=\\(:if[^\n]*?:\\)|$)/se",
>   "CondText(\$pagename,PSS('$1'),PSS('$2'))");
> 
> could it be :
> 
> Markup('if','fulltext','/\(:(if[^\n]*?):\)(.*?)(?=\(:if[^\n]*?:\)|$)/se',
>   'CondText($pagename,PSS('$1'),PSS('$2'))');

Short answers: 
   Argument 3:  Probably no, because of the \n's.
   Argument 4:  No, because of the way PHP's preg_replace function modifies
      quotation marks in the $1 and $2 arguments.

Long answer for argument #4
---------------------------
Nope, especially for argument #4.  First, note that

     'CondText($pagename,PSS('$1'),PSS('$2'))'

is a syntax error; you'd have to use either
     
     'CondText($pagename,PSS("$1"),PSS("$2"))'
or
     'CondText($pagename,PSS(\'$1\'),PSS(\'$2\'))'

and the second one is actually longer and has more escapes than the original.
But even then there's a problem: the "/e" option in the 3rd argument
causes PHP's preg_replace() function to automatically put backslashes in front
of any single quotes it finds in $1 or $2 if argument 4 is a single-quoted
string, or in front of any double quotes it finds in $1 or $2 if argument 4
is a double-quoted string.  Note that it doesn't add backslashes in front of
every quote it finds, only the kind that matches the quoting of argument 4.
Note also that it doesn't put backslashes in front of backslashes it finds
in $1 or $2, so you can't just strip or process any backslashes you happen
to find, because $1 might really have contained a backslash somewhere in
the original text.

This is the purpose of the PSS() function -- it fixes any backslash+quote
sequences that might be in its argument.  But to do this it has to
know what kind of string was used in the replacement, so I wrote PSS to
always assume argument #4 is a double-quoted string.  I could've chosen to
make it work with single-quoted strings, but I decided that always
using double-quoted strings is better: there I can prevent
$-interpolation with a backslash, whereas there's no way for me to force
$-interpolation into a single-quoted string if I need it.
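
A small sketch of the $-interpolation point, using a made-up page name
(the value of $pagename here is an assumption for illustration only).
It suggests that the double-quoted form can switch interpolation on or off,
while the single-quoted form can only ever produce the literal text:

```php
<?php
// Hypothetical value, used only for this illustration.
$pagename = 'Main.HomePage';

// Double quotes: a backslash suppresses interpolation of $pagename.
// "$1" and "$2" are left alone because a variable name cannot start with a digit.
$literal = "CondText(\$pagename,PSS('$1'),PSS('$2'))";

// Double quotes without the backslash: PHP substitutes the variable.
$interpolated = "CondText($pagename)";

// Single quotes: the escaped spelling yields the same text as $literal,
// but there is no way to ask for interpolation.
$single = 'CondText($pagename,PSS(\'$1\'),PSS(\'$2\'))';

var_dump($literal === $single);   // bool(true)
echo $interpolated, "\n";         // CondText(Main.HomePage)
```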

(Whew!)

Note that you don't have to use PSS() if you know the replacement
string won't contain any double quotes, or if you're not using the /e
option in the 3rd argument.  And in these cases you can use whatever
quoting style you want in the 4th argument.
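
As a rough sketch of the kind of cleanup described above, a function along
these lines would undo backslash+double-quote pairs while leaving other
backslashes untouched.  The name unescape_dq() is hypothetical and this is
not claimed to be PmWiki's actual PSS() code; the input below only simulates
what a capture might look like after the /e escaping step:

```php
<?php
// Hypothetical stand-in for PSS(); not necessarily PmWiki's real definition.
// Turns each backslash+double-quote pair back into a plain double quote
// and leaves every other backslash alone.
function unescape_dq(string $text): string
{
    return str_replace('\\"', '"', $text);
}

// Simulated capture after escaping; the original text was:
//   say "hi" to C:\temp
$escaped = 'say \\"hi\\" to C:\\temp';

echo unescape_dq($escaped), "\n";   // say "hi" to C:\temp
```

The lone backslash in C:\temp survives, which is the reason a blanket
"remove all backslashes" pass would not be safe.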


Long answer for argument #3
---------------------------

You can't simply replace

   "/\\(:(if[^\n]*?):\\)(.*?)(?=\\(:if[^\n]*?:\\)|$)/se"

with

   '/\(:(if[^\n]*?):\)(.*?)(?=\(:if[^\n]*?:\)|$)/se'

because the first string ends up having newlines in it while the
second string contains backslash+n, and I'm not sure that preg_replace()
correctly treats \n as "newline" when it runs.  I've had problems
with it in the past, where I've had to resort to \\n or \\\n or \\\\n
to get it to work right.  Doing it in double quotes "just works".
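
The difference between the two spellings can be seen by looking at what
actually reaches the regex engine.  This is only a sketch with a shortened
pattern; on a current PHP build both forms match the same text, which says
nothing about how the PHP versions of the time behaved:

```php
<?php
// Double quotes: PHP turns \n into a real newline before PCRE sees the pattern.
$double = "/[^\n]+/";
// Single quotes: PCRE receives backslash + n and has to interpret it itself.
$single = '/[^\n]+/';

var_dump(strlen($double));   // int(7)
var_dump(strlen($single));   // int(8)

// Both patterns stop at the first newline in the subject.
preg_match($double, "one\ntwo", $m1);
preg_match($single, "one\ntwo", $m2);
var_dump($m1[0] === $m2[0]);  // bool(true)
```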

And it's probably not necessary to double all of the backslashes in the
double-quoted version anyway--one could probably do

   "/\(:(if[^\n]*?):\)(.*?)(?=\(:if[^\n]*?:\)|$)/se"

and everything would work all right.  But I've found that PHP's rules for 
backslash processing are just plain weird, in that "\n" is a newline, 
"\t" is a tab, but "\c" is a backslash+c.  (Most languages would 
typically replace "\c" with just 'c'.)  And you can't rely on letters 
vs. punctuation to know what will happen, either: "\#" is a backslash+#, 
but "\$" is a single dollar-sign, and "\"" is a double-quote while "\'" 
is backslash+single-quote.  (And '\'' is a single-quote while '\"' is a 
backslash+double-quote.  Got it? :)
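
The cases listed above can be checked directly.  A minimal sketch (the
array labels are just names chosen for this example):

```php
<?php
// Each entry is true when PHP behaves as described in the text.
$checks = [
    'dq \n is newline'           => "\n" === chr(10),
    'dq \t is tab'               => "\t" === chr(9),
    'dq \c keeps the backslash'  => strlen("\c") === 2,
    'dq \# keeps the backslash'  => strlen("\#") === 2,
    'dq \$ is a dollar sign'     => "\$" === '$',
    'dq \" is a double quote'    => "\"" === '"',
    "dq \\' keeps the backslash" => strlen("\'") === 2,
    "sq \\' is a single quote"   => '\'' === "'",
    'sq \" keeps the backslash'  => strlen('\"') === 2,
];

foreach ($checks as $label => $ok) {
    echo $label, ': ', $ok ? 'as described' : 'differs', "\n";
}
```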

So, to keep myself from getting tripped up on PHP's nuances, I've
adopted the personal guideline that anytime I want a backslash in the 
result string I always write \\, even if PHP might let me get away with a 
single \.  That way, if I later change the character following the
backslash or the quotes around the string, it still does exactly what I
intended in the first place.
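
A short sketch of why the doubled backslash is the safer habit (the variable
names are made up for the example).  The doubled form gives the same result
in either quoting style, while the single-backslash form depends on which
character happens to follow:

```php
<?php
// Doubled backslash: backslash + "(" in both quoting styles.
$dq_doubled = "\\(";
$sq_doubled = '\\(';
// Single backslash: works here only because "\(" is not a recognized escape.
$dq_single = "\(";

var_dump($dq_doubled === $sq_doubled);   // bool(true)
var_dump($dq_doubled === $dq_single);    // bool(true)

// Change the following character to n and the single-backslash form breaks:
var_dump("\\n" === '\\n');   // bool(true): still backslash + n
var_dump("\n" === "\\n");    // bool(false): "\n" has become a newline
```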

Pm


