[Pmwiki-users] Valid XHTML generation

Kirill Lapshin kir at lapshin.net
Wed Jun 4 09:10:30 CDT 2003


> I'm definitely aiming towards xhtml compatibility.  pmwiki-0.5.beta1
> fixes the problems with the uppercase tag names (xhtml tags are always
> lower case).  Getting the <p> and <li> tags to close properly may
> be a bit tricky, as you've already surmised, but I'm willing to work
> on it.  Fixing the <br> and <img> tags ought to be fairly simple.

Wow, that was fast! Honestly, I did not expect this problem to be resolved
anytime soon, so I started looking for a new wiki engine that would work
without a DB and generate valid XHTML. Now I will put that search on hold
and try to help you out with XHTML generation.

I don't know PHP well, but programming is my profession, and PHP is not a
tremendously difficult language. A somewhat bigger problem is that I am not
that familiar with the PmWiki design. So I guess I can't really do any
complicated development, but I can try to install the current development
snapshot, play with it, and maybe fix some bugs. If you could send me your
current version, I can give it a shot. Or maybe you have a public CVS?

> Are there any other issues I should know about?  Also, do all browsers
> understand XHTML?  I know that didn't used to be the case, and I don't
> want to switch everything to use XHTML only to discover that it causes
> some browsers to no longer be able to display PmWiki documents.

Well, I see quite a lot of sites switching to XHTML, so it should be
supported quite well by the vast majority of browsers. Also keep in mind
that for a page to be treated as real XML, you have to serve it with an XML
MIME type such as text/xml or application/xhtml+xml, not text/html. Serving
pages that way might cause problems with old browsers, I guess. By doing so
you are also making a commitment that you always generate well-formed
pages: if a browser gets a page as XML it will parse it strictly, and will
not show a page that is not well-formed.

I would suggest using text/html by default but giving the user the ability
to set the MIME type and DTD from local.php. For instance, I would use the
text/xml MIME type and supply the URL of a special MathML-enabled DTD.
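
A minimal sketch of what such a setting could look like. The variable names
$HTTPContentType and $HTMLDoctype and the helper content_type_header() are
hypothetical and are not existing PmWiki names; the utf-8 default charset is
also just an assumption for the example.

```php
<?php
// Hypothetical defaults the engine could define (not real PmWiki variables).
$HTTPContentType = 'text/html';
$HTMLDoctype = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
             . '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">';

// local.php would be loaded here and could override either value, e.g.:
//   $HTTPContentType = 'text/xml';
//   $HTMLDoctype     = '(a MathML-enabled doctype chosen by the site admin)';

// Build the Content-Type header line from the configured MIME type.
function content_type_header(string $mime, string $charset = 'utf-8'): string {
    return "Content-Type: $mime; charset=$charset";
}

// In the page output code, before anything is printed:
//   header(content_type_header($HTTPContentType));
//   echo $HTMLDoctype, "\n";
```

With this layout the default stays safe for old browsers, and only sites that
opt in from local.php take on the stricter XML parsing.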

It might be tricky, though, to make sure you generate valid XHTML all the
time. Closing tags is one problem; preventing improperly nested output like
<p><ul></p></ul> is the other.
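
One way to catch the nesting problem during development is a small
stack-based check run over the generated output. This is only a sketch: the
function name tags_properly_nested() and the list of empty elements are
assumptions for the example, and it checks nesting order only, not validity
against a DTD.

```php
<?php
// Returns true when every opened tag is closed in reverse order of opening.
function tags_properly_nested(string $html): bool {
    // Elements that never have a closing tag (illustrative subset).
    $empty = ['br', 'img', 'hr', 'input', 'meta', 'link'];
    $stack = [];
    // Capture: optional leading slash, tag name, optional trailing slash.
    preg_match_all('/<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(\/?)>/',
                   $html, $matches, PREG_SET_ORDER);
    foreach ($matches as [$whole, $closing, $name, $selfClosed]) {
        $name = strtolower($name);
        if ($selfClosed === '/' || in_array($name, $empty, true)) {
            continue;                       // <br />, <img ... /> and friends
        }
        if ($closing === '/') {
            if (array_pop($stack) !== $name) {
                return false;               // closes something other than the innermost tag
            }
        } else {
            $stack[] = $name;               // remember the newly opened tag
        }
    }
    return count($stack) === 0;             // anything left open is an error too
}
```

For the example above, tags_properly_nested('<p><ul></p></ul>') returns false
because </p> arrives while <ul> is still the innermost open tag.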

As far as old browser support goes, I know people suggest using <br />
instead of <br/> (adding a space before the slash); the same applies to
other empty tags. I have never seen a browser that does not understand <br/>
without the space, but the space will not hurt for sure, and may improve
rendering in some antique browsers.

----

I need your advice on plugging in LaTeX formula parsing. Suppose I want to
support the following syntax:
\[ formula goes here \]
It is a common LaTeX notation and it should not conflict with Wiki markup.

Now the problem is that I can't really use InlineReplacements or
DoubleBrackets, because I have to call a special function that will parse
the formula and spit out MathML markup. So I need some way, preferably from
local.php, to say: whenever this regex matches, replace its contents with
the output of this function. Currently I have modified pmwiki.php to have
something like:
while (preg_match(...)){
    $text = preg_replace( ... $match[1] .. )
}
Another tricky part is that formulas might be quite long, so they are going
to span multiple lines. A further complication is that the output might
contain something like <math xmlns="some URL here">...</math>, and currently
this URL is being processed by your code that links URLs. I can avoid this
by doing the LaTeX formula processing at the very end and hoping that the
other markup processing will not change anything in the formula.
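
A hedged sketch of the multi-line replacement, using PHP's
preg_replace_callback() in place of the preg_match loop. The names
replace_formulas() and latex_to_mathml() are made up for illustration, and
the converter here is only a stand-in that wraps the source text; a real one
would emit proper MathML.

```php
<?php
// Stand-in for the real LaTeX-to-MathML converter (hypothetical).
function latex_to_mathml(string $latex): string {
    return '<math xmlns="http://www.w3.org/1998/Math/MathML"><mtext>'
         . htmlspecialchars(trim($latex)) . '</mtext></math>';
}

// Replace every \[ ... \] span with the converter's output.
function replace_formulas(string $text, callable $convert): string {
    // The pattern matches a literal "\[" up to the nearest "\]".
    // The "s" modifier lets "." match newlines, so formulas may span lines;
    // ".*?" is non-greedy, so two formulas on a page stay separate.
    $result = preg_replace_callback(
        '/\\\\\[(.*?)\\\\\]/s',
        function (array $m) use ($convert) { return $convert($m[1]); },
        $text
    );
    return $result ?? $text;   // on a regex error, leave the text untouched
}

// Meant to run as the last step, after all other markup has been processed:
//   $text = replace_formulas($text, 'latex_to_mathml');
```

Running it last keeps the URL-linking code from seeing the xmlns attribute,
at the cost of relying on earlier markup rules leaving the formula alone.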

I can see two solutions:
1. I change pmwiki.php to do all the formula processing only if some
global variable like $ProcessLatex is set to true.
2. You provide some hooks so that I can do this as custom markup
from local.php.

The second solution seems cleaner, but it is more work.
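
To make the second option concrete, here is one possible shape for such a
hook. Everything in it is an assumption: $CustomMarkup and
apply_custom_markup() are invented names, not existing PmWiki features, and
the callback is a placeholder rather than a real formula converter.

```php
<?php
// Hypothetical hook table kept by the engine: regex pattern => callback.
$CustomMarkup = [];

// What local.php might register for the \[ ... \] syntax:
$CustomMarkup['/\\\\\[(.*?)\\\\\]/s'] = function (array $m): string {
    // Placeholder output; a real hook would call the formula converter.
    return '<math>' . htmlspecialchars(trim($m[1])) . '</math>';
};

// What the engine would call after its built-in markup rules have run.
function apply_custom_markup(string $text, array $hooks): string {
    foreach ($hooks as $pattern => $callback) {
        // Keep the previous text if a pattern fails to compile or run.
        $text = preg_replace_callback($pattern, $callback, $text) ?? $text;
    }
    return $text;
}
```

The engine only needs the loop; all formula-specific code stays in local.php,
which is what makes this cleaner than a $ProcessLatex switch.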

----

One more thing. When I first tried PmWiki, I was quite puzzled that even
though you support multilevel numbered and bulleted lists, one cannot mix
styles on different levels. E.g.:
# one
#* bullet
# two
#* bullet

It would be nice to have, and should not be hard to implement. What do you
think?
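
A rough sketch of how mixed prefixes could be rendered, assuming each
character of the prefix picks the list type for its level. The function
render_lists() is hypothetical and is not taken from pmwiki.php; it also
closes every <li> and list tag, which matters for the XHTML goal.

```php
<?php
// Turn lines such as "# one" / "#* bullet" into nested <ol>/<ul> markup.
function render_lists(array $lines): string {
    $tag = ['#' => 'ol', '*' => 'ul'];
    $stack = [];   // markers of the currently open lists, outermost first
    $out = '';
    foreach ($lines as $line) {
        if (!preg_match('/^([#*]+)\s*(.*)$/', $line, $m)) {
            continue;                            // not a list line
        }
        $prefix = str_split($m[1]);
        // Count how many leading levels match the lists already open.
        $keep = 0;
        while ($keep < count($stack) && $keep < count($prefix)
               && $stack[$keep] === $prefix[$keep]) {
            $keep++;
        }
        // Close every list deeper than the shared levels.
        while (count($stack) > $keep) {
            $out .= '</li></' . $tag[array_pop($stack)] . '>';
        }
        if ($keep === count($prefix)) {
            $out .= '</li><li>';                 // next item in the same list
        } else {
            for ($i = $keep; $i < count($prefix); $i++) {
                $out .= '<' . $tag[$prefix[$i]] . '><li>';   // open new levels
                $stack[] = $prefix[$i];
            }
        }
        $out .= htmlspecialchars($m[2]);
    }
    while ($stack) {                             // close whatever is still open
        $out .= '</li></' . $tag[array_pop($stack)] . '>';
    }
    return $out;
}
```

For the four example lines above, this yields an <ol> whose two items each
contain a nested <ul> with one bullet.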




