[pmwiki-users] Using inote markup as a systemwide LatestNews flag/sticky

Patrick R. Michaud pmichaud at pobox.com
Thu Aug 11 21:00:39 CDT 2005


On Fri, Aug 12, 2005 at 12:15:28PM +1200, John Rankin wrote:
> This raises an issue which I have been dodging for a while.
> I would welcome advice and help.
> 
> It's *really* *really* hard to transform wikistyles and style blocks 
> into print for our wikipublisher typesetting project. I think I'm 
> missing something at the conceptual level.

No, I think you're running into the same problem I struggled with
for many years -- the inherent limitations of HTML.  In theory
HTML is supposed to be a semantic markup language.  Unfortunately,
it doesn't provide any way to create new custom semantics (i.e.,
custom tags) -- the best one can do is overload an existing
semantic such as <div> or <span> and then do the rest in CSS.

In my ideal world, HTML would provide a way for me to define
a custom <postitnote>...</postitnote> tag, and then use CSS
styles to tell the browser how it should be rendered --
something like:

    postitnote { display:block; float:right; width:200px; }

Instead the best we can generally do is <div class='postitnote'>, 
which is somewhat less than ideal because semantic information
associated with 'postitnote' is now tied into the class attribute 
and not into the HTML tag (where the semantics are *supposed* to go).

And yes, XML holds the promise to make it possible for people
to define custom tags, but the current problem is still that 
the custom tags have to be defined *somewhere else* (i.e., a DTD).  
What's worse, that someplace else uses a totally different language
with a different syntax that has to be learned, and browsers
don't really know what to do with it anyway.

(I haven't looked deeply into XSLT to see if it holds any answers
yet.  But you get the idea -- at present we're limited to 
whatever tags HTML gives us.)

> <issue>
> What to do with >>blocks<< -- the best idea we have come up with is
> to treat these as LaTeX mini-pages and wrap them in a <group> tag
> on output from pmwiki. The problem, as I see it, is that the 
> >>block<< markup is based on html style rules, ie presentation,
> whereas both the print dtd and LaTeX are structure-oriented.

The >>block<< markup is just <div> in disguise -- it simply
gets translated into <div ...> ... </div> on output.  In fact,
the rule for ">>block<<" just changes it to be "(:div:) %block div%".
Depending on the definition of %block%, this then gets turned into
something like <div class='block'> on output.

> ...
> Where I'm stuck is how to map wiki structures like >>sidenote<<
> and %sidenote% into something other than xhtml in a way that
> is meaningful. Most wiki markup is intrinsically simple to
> deal with, because it describes what the object is and we just 
> have to map it to a corresponding print element. HTML css
> is a tougher nut.

Well, it depends on what one thinks of as "most wiki markup".
Certainly %red%, %blue%, %bgcolor=white% are examples of
wiki markup, but they're different in that the same markup
sequence (%...%) is used for a number of different
effects.  It's a way for us to not have to invent a unique
(and sometimes cryptic) character sequence for every
possible thing we might want to encode.  (HTML is the same 
way, except it uses <...> as its delimiters.)

> A simple example: {=a small note=} is easy to translate into
> a marginal note for printing. But how do we decide what to do 
> with %stickynote%a small note% and having decided, how do we
> then apply the correct markup rule?

If we want to create a special markup for every possible
structure and meaning we can identify, then this remains easy.
But the problem is coming up with the special markups in
a way that people can use and remember them.

> <whatWouldHelp>
> 1. detailed comments in scripts/wikistyles.php that explain
>    how the code actually works -- please!

That may take a little while.  I can give an overview of
what happens -- in each markup line it looks for %...%
sequences, each of which just becomes an array of name=value
pairs.  So, %color=white bgcolor=blue% becomes the
equivalent of

    array('color'=>'white', 'bgcolor'=>'blue')

The special name "define" in a wikistyle places the current
array into a table of named styles, thus 
%define=reverse color=white bgcolor=black% does the
equivalent of

    $WikiStyle['reverse'] = array('color'=>'white', 'bgcolor'=>'black');

Using a name without a value in a wikistyle merges the
named entry in $WikiStyle into the current wikistyle, thus
%reverse font-size=small color=red% sets the current style to

    array('color'=> 'red', 'bgcolor'=>'black', 'font-size'=>'small')
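
To make the three steps above concrete, here is a rough sketch in Python
(not PmWiki's PHP, and not the actual code in scripts/wikistyles.php).
The names wiki_style and parse_wikistyle are hypothetical, and the sketch
assumes tokens are processed left to right, so a later name=value pair
overrides a value merged in from a named style:

```python
# Hypothetical stand-in for the table of named styles ($WikiStyle).
wiki_style = {}

def parse_wikistyle(text, current=None):
    """Turn the inside of a %...% sequence into a dict of name=value pairs."""
    style = dict(current or {})
    define_as = None
    for token in text.split():
        name, sep, value = token.partition("=")
        if name == "define" and sep:
            define_as = value                # remember the name to store under
        elif sep:
            style[name] = value              # plain name=value pair
        elif name in wiki_style:
            style.update(wiki_style[name])   # bare name: merge the named style
    if define_as is not None:
        wiki_style[define_as] = dict(style)  # store a copy in the table
    return style

parse_wikistyle("define=reverse color=white bgcolor=black")
current = parse_wikistyle("reverse font-size=small color=red")
# current now holds color=red, bgcolor=black, font-size=small
```

Bare names that are not in the table are silently ignored here; what the
real code does with them is not something this sketch tries to answer.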

That's all there is to the wikistyle definitions.  The
remainder of the magic takes a wikistyle array and figures
out how to apply the name=value pairs to the elements that appear
in the current markup line.  Most things end up in a <span> around
the text:

    <span style='color:red; background-color:black; font-size:small'>...</span>

Some attributes such as 'target=', 'value=', 'hspace=', and 'vspace='
only make sense with certain HTML elements such as <a>, <li>, and
<img>, so they're applied directly.

The special "apply=" attribute of a wikistyle allows the styling to
be applied to a specific element of the output, such as <div>, <p>,
<ul>, <li>, etc.
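
As an illustration only, the "apply the array to an element" step might
look something like the Python sketch below. The render function and the
CSS_NAMES table are assumptions made for the example; the only mapping
taken from this message is bgcolor to background-color. Element-specific
attributes such as target= are simply skipped, and no HTML escaping is
done:

```python
# Assumed mapping from wikistyle names to CSS property names.
CSS_NAMES = {"bgcolor": "background-color"}
# Attributes that belong on specific elements rather than in style='...'.
DIRECT_ATTRS = {"target", "value", "hspace", "vspace"}

def render(style, text):
    """Wrap text in the element named by apply= (a <span> by default)."""
    tag = style.get("apply", "span")
    css = "; ".join(
        f"{CSS_NAMES.get(name, name)}:{value}"
        for name, value in style.items()
        if name not in DIRECT_ATTRS and name != "apply"
    )
    return f"<{tag} style='{css}'>{text}</{tag}>"

html = render({"color": "red", "bgcolor": "black", "font-size": "small"}, "...")
```

With the style array from the earlier example this yields the same
<span style='...'> string shown above.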

> 2. advice on how to translate a given named style into the 
>    correct print-oriented inline tag and attributes
> 3. advice on how to translate a named block style into
>    the correct print-oriented block tag and attributes
> ...

One possible first step would be to drastically limit wikistyle
use in such applications.  If one changed the $WikiStylePattern
to only recognize things of the form %stylename% (no spaces,
no other parameters), then authors would be limited to the
wikistyles predefined by the administrator or markup rules,
and it'd be fairly easy to translate them into the appropriate
print-oriented semantics.  
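
A minimal sketch of that restriction, in Python rather than PHP: the
regular expression below is a made-up stand-in, not the real value of
$WikiStylePattern, and the ALLOWED set stands for whatever names the
administrator has predefined.

```python
import re

# Hypothetical pattern: a bare style name between percent signs, nothing else.
BARE_STYLE = re.compile(r"%([A-Za-z][-\w]*)%")

# Names an administrator might predefine (illustrative only).
ALLOWED = {"postit", "sidenote"}

def bare_styles(line):
    """Return the predefined style names used on a markup line."""
    return [name for name in BARE_STYLE.findall(line) if name in ALLOWED]

names = bare_styles("%sidenote%a small note")
```

Anything with spaces or name=value parameters, such as
%color=red bgcolor=blue%, does not match the pattern and is left alone.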

Another possibility is to replace the ApplyStyles function
with something that still computes the wikistyle attribute
arrays as described above, but then translates those values
into something more readily understood by the printing
mechanism.

Combined with the above, one could decide to allow only
a subset of the existing WikiStyle attributes; i.e.,
allow things like color, bgcolor, and float, but don't
try to accept all of the attributes that PmWiki defines
by default (which are themselves a fairly small subset of
everything that CSS defines).
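
To give a feel for what such a replacement might do, here is a Python
sketch that keeps only a whitelisted subset of attributes and turns them
into LaTeX. The to_latex function is hypothetical; \textcolor and
\colorbox come from the LaTeX color/xcolor packages, and the sketch
assumes the colour names used in the wiki are ones LaTeX also knows.
float is accepted by the whitelist but left to a block-level rule that
is not shown:

```python
ALLOWED_ATTRS = {"color", "bgcolor", "float"}

def to_latex(style, text):
    """Translate a whitelisted subset of a wikistyle dict into LaTeX."""
    kept = {k: v for k, v in style.items() if k in ALLOWED_ATTRS}
    out = text
    if "color" in kept:
        out = r"\textcolor{%s}{%s}" % (kept["color"], out)
    if "bgcolor" in kept:
        out = r"\colorbox{%s}{%s}" % (kept["bgcolor"], out)
    return out

latex = to_latex({"color": "red", "bgcolor": "yellow", "font-size": "small"}, "note")
```

Here font-size is dropped because it is not in the whitelist.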

> I don't understand how to handle arbitrary user-defined 
> styles, unless we have a general way to translate css 
> into something that LaTeX always understands.

Arbitrary user-defined styles may indeed be tough.  FWIW,
currently user defined styles are limited to the following CSS 
attributes (in $WikiStyleCSS):
    color, background-color, text-align, text-decoration,
    font-size, font-family, font-weight, font-style,
    float, display, margin(-left|-right|-top|-bottom),
    padding(-left|-right|-top|-bottom), list-style,
    width, height

So, it's at least a limited subset.

If I were working on something like this, I think I'd
probably start by trying to move as much stuff into
class='somestyle' attributes and ignoring anything
that appears in the style='...' attribute of HTML
output tags.  Printing would then focus on the standard
set of class='...' attributes that are defined by the
administrator and look for delimiting things based on
<span class='xyz'>...</span>  and <div class='xyz'>...</div>,
where the xyz's are already predefined by the administrator
and the formatter.
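
On the printing side, that class-based approach could be sketched roughly
as follows, using Python's standard html.parser module. The
ClassCollector class is hypothetical; it ignores style='...' entirely and
only handles a single class name per element:

```python
from html.parser import HTMLParser

class ClassCollector(HTMLParser):
    """Collect text inside <span class=...> and <div class=...> elements."""

    def __init__(self, known):
        super().__init__()
        self.known = known   # class names predefined by the administrator
        self.stack = []      # one entry per open span/div: class name or None
        self.found = []      # (class name, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("span", "div"):
            cls = dict(attrs).get("class")
            self.stack.append(cls if cls in self.known else None)

    def handle_endtag(self, tag):
        if tag in ("span", "div") and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost enclosing known class, if any.
        for cls in reversed(self.stack):
            if cls is not None:
                self.found.append((cls, data))
                break

collector = ClassCollector({"postit"})
collector.feed("<div class='postit'>buy <span style='color:red'>milk</span></div>")
```

Text inside the inner <span style='...'> is still attributed to the
enclosing postit block, since the inline style is ignored.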

It's not a great answer, but it's the best I have at the
moment.

Oh wait, no it's not -- I just thought of another.  :-)
Instead of trying to parse the HTML, it certainly wouldn't
be hard to associate custom names with specific wikistyles
that are generated into the output.  Thus if we defined
%postit% as

   %define=postit rframe bgcolor=yellow latex=postit%

then the wikistyles code could generate

    <!--latex=postit-->
    ...
    <!--/latex=postit-->

around the text to be styled.  Or, the value 'postit'
for 'latex=' could be a lookup key for an array that has
the custom latex start-and-end sequences to be generated
for anything being styled by the postit wikistyle -- i.e.,
something like

    \postit{ ... }

This wouldn't entirely help with combined wikistyles like

   >>postit bgcolor=blue<<
   ...
   >><<

but at least the markup interpreter would correctly
render the enclosed thing as a postit note block as
opposed to some other block.
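
Both variants of the latex= idea can be sketched in a few lines of Python.
Everything here is hypothetical: the LATEX_WRAPPERS table, the wrap
function, and the \postit command itself, which would have to be defined
somewhere in the LaTeX setup.

```python
# Hypothetical lookup table: value of latex= -> (start, end) sequences.
LATEX_WRAPPERS = {"postit": (r"\postit{", "}")}

def wrap(style, text, target="html"):
    """Mark up text according to the latex= entry of a wikistyle dict."""
    key = style.get("latex")
    if key is None:
        return text                      # style carries no print hint
    if target == "latex" and key in LATEX_WRAPPERS:
        start, end = LATEX_WRAPPERS[key]
        return start + text + end        # custom start-and-end sequences
    # Otherwise emit HTML comments that a later print pass can look for.
    return f"<!--latex={key}-->{text}<!--/latex={key}-->"

style = {"bgcolor": "yellow", "latex": "postit"}
as_html = wrap(style, "buy milk")
as_latex = wrap(style, "buy milk", target="latex")
```

Extra parameters such as bgcolor=blue are carried along in the dict but
play no part in the wrapping, which matches the limitation noted above.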
  
Pm




