[pmwiki-users] WebFeeds does not seem to work properly

Patrick R. Michaud pmichaud at pobox.com
Mon Sep 27 17:55:03 CDT 2010


On Mon, Sep 27, 2010 at 07:28:42PM +0200, kirpi at kirpi.it wrote:
> Daniel, thank you for your detailed explanation.
> The more I read it, the more I feel I probably miss the real meaning
> of rss, and the less I understand what you say.
> :-|
> 
> I often get *full* pages (title, text, images, whatever) delivered
> through feedburner.
> How can I do the same thing? How can I let feedburner grab the whole
> page? 

Originally, RSS was intended to be a syndication mechanism to publish
frequently updated works -- i.e., a list of what has changed on the
site, when it changed, and other "metadata" about the changes.

One of the fields in an RSS entry is "description", where the feed can
provide a summary of the new entry.  The intent was that when you viewed
an entry from the feed, you'd see a summary of the change or article
and then decide whether you wanted to follow the link to the full
article.

In recent years, however, many sites have chosen to place the entire
contents of the referenced page into the RSS "description" field, instead
of just a summary.  For many sites with static content this simplifies 
things on the generating end:  one doesn't have to generate a separate
summary -- just use the article text directly.  And on the reading end
it means that instead of receiving a summary of the change, the user sees
the entire article.

But this approach doesn't entirely work out for a system like PmWiki.
First, traditionally there wasn't an official standard for encoding HTML
markup within the description tag, so there was a lot of inconsistency
among RSS feed renderers.  This has become much less of a problem in
recent years, and we could probably update PmWiki to take advantage of
recent standards here.

The bigger difficulty is that PmWiki tends to re-render a page's HTML
from its wiki source each time the page is used, and that can get
*very* expensive on the server -- especially if a feed contains more than
a few pages that need rendering.  So, PmWiki's current default is to
simply use whatever description is provided by the (:description ...:)
markup, and to provide no description at all if that markup is absent.
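
For reference, the directive goes in the page's own wiki text.  A
minimal example, with placeholder summary text that is not taken from
any real page:

  (:description A short summary of what changed on this page.:)

With the default settings described above, that text is what ends up in
the item's description field; a page without the directive gets no
description.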

If you'd like to change PmWiki so that it provides the source markup
in the webfeed instead of a description summary, something like the 
following *might* work:

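  # Expose the page's raw wiki markup, HTML-escaped, as a page variable.
  $FmtPV['$SourceText'] = 'htmlspecialchars(RetrieveAuthSection($pn,$pn))';
  # Use that variable as the description of each RSS item.
  $FeedFmt['rss']['item']['description'] = '{$SourceText}';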

If you really want it to display the text as HTML -- we might be able
to do that... but it will make the RSS generation much more expensive.
We might also be able to get it to use the PageHTMLCache... but this
will still be a little on the expensive side.
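
As a rough, untested sketch of the HTML variant: this assumes that the
feed code accepts a function name as a $FeedFmt entry and calls it with
the page name, the page array, and the tag name, so please check that
against your PmWiki version before relying on it.  FeedHtmlDescription
is a made-up name; RetrieveAuthPage and MarkupToHTML are existing PmWiki
functions.  Something along these lines could go in local/config.php:

  # Hypothetical helper: render the page's markup to HTML for the feed.
  function FeedHtmlDescription($pagename, &$page, $tag) {
    # Read the page only if the current visitor has read permission.
    $p = RetrieveAuthPage($pagename, 'read', false);
    if (!$p) return '';
    # This is the expensive step: a full markup-to-HTML render per item.
    $html = MarkupToHTML($pagename, @$p['text']);
    # Entity-encode the HTML so it is valid inside the XML element.
    return "<$tag>" . htmlspecialchars($html) . "</$tag>\n";
  }
  $FeedFmt['rss']['item']['description'] = 'FeedHtmlDescription';

Every item in the feed triggers its own render here, which is exactly
the cost mentioned above.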

Still another possibility would be to have the webfeeds themselves
be cached, so that we incur the expense only the first time a given
feed is generated instead of on each request.  This is also not a
simple modification -- it requires a fair bit of updating to the
web feed code to implement.

Pm


