[pmwiki-users] Speed up PmWiki

Patrick R. Michaud pmichaud at pobox.com
Tue Aug 14 17:36:08 CDT 2007


On Wed, Aug 15, 2007 at 12:12:33AM +0200, Thomas Bley wrote:
> Hello,
> 
> $EnableHTMLCache is 1 and it is being used (I removed some NoCache() 
> calls for this), but it is a pmwiki-2.1.10, so maybe newer versions are 
> faster.
> The page tested has no (:include :) or other special things.


> Differences in the code:
> The pageCacheFile is written with:
> fwrite($fp, serialize(array($FmtV['$PageText'])));
> => so it can't be used by a mod_rewrite redirect (performance 
> improvement only comes with by-passing php)

...but in the code given by PITS.00966, the value being written
to the "cache file" is the result of MarkupToHTML(), yes?
That would seem to imply that none of the skin, sidebar, pageaction,
or other items are being saved in the cached .html file, and
displaying that .html file directly would lose all skin
and header information...?

I'm not saying it's not possible to save the entire page
(sidebars and headers included) -- I'm just saying the code
given in PITS.00966 doesn't seem to handle this.

> Editing one page invalidates the complete cache. This handles all 
> dependencies correctly.
> But if there are only a few dependencies among a large number of pages, 
> this method costs a lot of performance.

Caching rests on the assumption that updates are infrequent
relative to reads.  So, "costs a lot of performance" is purely relative.

> Other solutions for dependencies are:
> - define dependencies explicitly in (:static ... :)
> - build a dependency index similar to .linkindex

It's very difficult to keep track of dependencies correctly --
and I think it's unrealistic for authors to do this manually
and reliably.  Also, if I'm reading it correctly, the approach 
being described seems a bit backwards, in that if pages A, B, 
and C depend on page Y, I have to keep track of that fact 
inside of page Y.  And if I add a page D that depends on X, Y, and
Z, I have to remember to update the (:static ...:) list in each
of those pages.
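
A minimal sketch of the opposite direction, in Python rather than
PmWiki's PHP and with hypothetical names and data throughout: each
page records only what it depends on itself, and the "who depends on
Y" list is derived by inverting those records. This is not code from
PmWiki or from PITS.00966.

```python
from collections import defaultdict

def build_reverse_index(depends_on):
    """Invert {page: pages it depends on} into {page: pages that depend on it}."""
    dependents = defaultdict(set)
    for page, sources in depends_on.items():
        for source in sources:
            dependents[source].add(page)
    return dependents

# Hypothetical data: each page lists only its own dependencies.
depends_on = {
    "A": {"Y"},
    "B": {"Y"},
    "C": {"Y"},
    "D": {"X", "Y", "Z"},
}

dependents = build_reverse_index(depends_on)

# Cached pages to invalidate when Y is edited.
print(sorted(dependents["Y"]))
```

Under this arrangement, adding page D means writing one record for D;
pages X, Y, and Z are not touched.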

Part of the reason for using serialize() in the HTML cache
is to eventually be able to do smarter dependency checking --
i.e., to check the individual dependencies instead of treating
every update as invalidating the entire cache.
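
One way such a check could look, sketched in Python rather than PHP
and assuming (hypothetically) that the serialized cache entry stores
the modification time of each page it was built from. The entry
layout and function names are illustrative, not PmWiki's actual cache
format.

```python
import json

def make_entry(html, deps, mtimes):
    """Serialize the HTML together with the mtime of each dependency."""
    recorded = {name: mtimes[name] for name in deps}
    return json.dumps({"html": html, "deps": recorded})

def load_if_fresh(blob, mtimes):
    """Return the cached HTML, or None if any recorded dependency changed."""
    entry = json.loads(blob)
    for name, cached_mtime in entry["deps"].items():
        if mtimes.get(name) != cached_mtime:
            return None
    return entry["html"]

# Hypothetical page names and timestamps.
mtimes = {"Main.HomePage": 100, "Main.SideBar": 90, "Main.Other": 50}
blob = make_entry("<p>hi</p>", ["Main.HomePage", "Main.SideBar"], mtimes)

mtimes["Main.Other"] = 200               # unrelated page edited
unrelated_edit = load_if_fresh(blob, mtimes)

mtimes["Main.SideBar"] = 210             # a real dependency edited
dependency_edit = load_if_fresh(blob, mtimes)
```

Here an edit to an unrelated page leaves the entry usable, while an
edit to a recorded dependency makes load_if_fresh() return None.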

> I'm currently using the last method but for a larger group of users, 
> this may not be ideal. Maybe you have some more ideas ?

One very important point to note about the mod_rewrite approach
is that it bypasses PmWiki entirely, so no authorization checks
can be performed.  So, the code that saves the HTML version of the
page needs to do so only if the page has no read restrictions.
And any modification to GroupAttributes needs to invalidate
all of the cached pages in the group.
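
The two rules above can be sketched as follows, in Python rather than
PHP. Everything here is an assumption for illustration: the attribute
dictionaries, the "passwdread" key used to stand for a read
restriction, the Group.Name.html file layout, and the function names.
Site-wide restrictions are not modeled.

```python
import os
import tempfile

def is_public(page_attrs, group_attrs):
    """True only if neither the page nor its group has a read restriction."""
    return not page_attrs.get("passwdread") and not group_attrs.get("passwdread")

def save_static(cache_dir, pagename, html, page_attrs, group_attrs):
    """Write <pagename>.html only for unrestricted pages; report if written."""
    if not is_public(page_attrs, group_attrs):
        return False
    path = os.path.join(cache_dir, pagename + ".html")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(html)
    return True

def invalidate_group(cache_dir, group):
    """Remove every cached file of a group, e.g. after GroupAttributes changes."""
    removed = []
    for name in os.listdir(cache_dir):
        if name.startswith(group + ".") and name.endswith(".html"):
            os.remove(os.path.join(cache_dir, name))
            removed.append(name)
    return sorted(removed)

with tempfile.TemporaryDirectory() as cache_dir:
    save_static(cache_dir, "Main.HomePage", "<html></html>", {}, {})
    save_static(cache_dir, "Main.Secret", "<html></html>", {"passwdread": "x"}, {})
    print(os.listdir(cache_dir))             # only the public page is on disk
    print(invalidate_group(cache_dir, "Main"))
```

The restricted page is never written, so a rewrite rule has nothing
to serve for it and the request falls through to PmWiki as usual.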

Pm


