[pmwiki-users] Wikifarm question

Patrick R. Michaud pmichaud at pobox.com
Fri Sep 8 11:49:17 CDT 2006


On Fri, Sep 08, 2006 at 05:06:42PM +1200, John Rankin wrote:
> When we refer to a page in a different field, we do one of 
> 3 things:
> - reject the link if the field is undefined (this works)
> - if the page exists, display a browse link
> - if the page doesn't exist, display an edit link
> 
> We do this by temporarily resetting $WikiDir to point to 
> the appropriate field's wiki.d. 

As a quick aside: Something about this approach is causing 
alarm bells to ring in my head, but I can't put my finger 
on exactly why.  For now let me just note that the idea of
swapping $WikiDir in the middle of processing is *way* outside
of my mental design spec, so I can't be certain that something
won't break later on.

> In detail (quoting Donald Gordon):
> 
> The problem we have is that PageExists() caches its results; once it's
> decided on a page's existence, it will always return the same value for
> that page, even if $WikiDir or $WikiLibDirs has changed in the interim.
> [...]
> I see two possible ways to change this: either make $pe a global
> variable (presumably with a nicer name), or allow a global variable to
> specify a prefix (effectively, a "page namespace") to the index of $pe.

For now I've gone with making $pe a global variable, named 
$PageExistsCache.  But I've also put a note in the code that
says it might go away at some point, so hopefully people won't
rely on it too heavily.

$PageExistsCache is available in 2.1.25 (just released).

Another possibility I considered was to make the page exists cache
specific to each PageStore object, instead of global to the process.  
Then swapping out $WikiDir wouldn't cause an issue, since each PageStore
would keep track of its own cache.  But I decided the global approach
was simpler (and sufficient) for now.
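
For comparison, a per-store cache might look roughly like the
following.  Again this is only a Python sketch; the PageStore class
here is a deliberately simplified, hypothetical one and does not
mirror the real PmWiki class:

```python
import os
import tempfile


class PageStore:
    """Hypothetical, simplified store: one directory, one private cache."""

    def __init__(self, directory):
        self.directory = directory
        self._exists_cache = {}

    def exists(self, pagename):
        # Each store remembers answers only for its own directory.
        if pagename not in self._exists_cache:
            path = os.path.join(self.directory, pagename)
            self._exists_cache[pagename] = os.path.exists(path)
        return self._exists_cache[pagename]


def demo():
    """Return the answers two stores give for the same page name."""
    field_a = tempfile.mkdtemp()
    field_b = tempfile.mkdtemp()
    # The page exists only in field_b.
    open(os.path.join(field_b, "Main.HomePage"), "w").close()
    store_a, store_b = PageStore(field_a), PageStore(field_b)
    return store_a.exists("Main.HomePage"), store_b.exists("Main.HomePage")
```

Because each store owns its cache, nothing has to be saved or reset
when the code switches from one field to another.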

The rest of this message just describes why the page exists cache
is even there, and may be safely skipped.

Background
----------

One may reasonably ask why the $PageExistsCache even exists;
the answer is that when pmwiki.org was running slowly earlier in the
year, I ran some benchmarks to find out where the slowdowns were
occurring, and much of the time they appeared to be in PageExists().
On the surface this seems very odd, since all that PageExists()
does is to check the filesystem for the existence of a given file
in wiki.d/ .  Note that PageExists() doesn't actually open or read
the file, nor is it doing an entry-by-entry scan of the wiki.d/
directory to find the matching file.  It is simply asking
"does 'wiki.d/Group.PageName' exist?"

What I later concluded is that my server (a virtual private server)
often goes through periods of high filesystem latency.  I'm guessing
this is because the filesystem is actually physically on another
storage device accessed via a network and shared among many virtual
private servers.  For accessing any single file, there may be a short
delay (which isn't noticeable overall), but when PmWiki is checking
on the existence of lots of files the request queues may be getting
a little congested.  So, I added the "page exists cache" so that
PmWiki would at least perform such a check only once. 
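
The effect of such a cache can be sketched as below (in Python, with
hypothetical names; the counter exists only to show how many times
the filesystem is actually consulted).  Keying the cache on the
directory as well as the page name is a choice made for this sketch,
not something taken from PmWiki:

```python
import os
import tempfile

filesystem_checks = 0   # how often we really asked the filesystem
_exists_cache = {}      # (directory, pagename) -> bool


def page_exists(wiki_dir, pagename):
    """Ask the filesystem once per (directory, page); then reuse the answer."""
    global filesystem_checks
    key = (wiki_dir, pagename)
    if key not in _exists_cache:
        filesystem_checks += 1
        _exists_cache[key] = os.path.exists(os.path.join(wiki_dir, pagename))
    return _exists_cache[key]


def demo():
    """Check one page 100 times; report the answers and the real lookups."""
    wiki_dir = tempfile.mkdtemp()
    open(os.path.join(wiki_dir, "Main.HomePage"), "w").close()
    before = filesystem_checks
    results = [page_exists(wiki_dir, "Main.HomePage") for _ in range(100)]
    return all(results), filesystem_checks - before
```

A page that is linked many times during a single request then costs
one filesystem lookup instead of one per link.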

At some point I may determine that it is in fact something else that
is causing the bottlenecks, or that there's a better way of testing
page existence than through the filesystem/cache array, which is why
there's now a note in the code that $PageExistsCache may be removed 
in some future version (and why it wasn't global in the first place).

Hope this helps,

Pm




