[pmwiki-users] Faster searches and categories

Martin Fick fick at fgm.com
Mon Sep 12 15:47:17 CDT 2005


On Mon, Sep 12, 2005 at 02:52:02PM -0500, Patrick R. Michaud wrote:
> On Mon, Sep 12, 2005 at 02:45:28PM -0400, Martin Fick wrote:
> > I mean optimistic because you are hoping that someone builds the
> > index before you need it.  The worst case scenario is that there
> > is no index and the first category pagelist request needs to 
> > search every page.
> 
> Yes.  But I figure that one-time costs aren't truly significant
> in the long run, and the site admin is generally going to be the
> person incurring the one-time cost.
> 
> > Turns out the grep is still slightly faster in most
> > situations. The situation where it is slower is actually 
> > when I just search the Category pages.  My find is not
> > terribly smart: it does not use the pattern passed in to
> > limit the pages searched (the filtering is handled by
> > pmwiki afterwards so it still works).  This means that it
> > is actually searching the entire site and it is still
> > within a few percent of the index method's time!
> 
> Oh, I totally agree that the grep will drastically speed things up.  
> I'm just not sure how to make use of it in a portable manner
> at the moment.  Even in the grepsearch.php code, there's a likelihood
> that the script will fail totally when a certain number of files
> are reached, because their names won't all fit in a single shell
> command line or in an environment variable.  


Yes, the shell command line was already full on my setup,
hence the use of the environment variable. :)  I assume that
could fill up too, but I'm not sure that limit is any more
restrictive than the Apache or PHP memory limits.  I have just
under 1500 files, and it doesn't seem to be a problem on
Linux.  I guess other legacy OSes might run into
problems.  I realize the recipe is a hack, but it is a
simple hack that works for me, and I figure that it could
help others today.  I think an alternative to the index
feature could be useful for some.  Without either feature,
categories are a definite no-go, so at least now we have 2
solutions to pick from instead of none! :)
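
One way around the command-line limit discussed above is to
split the file list into batches and run grep once per batch.
The following is only a rough sketch of that idea, written in
Python rather than the recipe's PHP; it is not how
grepsearch.php works.  The function names and the 100,000-byte
budget are assumptions made for illustration, and the real
limit varies by system.

```python
import subprocess

def batch_by_bytes(names, budget=100_000):
    """Yield lists of names whose combined length stays within budget bytes."""
    batch, used = [], 0
    for name in names:
        cost = len(name.encode()) + 1  # +1 for the separator between arguments
        if batch and used + cost > budget:
            yield batch
            batch, used = [], 0
        batch.append(name)  # an oversized single name still gets its own batch
        used += cost
    if batch:
        yield batch

def grep_files(pattern, names, budget=100_000):
    """Return names of files containing pattern, running grep once per batch."""
    matches = []
    for batch in batch_by_bytes(names, budget):
        # -l lists matching files only, -F treats the pattern as a fixed
        # string, and -e / -- protect patterns and names starting with "-".
        result = subprocess.run(
            ["grep", "-l", "-F", "-e", pattern, "--", *batch],
            capture_output=True, text=True,
        )
        # grep's exit status (1 = no match, 2 = error) is ignored here.
        matches.extend(result.stdout.splitlines())
    return matches
```

Passing the names as an argument list also avoids going through
a shell at all, so only the operating system's own argument
limit applies to each batch.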


With the indexes, what happens if users change wiki pages
manually on disk?  Could the results be wrong?  I may be
taking a leap in assuming that some users expect to be able
to just drop in files, which may overwrite older or newer
versions of those files.  I think this ability currently
makes PmWiki elegant and much simpler than alternative
wikis, like the ones that use a database to store
page data.  Might losing it be an unintended sacrifice?
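
To make the concern concrete: an index could, in principle,
notice dropped-in files if it recorded each page file's
modification time and compared it against the disk before
trusting its entries.  The sketch below is a guess at what such
a check might look like, in Python; I don't know whether the
PmWiki index does anything like this, and the `indexed_mtimes`
mapping is a hypothetical structure.

```python
import os

def stale_pages(page_dir, indexed_mtimes):
    """Return names of page files that are new or whose mtime differs
    from the value recorded when the index was built."""
    stale = []
    for entry in os.scandir(page_dir):
        if not entry.is_file():
            continue
        # "!=" rather than ">" so a file overwritten with an OLDER
        # version is also caught, not just a newer one.
        if indexed_mtimes.get(entry.name) != entry.stat().st_mtime:
            stale.append(entry.name)
    return sorted(stale)
```

A check like this still has to stat every page file, but that
should be far cheaper than reading every page.  Pages deleted
from disk would need a separate pass over the index's own keys.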


-Martin





