[pmwiki-users] Speedy search?

Patrick R. Michaud pmichaud at pobox.com
Fri Feb 17 12:36:07 CST 2006


On Fri, Feb 17, 2006 at 02:00:51PM +0000, Karl Loncarek wrote:
> I included titledictindex.php of the Cookbook/DictIndex recipe.
> 
> My call is:
> (:pagelist group=Techlib fmt=dictindex list=normal:)

OOOOPS!  I just now noticed that you were using the titledictindex.php
recipe instead of dictindex.php.  So, repeating my previous analysis...  
with titledictindex.php as it appears in the Cookbook, I get:

    00.00 00.00 MarkupToHTML begin
    00.06 00.02 TitleDictIndex begin
    00.06 00.02 MakePageList begin
    00.61 00.07 MakePageList scanning 1070 pages, readf=1
    01.65 00.40 MakePageList sort
    01.84 00.50 MakePageList end
    01.84 00.50 TitleDictIndex create names
    02.60 00.97 TitleDictIndex sort names
    02.62 00.99 TitleDictIndex format output
    02.89 01.20 TitleDictIndex end
    02.92 01.21 MarkupToHTML end
    02.93 01.23 MarkupToHTML begin
    02.97 01.25 MarkupToHTML end
    02.97 01.25 MarkupToHTML begin
    02.98 01.25 MarkupToHTML end
    03.24 01.26 now

Once again, it's the create names section that is eating up a lot
of time.  But here we've actually got a double-whammy... the
titledictindex.php recipe is re-building and re-sorting the pages
by title, which MakePageList has already done!
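As a rough illustration of why that second pass is wasted work,
here is a small Python sketch. It is not the recipe's PHP code:
the `pages` list, its `name`/`title` keys, and both helper
functions are hypothetical stand-ins, and it assumes the list
arrives already ordered by title, as described above.

```python
# Hypothetical stand-in for a page list that is already sorted by
# title when it comes back from the page-listing step.
pages = [
    {"name": "Techlib.Zeta", "title": "Alpha notes"},
    {"name": "Techlib.Beta", "title": "Beta guide"},
    {"name": "Techlib.Alpha", "title": "Gamma manual"},
]


def rebuild_and_resort(page_list):
    """Redundant path: build a title-keyed list, then sort it again."""
    keyed = [(p["title"], p) for p in page_list]  # the "create names" step
    keyed.sort(key=lambda pair: pair[0])          # the "sort names" step
    return [p for _, p in keyed]


def reuse_order(page_list):
    """Cheaper path: keep the order the list already has."""
    return list(page_list)


# Both paths produce the same ordering, so the extra pass buys nothing.
same_order = rebuild_and_resort(pages) == reuse_order(pages)
print(same_order)
```

With an input that is already in title order, the rebuild and
re-sort only repeats work that was paid for once.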

So, if we just eliminate that entirely from the recipe, and
adjust the formatting output to use $item['title'] instead of
$item['name'], we get:

    00.00 00.00 MarkupToHTML begin
    00.01 00.00 TitleDictIndex begin
    00.01 00.00 MakePageList begin
    00.06 00.03 MakePageList scanning 1070 pages, readf=1
    00.39 00.30 MakePageList sort
    00.48 00.38 MakePageList end
    00.49 00.38 TitleDictIndex format output
    00.76 00.63 TitleDictIndex end
    00.78 00.64 MarkupToHTML end
    00.79 00.64 MarkupToHTML begin
    00.80 00.65 MarkupToHTML end
    00.80 00.65 MarkupToHTML begin
    00.81 00.66 MarkupToHTML end
    01.05 00.67 now

Now TitleDictIndex requires only 0.63 seconds of CPU time,
compared to 1.20 seconds before.
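The per-stage cost can be read off the stopwatch output by
subtracting CPU timestamps. A minimal Python sketch follows; it
assumes the first column is wall-clock time and the second is CPU
time, and the `cpu_span` helper is hypothetical, not part of
PmWiki. The log excerpts are copied from the two runs above.

```python
def cpu_span(log, start_label, end_label):
    """Return CPU seconds elapsed between two labelled stopwatch lines."""
    cpu = {}
    for line in log.strip().splitlines():
        # Columns: wall-clock time, CPU time, then the free-form label.
        _wall, cpu_time, label = line.split(None, 2)
        cpu.setdefault(label, float(cpu_time))  # keep the first occurrence
    return round(cpu[end_label] - cpu[start_label], 2)


before = """
00.06 00.02 TitleDictIndex begin
01.84 00.50 TitleDictIndex create names
02.60 00.97 TitleDictIndex sort names
02.89 01.20 TitleDictIndex end
"""

after = """
00.01 00.00 TitleDictIndex begin
00.49 00.38 TitleDictIndex format output
00.76 00.63 TitleDictIndex end
"""

create_names = cpu_span(before, "TitleDictIndex create names",
                        "TitleDictIndex sort names")
total_before = cpu_span(before, "TitleDictIndex begin", "TitleDictIndex end")
total_after = cpu_span(after, "TitleDictIndex begin", "TitleDictIndex end")
print(create_names, total_before, total_after)
```

Measured from begin to end, the first run spends 1.18 CPU seconds
inside TitleDictIndex (the 1.20 figure is the cumulative timestamp
at its end), of which 0.47 seconds is the create names step; the
second run spends 0.63 seconds.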

Still looking at improving the format time.  But no matter how
we look at it, scanning and processing 1000 (or 10000) pages 
is a little on the expensive side.

Pm



