[pmwiki-users] Highlighting Pagelist Results [was:Recipe Update: Fox]

The Editor editor at fast.st
Tue Jan 29 08:29:29 CST 2008


On Jan 29, 2008 6:27 AM, Hans <design5 at softflow.co.uk> wrote:
> Monday, January 28, 2008, 8:03:54 PM, The Editor wrote:
>
> > The pagelist retrieves the list of pages, then a simple expression
> > markup in the pagelist template retrieves the first line in the page
> > with the searched term (passed to it from the pagelist directive),
> > highlighting the term. Of course you could format the template any
> > other way.
>
> Thanks for your hint!

No problem. I've been following your progress on this, and it has
sparked a couple of ideas for me as well. Thanks for all your good work!

> I just tested this idea by just adding a condition to the main
> function of the extract.php script, which checks if $FmtV['$Needle']
> is set, and sets the search pattern to this.
>
> That way I could use a standard (:searchbox :) with a custom
> fmt=#extract, and define this #extract template with the extract
> markup expression.
>
> This works without any modifications to pagelist.php.

I hope Pm takes note of the idea, as he has contextual search results
planned down the road for PmWiki, and it could be implemented quite easily at
present with a strategy like this. Or at least as a very simple plugin
for those who only need to get contextual search results.

> But it is a lot slower than using the custom (:extract:) searchbox or
> using a fox form as a searchbox. And the results are not as
> comprehensive!
>
> I think the slow speed (a third slower) results from pagelist first
> opening all pages to look for the query term, and then the expression
> in the fmt template opens every page pagelist provides again to find
> the term, process directives, do the highlighting etc.
>
> I do not know why I get more comprehensive results with
> (:extract:). Perhaps pagelist is ignoring matches to terms in markup.

Fortunately, BoltWire pre-indexes all the pages, so its search
results are generally quite fast. I thought PmWiki also indexed
things--but maybe not. If it is indexed though, I suspect PmWiki would
be faster on a large site, as the wiki would only need to open the
pages with matches. In fact, on a large site, using the extract method
would likely be untenable. Imagine having to open 1000 pages to find
10 matches!
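
To illustrate the difference, here is a minimal sketch of a word index
in Python (not PHP, and not how BoltWire or PmWiki actually store their
indexes). The page names and the build_index/candidates helpers are
made up for the example. The point is only that a lookup names the few
pages worth opening, instead of scanning all of them:

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercased word to the set of page names containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index

def candidates(index, term):
    """Pages that need to be opened for a single-word search term."""
    return index.get(term.lower(), set())

# Hypothetical pages, for illustration only.
pages = {
    "Main.HomePage": "Welcome to the wiki",
    "Main.Fruit": "Apples and oranges",
    "Main.Pies": "Apple pie recipes",
}
index = build_index(pages)
```

With an index like this, only the pages returned by candidates() would
need to be opened to extract and highlight the matching lines.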

> A big difference to your context-sensitive searchbox on your site
> is that TextExtract finds all matches, not just the first one.
> Another is the use of regex, which matches a string even if it is
> inside another word. One needs to specify wordboundaries specifically
> (with regex \b ), which I think is an advantage, as terms are often
> just part of a word.

Getting it to return one match or all is pretty trivial, though I
agree returning all of them is probably a better idea. As for regex,
that's what I use as well, and with \b. I'd like to get it to show
something like 50 chars with the key word as close to the center as
possible depending on where the line breaks are. I just haven't taken
the time to figure out exactly how to code it...
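
Something along these lines might do it. This is only a rough sketch in
Python rather than PHP, and the snippet() name and the width default
are my own assumptions. It finds the first whole-word match using \b,
then clamps a window of about 50 chars so the match sits as near the
center as the ends of the line allow:

```python
import re

def snippet(line, term, width=50):
    """Return about `width` chars of `line` with the first whole-word
    match of `term` as near the center as the line allows, or None."""
    m = re.search(r"\b" + re.escape(term) + r"\b", line, re.IGNORECASE)
    if m is None:
        return None
    if len(line) <= width:
        return line  # short line: nothing to trim
    center = (m.start() + m.end()) // 2
    start = max(0, center - width // 2)       # don't run off the left edge
    start = min(start, len(line) - width)     # don't run off the right edge
    return line[start:start + width]
```

It cuts at character positions, so words at the edges of the window may
be chopped; trimming back to the nearest space would be a refinement.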

Here are a couple of additional problems with my solution that I have
not yet solved, and that perhaps you have considered:

1) When doing a case-insensitive search, I get the matches in the
search function, but not in the display.
2) When doing a boolean search like apples && oranges, I also get the
matches but not the display.
3) When searching for a phrase I get false matches, i.e. pages with the
words but not the phrase.
4) What to do with markup? Right now, my returns show the markup.
Processing it causes problems. Perhaps it could get stripped out to
some extent.
5) How do I block results where the term happens to be in the markup
(like a link, for example)?

These may or may not be relevant to PmWiki, as I don't remember the
intricacies of how the pagelist/template function works in detail. But
I thought I would mention them for your consideration.
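
For points 1 to 3, one possible approach is to build the patterns once
and use the same ones for both the search and the display. Here is a
rough sketch in Python rather than PHP. The function names are
hypothetical, the "&&" splitting is an assumption about the query
syntax, and the ''' markers are just a placeholder for whatever
highlight markup is wanted. It does nothing about points 4 and 5:

```python
import re

def term_patterns(query):
    """Split 'a && b' into one case-insensitive whole-word regex per
    term; a multi-word term is kept together as a phrase."""
    terms = [t.strip() for t in query.split("&&") if t.strip()]
    return [re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE)
            for t in terms]

def matches(text, patterns):
    """True only if every term (or phrase) occurs in the text."""
    return all(p.search(text) for p in patterns)

def highlight(line, patterns, before="'''", after="'''"):
    """Wrap each match in markers, preserving the original casing."""
    for p in patterns:
        line = p.sub(lambda m: before + m.group(0) + after, line)
    return line
```

Since the display uses the same compiled patterns as the search, a
case-insensitive or boolean match that selects a page should also show
up highlighted, and a phrase only matches when the words are adjacent.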

> All this has little to do with the Fox update, apart from that one can
> build customised search forms now.

Yes, but it would also be nice to release a little script simply for
those who want contextual search results. It seems separating the two
would allow users to easily do this without having to install Fox.
Notice I posted this with a different subject line.

Cheers,
Dan


