[pmwiki-users] TextExtract (Search recipe) update

ABClf languefrancaise at gmail.com
Wed Sep 16 05:54:17 CDT 2009


Hi Hans,
I am using today's latest TextExtract version, and there is something strange
in the templated header output.
On page 1 (http://www.languefrancaise.net/docs/Argot/Evaluations) I use:
{(extract 'eval_' Argot.*,-RecentChanges markup=cut
header="%rfloat%{$$pagecnt} pages sur {$$listcnt} - {$$time} %%'''Valeur
recherchée :''' {$$pattern}" unit=line prefix=link snip=%.*%)}

On page 2 (http://www.languefrancaise.net/docs/Abclf/Accueil) I use:
>>bgcolor=#ffeeee<< *Chercher dans la doc locale de PmWiki (:extract
group=PmWiki* page=* name=-RecentChanges header="%rfloat%{$$pagecnt} pages
sur {$$listcnt} - {$$time} %%'''Valeur recherchée :''' {$$pattern}" regex=1
unit=line markup=cut linenum=1 pagenum=blue timer=1:) (:extractresult:) >><<

My question is about this snippet: '''Valeur recherchée :''' {$$pattern}
- on page 1 the output /Valeur recherchée :/ is in bold, as expected;
- on page 2 the markup is output literally as HTML tags ("include" was the
  search pattern): <strong>Valeur recherchée :</strong> include
(in case the email program can't reproduce HTML, here is the result with an
extra "_" inserted: <_strong>Valeur recherchée :<_/strong> include)

Maybe a cosmetic bug?

2009/9/16 ABClf <languefrancaise at gmail.com>

> Thank you, Hans, for this updated version of your recipe (and for your last
> advice about markup=on). The new templated header is fine.
>
> Gilles.
>
> 2009/9/15 Hans <design5 at softflow.co.uk>
>
> Friday, September 11, 2009, 1:09:42 PM, Hans wrote:
>>
>> > For TextExtract I cannot just use PmWiki's search engine,
>> > because we need to extract text. But thanks to your suggestion I was
>> > inspired to look at the handling of search terms again, and will
>> > incorporate the way PmWiki's search handles search terms, so we can
>> > have input like
>> >   'abc xyz' => output with 'abc' AND 'xyz' in the page;
>> >   '"abc def" xyz' => output with 'abc def' AND 'xyz' in the page;
>> >   'abc -xyz' => output with 'abc' but NOT 'xyz' in the page;
>> >   'abc|xyz' => output with 'abc' OR 'xyz' in the page;
>>
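To make the four query forms listed above concrete, here is a minimal Python sketch of how such terms could be matched against a page's text. It is only an illustration, not TextExtract's or PmWiki's actual code: the names `TOKEN`, `parse_query` and `page_matches` are hypothetical, and case-insensitive substring matching is an assumption.

```python
import re

# Optional leading "-", then either a "quoted phrase" or a run of non-space characters.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Return (include, exclude).

    include is a list of OR-groups (every group must match, any alternative
    inside a group is enough); exclude is a flat list of forbidden terms.
    """
    include, exclude = [], []
    for minus, phrase, word in TOKEN.findall(query):
        if phrase:
            alternatives = [phrase.lower()]
        else:
            # 'abc|xyz' means abc OR xyz
            alternatives = [alt.lower() for alt in word.split("|") if alt]
        if not alternatives:
            continue
        if minus:
            exclude.extend(alternatives)
        else:
            include.append(alternatives)
    return include, exclude


def page_matches(text, query):
    """True if the text satisfies every included group and no excluded term."""
    include, exclude = parse_query(query)
    text = text.lower()
    if any(term in text for term in exclude):
        return False
    return all(any(alt in text for alt in group) for group in include)
```

For example, `page_matches(text, 'abc -xyz')` would accept a page containing "abc" only if "xyz" does not appear anywhere in it.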
>> Now available in the latest release.
>> http://www.pmwiki.org/wiki/Cookbook/TextExtract
>>
>> I also added some template variables for use in parameters
>> header= , footer= , phead=
>> for instance a header with a custom title and the search time:
>>   header="%rfloat%{$$time}%%'''Listing'''"
>>
>> I split regular expression search from standard search, to allow
>> easier term input, and added a checkbox for regular expression search
>> to the search form.
>> I added a checkbox for 'Match whole words' for whole word searches.
>>
>> A note on efficiency:
>> TextExtract with its in-built pagelist function runs faster than using
>> PmWiki's pagelist, or MakePageList() function, mainly because
>> PmWiki's pagelist process opens every page to check if the user is
>> authorised to see the page, because it does not want to output any
>> non-authorised pages, for instance read-protected pages. This file
>> opening can be quite time consuming.
>> On the other hand, TextExtract constructs a pagelist including even
>> read-protected pages; authorisations are not checked at this stage in
>> the process. Only later, when each page on the source list is opened,
>> will authorisation be checked, before text lines are extracted and
>> processed. So far fewer pages need to be opened, which makes for
>> a faster process. That is the main reason I did not use MakePageList()
>> as a source pagelist generator.
>>
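The efficiency note above can be pictured with a toy model that counts page opens. This is a sketch under stated assumptions, not how PmWiki or TextExtract is implemented: the `Wiki` class, `extract_eager` and `extract_lazy` are hypothetical names, and the "eager" variant simply models a pagelist that opens every page for an authorisation check before any name filtering.

```python
from fnmatch import fnmatchcase


class Wiki:
    """Toy in-memory page store that counts how many pages get opened."""

    def __init__(self, pages, protected=()):
        self.pages = pages                 # page name -> page text
        self.protected = set(protected)    # names the current user may not read
        self.opened = 0

    def open(self, name):
        self.opened += 1
        return {"text": self.pages[name], "readable": name not in self.protected}


def matching_lines(name, page, term):
    return [(name, line) for line in page["text"].splitlines() if term in line]


def extract_eager(wiki, pattern, term):
    """Authorisation is checked while building the list: every page is opened."""
    readable = {}
    for name in wiki.pages:
        page = wiki.open(name)
        if page["readable"]:
            readable[name] = page
    results = []
    for name, page in readable.items():
        if fnmatchcase(name, pattern):
            results += matching_lines(name, page, term)
    return results


def extract_lazy(wiki, pattern, term):
    """Filter by name first; open (and authorise) only the pages that remain."""
    results = []
    for name in wiki.pages:
        if not fnmatchcase(name, pattern):
            continue                       # rejected by name, nothing opened
        page = wiki.open(name)
        if not page["readable"]:
            continue                       # authorisation checked only now
        results += matching_lines(name, page, term)
    return results
```

Both functions return the same lines; the difference shows up only in `wiki.opened`, which grows with the whole wiki in the eager case and with the name-filtered list in the lazy case.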
>> Still, it remains possible to use the PmWiki searchbox with a
>> fmt=#extract option, which will use PmWiki's pagelist functions
>> and TextExtract formatting functions. This is useful if you need to pass
>> pagelist parameters TextExtract does not understand.
>>
>>  ~Hans
>>
>>
>> _______________________________________________
>> pmwiki-users mailing list
>> pmwiki-users at pmichaud.com
>> http://www.pmichaud.com/mailman/listinfo/pmwiki-users
>>
>
>
>
> --
> ---------------------------------------
> | A | de la langue française
> | B | http://www.languefrancaise.net/
> | C | languefrancaise at gmail.com
> ---------------------------------------
>
