Hi Hans,<div><br></div><div>I am using today&#39;s latest TextExtract version, and there is something strange in the templated header output:</div><div>On page 1 (<a href="http://www.languefrancaise.net/docs/Argot/Evaluations">http://www.languefrancaise.net/docs/Argot/Evaluations</a>) I use:</div>

<div><span class="Apple-style-span" style="font-family: -webkit-monospace; white-space: pre-wrap; ">{(extract &#39;eval_&#39; Argot.*,-RecentChanges markup=cut header=&quot;%rfloat%{$$pagecnt} pages sur {$$listcnt} - {$$time} %%&#39;&#39;&#39;Valeur recherchée :&#39;&#39;&#39; {$$pattern}&quot; unit=line prefix=link snip=%.*%)}</span></div>

<div><br></div><div>On page 2 (<a href="http://www.languefrancaise.net/docs/Abclf/Accueil">http://www.languefrancaise.net/docs/Abclf/Accueil</a>) I use:</div>

<div><font class="Apple-style-span" face="-webkit-monospace"><span class="Apple-style-span" style="white-space: pre-wrap;">&gt;&gt;bgcolor=#ffeeee&lt;&lt;

*Chercher dans la doc locale de PmWiki

(:extract group=PmWiki* page=* name=-RecentChanges header=&quot;%rfloat%{$$pagecnt} pages sur {$$listcnt} - {$$time} %%&#39;&#39;&#39;Valeur recherchée :&#39;&#39;&#39; {$$pattern}&quot; regex=1 unit=line markup=cut linenum=1 pagenum=blue timer=1:)

(:extractresult:)

&gt;&gt;&lt;&lt;</span></font></div><div><br></div><div>My question is about this snippet: &#39;&#39;&#39;Valeur recherchée :&#39;&#39;&#39; {$$pattern}</div>

<div>- On page 1, the output &quot;Valeur recherchée :&quot; is in bold, as expected;</div>

<div>- On page 2, the tags appear literally in the output (&quot;include&quot; was the search pattern): &lt;strong&gt;Valeur recherchée :&lt;/strong&gt; include</div>

<div>(In case the email program does not reproduce the HTML, here is the same result with a dummy underscore added inside each tag: &lt;_strong&gt;Valeur recherchée :&lt;_/strong&gt; include)</div>

<div><br></div><div>Maybe a cosmetic bug?<br><br><div class="gmail_quote">2009/9/16 ABClf <span dir="ltr">&lt;<a href="mailto:languefrancaise@gmail.com">languefrancaise@gmail.com</a>&gt;</span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Thank you Hans for this updated version of your recipe (and for your last advice about markup=on).<div>The new templated header is fine.</div><div><br></div><div>Gilles.</div><div><font color="#222222"><span style="border-collapse:collapse"><br>


</span></font></div><div><div class="gmail_quote">2009/9/15 Hans <span dir="ltr">&lt;<a href="mailto:design5@softflow.co.uk" target="_blank">design5@softflow.co.uk</a>&gt;</span><div><div></div><div class="h5"><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>Friday, September 11, 2009, 1:09:42 PM, Hans wrote:<br>

<br>

&gt; For TextExtract I cannot just use PmWiki&#39;s search engine,<br>

&gt; because we need to extract text. But thanks to your suggestion I was<br>

&gt; inspired to look at the handling of search terms again, and will<br>

&gt; incorporate the way PmWiki&#39;s search handles search terms, so we can<br>

&gt; have input like<br>

&gt;   &#39;abc xyz&#39; =&gt; output with &#39;abc&#39; AND &#39;xyz&#39; in the page;<br>

&gt;   &#39;&quot;abc def&quot; xyz&#39; =&gt; output with &#39;abc def&#39; AND &#39;xyz&#39; in the page;<br>

&gt;   &#39;abc -xyz&#39; =&gt; output with &#39;abc&#39; but NOT &#39;xyz&#39; in the page;<br>

&gt;   &#39;abc|xyz&#39; =&gt; output with &#39;abc&#39; OR &#39;xyz&#39; in the page;<br>

<br>
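A minimal Python sketch of the four input forms listed above. It is not TextExtract's code (the recipe is written in PHP); the names parse_terms and page_matches are invented for the example, and case-insensitive substring matching is an assumption.<br>

```python
import shlex

def parse_terms(query):
    """Split a query into (required, excluded).

    required is a list of groups; every group must match (AND),
    and a group matches if any of its alternatives is found (OR).
    """
    required, excluded = [], []
    # shlex.split keeps a double-quoted phrase such as "abc def" as one token.
    # Assumption: the query has balanced quotes, otherwise shlex raises ValueError.
    for token in shlex.split(query):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            alternatives = [alt.lower() for alt in token.split("|") if alt]
            if alternatives:
                required.append(alternatives)
    return required, excluded

def page_matches(text, query):
    """Return True if the page text satisfies the query."""
    required, excluded = parse_terms(query)
    text = text.lower()
    if any(term in text for term in excluded):
        return False
    return all(any(alt in text for alt in group) for group in required)
```

For example, page_matches(text, 'abc -xyz') is true only for text that contains 'abc' and does not contain 'xyz'.<br>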

</div>Now available in the latest release.<br>

<div><a href="http://www.pmwiki.org/wiki/Cookbook/TextExtract" target="_blank">http://www.pmwiki.org/wiki/Cookbook/TextExtract</a><br>

<br>

</div>I also added some template variables for use in parameters<br>

header= , footer= , phead=<br>

for instance a header with a custom title and the search time:<br>

   header=&quot;%rfloat%{$$time}%%&#39;&#39;&#39;Listing&#39;&#39;&#39;&quot;<br>
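To illustrate how such {$$name} placeholders can be filled in, here is a small Python sketch. It only models the substitution idea and is not the recipe's PHP code; fill_template and the sample value "0.42 s" are made up for the example.<br>

```python
import re

def fill_template(template, values):
    """Replace each {$$name} placeholder with values[name].

    Placeholders without a matching key are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"\{\$\$(\w+)\}", substitute, template)

header = "%rfloat%{$$time}%%'''Listing'''"
# "0.42 s" is a hypothetical search time, not real TextExtract output.
print(fill_template(header, {"time": "0.42 s"}))
```

The wiki styles (%rfloat% ... %%) and the bold markup (''' ... ''') pass through untouched; only the placeholder is replaced.<br>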

<br>

I split regular expression search from standard search, to allow<br>

easier term input, and added a checkbox for regular expression search<br>

to the search form.<br>

I added a checkbox for &#39;Match whole words&#39; for whole word searches.<br>

<br>

A note on efficiency:<br>

TextExtract with its in-built pagelist function runs faster than using<br>

PmWiki&#39;s pagelist, or MakePageList() function, mainly because<br>

PmWiki&#39;s pagelist process opens every page to check if the user is<br>

authorised to see the page, because it does not want to output any<br>

non-authorised pages, for instance read-protected pages. This file<br>

opening can be quite time consuming.<br>

On the other hand, TextExtract constructs a pagelist including even<br>

read-protected pages; authorisations are not checked at this stage in<br>

the process. Only later when each page on the source list is opened<br>

will authorisation be checked, before text lines are extracted and<br>

processed. So a lot fewer pages need to be opened, which makes for<br>

a faster process. That is the main reason I did not use MakePageList()<br>

as a source pagelist generator.<br>
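The difference can be pictured with a toy model in Python. This is only one reading of the note above, not PmWiki or TextExtract code: ToyStore, extract_two_pass and extract_single_pass are invented names, and each call to open() stands in for one costly page read.<br>

```python
class ToyStore:
    """In-memory stand-in for a page store that counts page opens."""
    def __init__(self, pages):
        self.pages = pages      # name -> (readable, text)
        self.opens = 0

    def open(self, name):
        self.opens += 1         # models one file read
        return self.pages[name]

def matching_lines(text, term):
    return [line for line in text.splitlines() if term in line]

def extract_two_pass(store, names, term):
    """Check permissions while building the list, then reopen pages to extract."""
    allowed = [n for n in names if store.open(n)[0]]
    hits = {}
    for name in allowed:
        _, text = store.open(name)
        lines = matching_lines(text, term)
        if lines:
            hits[name] = lines
    return hits

def extract_single_pass(store, names, term):
    """Build the list from names only; check permission when the page is opened."""
    hits = {}
    for name in names:
        readable, text = store.open(name)
        if not readable:
            continue            # read-protected page: nothing is extracted
        lines = matching_lines(text, term)
        if lines:
            hits[name] = lines
    return hits
```

Both functions return the same lines and skip the protected page, but the single-pass version opens each page only once.<br>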

<br>

Still, a possibility remains to use the PmWiki searchbox with a<br>

fmt=#extract option, which will use PmWiki&#39;s pagelist functions<br>

and TextExtract formatting functions. Useful if you need to pass<br>

pagelist parameters TextExtract does not understand.<br>

<div><div></div><div><br>

  ~Hans<br>

<br>

<br>


</div></div></blockquote></div></div></div><br><br clear="all"><div class="im"><br>-- <br>---------------------------------------<br>| A | de la langue française<br>| B | <a href="http://www.languefrancaise.net/" target="_blank">http://www.languefrancaise.net/</a><br>


| C | <a href="mailto:languefrancaise@gmail.com" target="_blank">languefrancaise@gmail.com</a><br>---------------------------------------<br>

</div></div>

</blockquote></div>

</div>