[pmwiki-users] Trouble with .pageindex when too much _new_ data to index (+ sqlite)

ABClf languefrancaise at gmail.com
Fri Feb 6 05:31:45 CST 2015


Well, well, well.
Thank you for your help, Peter; your code (latest version, with autorefresh)
works well for building a new pageindex, at least on localhost.

Online (I'm not talking about reindexing), simply browsing my wiki triggers
fatal out-of-memory and/or other errors, which make me fear that my project
of using PmWiki to handle 200k short and very short pages, as if it were
MySQL, is heading straight into a wall.
A working sqlite database (100 MB) plus a pageindex (40 MB) runs acceptably
on localhost as long as I don't edit a set of pages, but runs out of
memory online the first time a basic pagelist is requested.

(:pagelist group=Synonyme fmt=#title:) (2000 results expected) : Fatal
error: Out of memory (allocated 56098816) (tried to allocate 36 bytes) in
/homepages/18/d269604285/htdocs/_dev2/scripts/pagelist.php on line 370

(:pagelist link=Synonyme.1003 fmt=#title:) (2 results expected) : Fatal
error: Out of memory (allocated 56098816) (tried to allocate 36 bytes) in
/homepages/18/d269604285/htdocs/_dev2/scripts/pagelist.php on line 370

When I delete the pageindex, I get a 500 error.

To be honest, I don't know whether the issue comes from PmWiki itself or
whether sqlite page storage is involved. The online files are the same
localhost files I used for my tests.

It's a shared host (1&1), yet the same host works nicely with a dirty, very
unoptimized mysql script.
(phpinfo here : http://languefrancaise.net/_dev2/php_info.php)
Maybe it's not powerful hosting, I don't know, but even my own laptop
doesn't like opening, closing and deleting a 40 MB file at every
operation.

I guess the way PmWiki searches its data is the problem in my case.
With sqlite, I was too quick to hope that PmWiki would read the targets
column in the sqlite database; I'm afraid it still reads only the .pageindex,
which has grown too big to handle and is greedy with server resources.

Even on localhost, with a working, freshly built pageindex, editing a
page is soon likely to freeze the pageindex mechanism: I end up with both a
frozen pageindex and a new,pageindex file (the update and the delete have
failed; these two files have to be deleted manually, and then a new index
has to be built with the reindex script).

To me, for a large, frequently edited wiki (I can make 300 edits a day), the
solution would be to get rid of the pageindex file (too big to be opened,
deleted and rewritten once every 2 minutes for hours) and instead use
sqlite to its full potential, i.e. not only for passive page storage but
also for searches (targets, title, ctime, name, etc. each have their own
column). I know this would mean a lot of work for limited benefit.
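
To make the idea concrete, here is a minimal sketch in Python (not PmWiki
code, and not how the sqlite page store actually lays out its data). It
assumes a hypothetical schema: a `pages` table with name, title and ctime
columns, and a `links` table with one row per (source, target) pair so that
the target can be indexed. The table names, column names and sample rows are
all invented for illustration. The two helper functions roughly correspond to
the `link=` and `group=` pagelists quoted above.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database using the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (name TEXT PRIMARY KEY, title TEXT, ctime INTEGER)"
    )
    # One row per link, so a lookup by target can use an index
    # instead of scanning every page.
    conn.execute("CREATE TABLE links (source TEXT NOT NULL, target TEXT NOT NULL)")
    conn.execute("CREATE INDEX links_target ON links (target)")
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?)",
        [
            ("Synonyme.1001", "aimer", 1),
            ("Synonyme.1002", "adorer", 2),
            ("Synonyme.1003", "affectionner", 3),
            ("Main.HomePage", "Home", 4),
        ],
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?)",
        [
            ("Synonyme.1001", "Synonyme.1003"),
            ("Synonyme.1002", "Synonyme.1003"),
            ("Main.HomePage", "Synonyme.1001"),
        ],
    )
    conn.commit()
    return conn


def pages_linking_to(conn, target):
    """Rough equivalent of a link=<target> pagelist, sorted by title."""
    rows = conn.execute(
        "SELECT p.name, p.title FROM links AS l "
        "JOIN pages AS p ON p.name = l.source "
        "WHERE l.target = ? ORDER BY p.title",
        (target,),
    )
    return rows.fetchall()


def pages_in_group(conn, group):
    """Rough equivalent of a group=<group> pagelist, sorted by title."""
    rows = conn.execute(
        "SELECT name, title FROM pages WHERE name LIKE ? ORDER BY title",
        (group + ".%",),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(pages_linking_to(conn, "Synonyme.1003"))
    print(pages_in_group(conn, "Synonyme"))
```

With such a layout, a query like the two-result `link=` pagelist would only
touch the matching index entries rather than load a whole 40 MB file into
memory, which is the point of the proposal; whether this can be wired into
PmWiki's pagelist code is a separate question.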

As for the pageindex, I regret that when searching stops working because
of the amount of data, pagelists stop working too (I'm talking about
pagelists limited to links, name, title, and page variables, not text
searches). Losing such a capability is a harsh loss.
One may accept using Google or something else for searching, but nothing
other than PmWiki can produce formatted pagelists, which are essential to me.

In conclusion, if I'm not mistaken, my issue with PmWiki is the pageindex
mechanism.

@Peter : do you have any code to share that I could try for saving .pageindex
in a sqlite database, using your reindex script ?
(http://www.pmwiki.org/wiki/PmWiki/SearchImprovements December 26, 2010)

Thank you,
Gilles.



2015-01-31 20:22 GMT+01:00 Peter Bowers <pbowers at pobox.com>:

> On Thu, Jan 29, 2015 at 3:29 PM, ABClf <languefrancaise at gmail.com> wrote:
>
>> Great job Peter. I have had to run the script 5 times.
>> ...
>>
> 5:
>> DEBUG: B
>> DEBUG: count=0
>> Indexing complete. Deleting wiki.d/.reindex
>>
>> Warning: unlink(wiki.d/.reindex): Permission denied in
>> D:\xampp3\htdocs\abclf\local\Site.Reindex.php on line 52
>>
>>
> I've updated this in 2 ways:
> * fixed the unlink() problem (thanks, Chuck)
> * it now redirects to itself multiple times rather than forcing you to
> re-load the page
>
> I've uploaded it as a recipe called Reindex.
>
> http://www.pmwiki.org/wiki/Cookbook/Reindex
>
> -Peter
>



-- 

---------------------------------------
| A | de la langue française
| B | http://www.languefrancaise.net
| C | languefrancaise at gmail.com
---------------------------------------
       @bobmonamour
---------------------------------------