[pmwiki-users] PmWiki seems to "hang" on overloaded boxes

Joachim Durchholz jo at durchholz.org
Sun May 22 11:10:09 CDT 2005


Patrick R. Michaud wrote:

> On Sun, May 22, 2005 at 04:38:11PM +0200, Joachim Durchholz wrote:
> 
>> Radu wrote:
>> 
>>> At 08:11 AM 5/22/2005, Joachim Durchholz wrote:
>>> 
>>>> Here's what happened: PmWiki suddenly started to "hang", i.e.
>>>> it would not respond at all, or display only part of the output
>>>> and "hang". By "hang", I mean that the browser still shows the
>>>> data-is-forthcoming animation, but no new data ever arrives.
>>>> (For a "forever" value of one or two minutes.)
> 
> First and most importantly, what version of PmWiki?

2.0beta28.

> Second, note that by the time PmWiki is outputting information to the
> browser, PmWiki has pretty much completed any "real work" that needs 
> to be done and is simply doing lots of print statements.  (Exception:
>  <!--wiki:--> and <!--function:--> items in the template are
> processed as the page is being output.  However, the main content is
> processed before *any* output takes place.)

Hmm... I had at least one occasion when PmWiki had output the header and
left sidebar, but didn't output the right sidebar and main content. (For
that specific skin, the output order is header -> left sidebar -> right
sidebar -> page, via CSS.)

In case this helps anybody, it's the beeblebrox-gila2 template, slightly 
adapted but general processing logic untouched. I'm pretty sure that it 
uses <!--wiki:--> or a moral equivalent to output the sidebars, so it 
seems as if it managed to output top border and left sidebar, but 
stalled on the rightbar and never got around to doing the main contents.

> So, if the browser has received some output but then hangs, it's 
> almost certainly something completely outside of PmWiki's control, 
> because PmWiki is just generating print statements.

Just my impression.

>>> Um. Have you tried deleting the .flock file? If after the one or
>>> two minutes you actually do get the page, this is not the issue.
>> 
>> Well, once PmWiki hung, calling up the site in another browser
>> window would stall as well, and deleting .flock would fix that.
> 
> This seems to tell me you're running a version of PmWiki older than 
> beta30.  In beta30 and later, the .flock file isn't used at all when 
> viewing pages -- only when editing them or changing page attributes.
>  Are you seeing this problem in beta30 or later?

No, I'll have to upgrade. I wasn't too happy with the ramifications of
some of the announced changes (not sure anymore which ones, but there
were some), so I decided to stop upgrading unless necessary. It's a
single-author site that only recently went into production, and I didn't
want to rock the boat...

Anyway, I guess the .flock problem is just a side effect. The site had a
reproducible hang even with a freshly deleted .flock, and that hang went
away after reconfiguring the prefork settings of Apache (the previous
configuration seems to have forced the box into swapping out some
preforked Apache processes to disk, which nullifies or even reverses the
benefits of preforking).
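
To make the swapping argument concrete, here is a rough back-of-the-envelope
check in Python. All figures (512 MB of RAM, 128 MB reserved for everything
else, 20 MB per Apache child, 150 configured children) are invented for
illustration and are not measurements from my box; the function name is
made up as well. The only real name is MaxClients, the prefork directive
that caps the number of child processes.

```python
def max_children_without_swap(ram_mb, reserved_mb, per_child_mb):
    """Largest number of prefork children that fits in RAM without swapping."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable_mb = ram_mb - reserved_mb
    # Integer division: a partially fitting child still forces swapping.
    return max(usable_mb // per_child_mb, 0)


# Hypothetical figures, for illustration only.
ram_mb = 512                   # physical memory in the box
reserved_mb = 128              # kernel, database, mail, and so on
per_child_mb = 20              # resident size of one Apache+PHP child
configured_max_clients = 150   # value of MaxClients in httpd.conf

limit = max_children_without_swap(ram_mb, reserved_mb, per_child_mb)
if configured_max_clients > limit:
    print(f"MaxClients {configured_max_clients} exceeds {limit}: "
          "a full load would push children into swap")
else:
    print(f"MaxClients {configured_max_clients} fits (limit {limit})")
```

With numbers like these, anything above the computed limit means that a
busy server has more children than fit in memory, which would match the
swapping I suspect.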

I'm somewhat surprised by that result though. If PmWiki managed to 
output the header and left sidebar, it probably died while rendering the 
rightbar. However, that doesn't really make sense: the only reason why 
it could have died without giving an error message might be a shutdown 
from Apache because it exceeded some resource limit, but I don't think 
it's taking that long or using that much memory! So where's the fault in 
the logic?

On a tangent, what's the best way to see how much time a page took?
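
Failing anything better, I could measure from the client side. The sketch
below is Python and only measures wall-clock time for the whole request
(network included), not the time PmWiki itself spends rendering. The
helper names time_call and fetch are made up for this example; the
timeout keeps a hung server from blocking the measurement forever.

```python
import time
import urllib.request


def time_call(fn, *args, **kwargs):
    """Run fn and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return time.perf_counter() - start, result


def fetch(url, timeout=30):
    """Download url and return the number of bytes received.

    Raises an exception if the server does not answer within
    `timeout` seconds, so a stalled request shows up as an error.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return len(response.read())
```

Calling time_call(fetch, url) with the wiki's URL returns the elapsed
seconds and the page size in bytes; a stalled page raises an error after
the timeout instead of waiting indefinitely.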

>> To make it a bit clearer: I don't mind if PmWiki fails under
>> overload - it has every right to do so. What I don't like is its
>> failure mode: no error message, but a browser that keeps waiting
>> for data that never comes, and a dysfunctional site
>> (.flock-induced, though I don't know whether that's because the
>> stalled PmWiki call never got around to cleaning out .flock or due
>> to some other reason).
> 
> I ran into problems similar to this in the distant past, and I
> finally traced the problem to Apache.  When a browser closed a 
> connection before output was completed (or even begun), Apache would 
> sometimes block indefinitely, never returning control back to PmWiki 
> nor killing off the PmWiki process.

Ah, that's something that has been bugging one of my PmWiki
installations. Do you recall which part of the Apache configuration was
responsible for that?

Note that this is unrelated to the problem I'm *currently* having. The
browser was waiting indefinitely for a response that never came, and
things stalled on the server side (most likely within Apache when
swapped out or something).

> Of course, there's pretty much nothing that PmWiki can do about this 
> -- if PmWiki never regains control after executing a "print" 
> statement, then PmWiki can't really report an error or diagnostic or 
> anything useful such as that.

Agreed. I'm more after Apache doing something sensible in that case (if
only buffering the output and sending a 500 error page if the script was
aborted or something).

> In versions of PmWiki prior to beta30 that used .flock on every
> request, this was sometimes a real problem because a hung
> Apache/PmWiki process would maintain the lock on .flock and prevent
> other PmWiki processes from progressing.

Yes, that's what happened once the first process had hung.
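
For the archives, the effect is easy to reproduce outside PmWiki. The
Python sketch below assumes the lock behaves like an ordinary advisory
flock() on a Unix system; the helper try_lock and the temporary path are
made up for the demonstration and have nothing to do with PmWiki's own
code. The first handle stands in for the hung process: as long as it
keeps the lock, nobody else gets it.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive flock without blocking.

    Returns the open file object on success (keep it open to hold the
    lock), or None if some other open handle already holds the lock.
    """
    handle = open(path, "a")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


# Throwaway lock file in a temporary directory (hypothetical path).
lock_path = os.path.join(tempfile.mkdtemp(), ".flock")

holder = try_lock(lock_path)   # the "hung" process takes the lock
second = try_lock(lock_path)   # a second request cannot get it
holder.close()                 # the hung process finally goes away
third = try_lock(lock_path)    # now the lock is available again

print("second request got the lock:", second is not None)
print("after release, got the lock:", third is not None)
third.close()
```

A real request would block instead of giving up, which is what made the
whole site stall until the hung process disappeared.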

Regards,
Jo


