[pmwiki-users] slurp is broken

Patrick R. Michaud pmichaud at pobox.com
Wed Jul 19 14:59:26 CDT 2006


On Wed, Jul 19, 2006 at 09:34:44PM +0200, christian.ridderstrom at gmail.com wrote:
> On Wed, 19 Jul 2006, Patrick R. Michaud wrote:
> >On Wed, Jul 19, 2006 at 11:36:53AM -0500, JB wrote:
> >>PM,
> >>
> >>Can I please get a copy of your robots.txt file?
> >
> >Also, for any who are interested, here are the relevant
> >sections of my root .htaccess file, which denies certain
> >user agents at the webserver level instead of waiting
> >for PmWiki to do it:
> >
> >   # HTTrack and MSIECrawler are just plain annoying
> >   RewriteEngine On
> >   RewriteCond %{HTTP_USER_AGENT} HTTrack [OR]
> >   RewriteCond %{HTTP_USER_AGENT} MSIECrawler
> >   RewriteRule ^wiki/ - [F,L]
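> >   # (the [F,L] flags return 403 Forbidden and stop further rewriting)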
> >
> >   # block ?action= requests for these spiders
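> >   # ([^rb] exempts actions beginning with 'b' or 'r', i.e. browse and rss)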
> >   RewriteCond %{QUERY_STRING} action=[^rb]
> >   RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
> >   RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
> >   RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
> >   RewriteCond %{HTTP_USER_AGENT} Teoma [OR]
> >   RewriteCond %{HTTP_USER_AGENT} ia_archive
> >   RewriteRule .* - [F,L]
> 
> The obvious solution: Add this to some PmWiki page?  Perhaps something 
> about administrative tasks? Or something related to robots.txt?

It probably belongs in Cookbook.ControllingWebRobots, which also
needs a rewrite to bring it up to date with PmWiki 2.1.  There should
also be a link from the administrative tasks section, or at least a
FAQ entry.
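
In the meantime, PmWiki 2.1 can do much of this at the application
level, without touching .htaccess.  A minimal sketch for
local/config.php, assuming the 2.1 robot-control settings
($RobotPattern, $RobotActions, $EnableRobotCloakActions) behave as
documented on ControllingWebRobots:

   <?php if (!defined('PmWiki')) exit();
   # user agents matching this regex (case-insensitive) count as robots
   $RobotPattern = 'bot|crawl|spider|slurp|teoma|archive|track';
   # actions robots are allowed to request; anything else is refused
   $RobotActions = array('browse', 'rss', 'dc');
   # strip ?action= links from pages served to robots, so crawlers
   # never see those URLs in the first place
   $EnableRobotCloakActions = 1;

(Untested as written -- check the variable names against the cookbook
page before relying on them.)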

Pm



