[Pmwiki-users] Re: Spamming technique

Steven Leite steven_leite
Sun May 23 20:12:50 CDT 2004


Sounds good, Christian, but how difficult would it be to write a PHP
script (or cookbook recipe) to do the same?  Not all of us are
comfortable with bash scripts, and not all of us have shell access.

I like the idea of only seeing which ones have been updated since the
last time you checked, though.  Sounds like a real time saver.

-S

----- Original Message ----- 
From: "Christian Ridderström" <chr at home.se>
To: <pmwiki-users at pmichaud.com>
Sent: Sunday, May 23, 2004 2:18 PM
Subject: [Pmwiki-users] Re: Spamming technique


> On Sun, 23 May 2004, Crisses wrote:
>
> > Hey, has anyone tried to run a grep on their site in the wiki.d folder
> > to see all http:// requests in their pages?  Maybe I'll do that (just
> > to eyeball what comes up).  An initial "approved" file would be pretty
> > easy to make from there.
>
> I just ran the following command (in bash):
>
>     grep ^text= wiki.d/* | tr ? \\n | tr " " \\n | tr ']' \\n | \
>       grep -i http: | sed -e " s/.*\(http.*\)/\\1/" | sort | \
>       uniq > URIs.lst
>
> and it extracts a unique list of URIs starting with 'http:'. The result
> was a rather long list (more than 400 URIs). Realizing that I will use
> this command again, I ended up putting it in a script 'find-URIs.sh' that
> you can find here:
>
> http://wiki.lyx.org/pmwiki.php/SiteTest/ConfigFiles
>
> In order for you to use this script, you have to modify the line
>
> dir0=~lyx/www/pmwiki # pmwiki/-directory
>
> so that $dir0 points to your wiki directory. Then you can execute the
> script through:
>
> ./find-URIs.sh +n > URIs.lst
>
> which puts the result in a file called 'URIs.lst'.
>
> Since I get so many URIs, I've put the valid ones in a file
> 'valid-URIs.lst', which I use to filter the result as follows:
>
> cat URIs.lst | grep -v -f valid-URIs.lst
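
The filtering step above can be sketched as a small shell function. This is only an illustration under some assumptions: the function name filter_uris is hypothetical, and the -F flag is an addition not in the original command. It makes grep treat each line of 'valid-URIs.lst' as a literal string, which may be safer because URIs often contain '.' and '?', both of which are regular-expression operators.

```shell
#!/bin/sh
# filter_uris (hypothetical name): print the lines of the URI list ($1)
# that contain none of the approved entries listed in ($2).
#   -v  keep only the lines that do NOT match
#   -F  read each pattern as a fixed string, not a regular expression
#   -f  read the patterns from a file, one per line
filter_uris() {
    grep -v -F -f "$2" "$1"
}
```

It would be called as: filter_uris URIs.lst valid-URIs.lst
Because matching is by substring, an approved entry such as a site prefix covers every URI below it.
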
>
> I still end up with about 200 links that I manually check (basically I
> just have to glance at them to see that they look reasonable).
>
> What I've done now is to check 'URIs.lst' into my version control
> system, so that the next time I run the command to check for URIs, I can
> simply see which URIs are new.
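
For anyone without a version control system, the "which URIs are new" comparison can be approximated with the standard comm utility. This is a rough sketch, not the method used above: the function name new_uris and the idea of keeping the previous list in a separate file are assumptions made for illustration.

```shell
#!/bin/sh
# new_uris (hypothetical name): print the URIs that appear in the
# current list ($2) but not in the previous list ($1).
# comm needs both inputs sorted the same way, so each list is sorted
# into a temporary file under the C locale first.
new_uris() {
    old_sorted=$(mktemp) && new_sorted=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$old_sorted"
    LC_ALL=C sort -u "$2" > "$new_sorted"
    # -1 hides lines only in the old list, -3 hides lines in both,
    # leaving only the lines unique to the new list.
    LC_ALL=C comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```

Assuming the previous run was saved as, say, 'URIs.old.lst' (a made-up file name), the call would be: new_uris URIs.old.lst URIs.lst
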
>
> Oh, and I did find that another WikiSandbox-page had some bad links in
> it.
>
> /Christian
>
> -- 
> Christian Ridderström    http://www.md.kth.se/~chr
>
>
>



