Internal Server Error

Enverex Worcester, UK Icrontian
edited April 2004 in Internet & Media
I can't see an obvious problem with my .htaccess file, but when it exists, I keep getting internal server errors. The actual config file is:

[PHP]# Email extraction bots
SetEnvIfNoCase User-Agent "ExtractorPro" mail_bots
SetEnvIfNoCase User-Agent "EmailSiphon" mail_bots
SetEnvIfNoCase User-Agent "EmailWolf" mail_bots
SetEnvIfNoCase User-Agent "EmailCollector" mail_bots
SetEnvIfNoCase User-Agent "CherryPicker" mail_bots
SetEnvIfNoCase User-Agent "WebEMailExtractor" mail_bots

# Bandwidth leechers and pests
SetEnvIfNoCase User-Agent "LinkWalker" bad_bots
SetEnvIfNoCase User-Agent "WhizBang" bad_bots
SetEnvIfNoCase User-Agent "MIIxpc" bad_bots
SetEnvIfNoCase User-Agent "MFC_Tear" bad_bots
SetEnvIfNoCase User-Agent "DIIbot" bad_bots
SetEnvIfNoCase User-Agent "ia_archiver" bad_bots

# Download Accelerators
SetEnvIfNoCase User-Agent ^Wget download_acc
SetEnvIfNoCase User-Agent Flashget download_acc
SetEnvIfNoCase User-Agent Getright download_acc
SetEnvIfNoCase User-Agent Gozilla download_acc
SetEnvIfNoCase User-Agent Downloader download_acc
SetEnvIfNoCase User-Agent ^Mass download_acc
SetEnvIfNoCase User-Agent ^LeechGet download_acc
SetEnvIfNoCase User-Agent ^MD download_acc
SetEnvIfNoCase User-Agent ^DA download_acc
SetEnvIfNoCase User-Agent ^MyGetRight download_acc
SetEnvIfNoCase User-Agent "Star Downloader" download_acc

<Limit GET POST HEADER>
Order Allow,Deny
Allow from all
Deny from env=download_acc
Deny from env=mail_bots
Deny from env=bad_bots
</Limit>
[/PHP]

and the error that appears in the Apache error log is:
[Tue Mar 23 10:23:10 2004] [alert] [client 192.168.1.100] /usr/http/ac/.htaccess: TRACE cannot be controlled by <Limit>, referer: http://atomnet.co.uk/amiga/index.php?p=emulators

Now, as the .htaccess is at the root of this site, it affects everything, and the problem is that it affects every type of file, so random images don't work, and sometimes you just get an Internal Server Error page. The error doesn't happen all the time; it happens completely randomly, and what fails is random too: sometimes it may be nothing, sometimes a few images may not load, or sometimes the whole page fails.

Any idea why this is happening, and even more so, why is it random?

RE: Why aren't all of the names in speechmarks?
Speechmarks just delimit the pattern; they're needed when the name contains a space. The patterns are regular expressions, so:
"Star Downloader" matches the two-word name Star Downloader anywhere in the string.
DL will match anything with DL anywhere in the name.
^DL matches clients with DL at the front of the identification.
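A minimal sketch of the difference (the agent strings in the comments are made-up examples):

```apache
SetEnvIfNoCase User-Agent DL bad_bots                 # substring: matches "MyDLTool/1.0" as well as "DL/2.0"
SetEnvIfNoCase User-Agent ^DL bad_bots                # anchored: matches "DL/2.0" but not "MyDLTool/1.0"
SetEnvIfNoCase User-Agent "Star Downloader" bad_bots  # quotes needed because the name contains a space
```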

Cheers,
Ex

Comments

  • a2jfreak Houston, TX Member
    edited April 2004
    I think the reason it's intermittent is because you have three groups (download_acc, mail_bots, bad_bots). Some users fall into one of those three groups, some do not. If one of the users falls into either download_acc or mail_bots, then perhaps the user gets through because env is now set to bad_bots, not set to download_acc | mail_bots | bad_bots.

    Make all your SetEnvIfNoCase fall into the same group, say bad_bots. Then just have one deny statement where env=bad_bots


    Check out: http://evolt.org/article/Using_Apache_to_stop_bad_robots/18/15126/?format=print&rating=true&comments=false
    and: http://www.webmasterworld.com/forum92/205-2-15.htm
  • Enverex Worcester, UK Icrontian
    edited April 2004
    a2jfreak wrote:
    I think the reason it's intermittent is because you have three groups (download_acc, mail_bots, bad_bots). Some users fall into one of those three groups, some do not. If one of the users falls into either download_acc or mail_bots, then perhaps the user gets through because env is now set to bad_bots, not set to download_acc | mail_bots | bad_bots.

    Make all your SetEnvIfNoCase fall into the same group, say bad_bots. Then just have one deny statement where env=bad_bots


    Check out: http://evolt.org/article/Using_Apache_to_stop_bad_robots/18/15126/?format=print&rating=true&comments=false
    and: http://www.webmasterworld.com/forum92/205-2-15.htm

    Still happens. I mean, it happens with people that don't fall into any of the categories, and randomly too.

    The file is now:

    [php]# Email extraction bots
    SetEnvIfNoCase User-Agent "ExtractorPro" bad_bots
    SetEnvIfNoCase User-Agent "EmailSiphon" bad_bots
    SetEnvIfNoCase User-Agent "EmailWolf" bad_bots
    SetEnvIfNoCase User-Agent "EmailCollector" bad_bots
    SetEnvIfNoCase User-Agent "CherryPicker" bad_bots
    SetEnvIfNoCase User-Agent "WebEMailExtractor" bad_bots

    # Bandwidth leechers and pests
    SetEnvIfNoCase User-Agent "LinkWalker" bad_bots
    SetEnvIfNoCase User-Agent "WhizBang" bad_bots
    SetEnvIfNoCase User-Agent "MIIxpc" bad_bots
    SetEnvIfNoCase User-Agent "MFC_Tear" bad_bots
    SetEnvIfNoCase User-Agent "DIIbot" bad_bots
    SetEnvIfNoCase User-Agent "ia_archiver" bad_bots

    # Download Accelerators
    SetEnvIfNoCase User-Agent ^Wget bad_bots
    SetEnvIfNoCase User-Agent Flashget bad_bots
    SetEnvIfNoCase User-Agent Getright bad_bots
    SetEnvIfNoCase User-Agent Gozilla bad_bots
    SetEnvIfNoCase User-Agent Downloader bad_bots
    SetEnvIfNoCase User-Agent ^Mass bad_bots
    SetEnvIfNoCase User-Agent ^LeechGet bad_bots
    SetEnvIfNoCase User-Agent ^MD bad_bots
    SetEnvIfNoCase User-Agent ^DA bad_bots
    SetEnvIfNoCase User-Agent ^MyGetRight bad_bots
    SetEnvIfNoCase User-Agent "Star Downloader" bad_bots

    <Limit GET POST HEADER>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bots
    </Limit>[/php]

    And it still happens randomly. When it's enabled, random images don't load (maybe one, maybe a few) or the page gives an error entirely.
  • a2jfreak Houston, TX Member
    edited April 2004
    I don't know, Enverex. I just googled that stuff and found that they all used one group, and thought that could be your problem. Sorry I cannot be of more assistance; I'm sure it's quite annoying for you.
  • Kwitko Sheriff of Banning (Retired) By the thing near the stuff Icrontian
    edited April 2004
    I know this isn't a direct solution to your problem, but perhaps you could do a rewrite rule instead. My .htaccess looks like this:
    redirect /scripts http://www.goaway.losers
    redirect /MSADC http://www.goaway.losers
    redirect /c http://www.goaway.losers
    redirect /d http://www.goaway.losers
    redirect /_mem_bin http://www.goaway.losers
    redirect /msadc http://www.goaway.losers
    RedirectMatch (.*)cmd.exe$ http://www.goaway.losers$1
    RedirectMatch (.*)default.ida$ http://www.goaway.losers$1
    RedirectMatch (.*)nsiislog.dll$ http://www.goaway.losers$1
    
    RewriteEngine on
    RewriteCond %{REMOTE_ADDR} ^12\.148\.196\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$ [OR]
    RewriteCond %{REMOTE_ADDR} ^64\.236\.185\.125 [OR]
    RewriteCond %{REMOTE_ADDR} ^80\.54\.32\.118 [OR]
    RewriteCond %{REMOTE_ADDR} ^64\.38\.240\.139 [OR]
    RewriteCond %{REMOTE_ADDR} ^24\.90\.225\.14 [OR]
    RewriteCond %{REMOTE_ADDR} ^24\.98\.180\.12 [OR]
    RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[2-9]|2[0-4][0-9]|25[0-5])$ [OR]
    RewriteCond %{REMOTE_ADDR} ^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$ [OR]
    RewriteCond %{HTTP_REFERER} iaea\.org [OR]
    RewriteCond %{HTTP_USER_AGENT} Sqworm [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} HostItCheap [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ia_archiver [OR]
    RewriteCond %{HTTP_USER_AGENT} ^[A-Z]+$ [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Scooter.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FAST\-WebCrawler.* [OR]
    RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
    RewriteCond %{HTTP_USER_AGENT} Zeus [OR]
    RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} SurveyBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ZyBorg [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} grub\-client [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} wizard\.yellowbrick\.oz
    RewriteRule .* - [F,L]
    
  • Enverex Worcester, UK Icrontian
    edited April 2004
    I have all the rest sussed, but I am wondering what the "NC" and "OR" actually mean.
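    (For reference: these are mod_rewrite condition flags. [NC] makes the pattern match case-insensitively, and [OR] joins a condition to the next one with logical OR instead of the default AND. A minimal sketch, with made-up agent strings:)

    ```apache
    RewriteEngine On
    # [NC] = no case: "wget", "WGET" and "Wget" all match
    RewriteCond %{HTTP_USER_AGENT} ^wget [NC,OR]
    # [OR]: this condition OR the next triggers the rule;
    # without [OR], consecutive RewriteConds must ALL match
    RewriteCond %{HTTP_USER_AGENT} leech [NC]
    RewriteRule .* - [F,L]   # F = return 403 Forbidden, L = last rule
    ```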
  • Enverex Worcester, UK Icrontian
    edited April 2004
    Cheers TD, works a charm.

    [PHP]RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} "ExtractorPro" [OR]
    RewriteCond %{HTTP_USER_AGENT} "EmailSiphon" [OR]
    RewriteCond %{HTTP_USER_AGENT} "EmailWolf" [OR]
    RewriteCond %{HTTP_USER_AGENT} "EmailCollector" [OR]
    RewriteCond %{HTTP_USER_AGENT} "CherryPicker" [OR]
    RewriteCond %{HTTP_USER_AGENT} "WebEMailExtractor" [OR]
    RewriteCond %{HTTP_USER_AGENT} "WhizBang" [OR]
    RewriteCond %{HTTP_USER_AGENT} "MIIxpc" [OR]
    RewriteCond %{HTTP_USER_AGENT} "MFC_Tear" [OR]
    RewriteCond %{HTTP_USER_AGENT} "DIIbot" [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
    RewriteCond %{HTTP_USER_AGENT} ia_archiver [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
    RewriteCond %{HTTP_USER_AGENT} Leech [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MD [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} "Star Downloader" [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} SurveyBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus
    RewriteRule ^.*$ [L,R][/PHP]

    Is there any way of making it return a Forbidden rather than redirecting? As otherwise they end up hammering the server. Leaving the rule empty as shown above gives the vicious client an Internal Server Error, which is fine as it doesn't use up any bandwidth and the download accelerators don't hammer the server for the file.

    Thanks for the help.
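    (One way to send a proper 403 Forbidden instead of relying on the error, assuming mod_rewrite's [F] flag: keep the RewriteCond list as-is and end with a dash-substitution rule. A minimal sketch:)

    ```apache
    # "-" means no substitution; [F] returns 403 Forbidden, [L] stops further rules
    RewriteRule ^.*$ - [F,L]
    ```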