Internal Server Error
Enverex
Worcester, UK Icrontian
I can't see an obvious problem with my .htaccess file, but when it exists, I keep getting internal server errors. The actual config file is:
[PHP]# Email extraction bots
SetEnvIfNoCase User-Agent "ExtractorPro" mail_bots
SetEnvIfNoCase User-Agent "EmailSiphon" mail_bots
SetEnvIfNoCase User-Agent "EmailWolf" mail_bots
SetEnvIfNoCase User-Agent "EmailCollector" mail_bots
SetEnvIfNoCase User-Agent "CherryPicker" mail_bots
SetEnvIfNoCase User-Agent "WebEMailExtractor" mail_bots
# Bandwidth leechers and pests
SetEnvIfNoCase User-Agent "LinkWalker" bad_bots
SetEnvIfNoCase User-Agent "WhizBang" bad_bots
SetEnvIfNoCase User-Agent "MIIxpc" bad_bots
SetEnvIfNoCase User-Agent "MFC_Tear" bad_bots
SetEnvIfNoCase User-Agent "DIIbot" bad_bots
SetEnvIfNoCase User-Agent "ia_archiver" bad_bots
# Download Accelerators
SetEnvIfNoCase User-Agent ^Wget download_acc
SetEnvIfNoCase User-Agent Flashget download_acc
SetEnvIfNoCase User-Agent Getright download_acc
SetEnvIfNoCase User-Agent Gozilla download_acc
SetEnvIfNoCase User-Agent Downloader download_acc
SetEnvIfNoCase User-Agent ^Mass download_acc
SetEnvIfNoCase User-Agent ^LeechGet download_acc
SetEnvIfNoCase User-Agent ^MD download_acc
SetEnvIfNoCase User-Agent ^DA download_acc
SetEnvIfNoCase User-Agent ^MyGetRight download_acc
SetEnvIfNoCase User-Agent "Star Downloader" download_acc
<Limit GET POST HEADER>
Order Allow,Deny
Allow from all
Deny from env=download_acc
Deny from env=mail_bots
Deny from env=bad_bots
</Limit>
[/PHP]
and the error that appears in the Apache error log is:
[Tue Mar 23 10:23:10 2004] [alert] [client 192.168.1.100] /usr/http/ac/.htaccess: TRACE cannot be controlled by <Limit>, referer: http://atomnet.co.uk/amiga/index.php?p=emulators
Now, as the .htaccess is at the root of this site, it affects everything, every type of file, so random images don't work and sometimes you just get an internal server error page. The error doesn't happen all the time, it happens completely randomly, and it is random as to what fails too: sometimes nothing, sometimes a few images don't load, sometimes the whole page fails.
Any idea why this is happening, and even more so, why is it random?
RE: Why aren't all of the names in speechmarks?
Speechmarks don't make the match any stricter; the pattern is a regular expression either way, and the quotes are only needed when the name contains a space (e.g. "Star Downloader"). What matters is the anchor:
DL will match anything with DL ANYWHERE in the identification.
^DL only matches clients with DL at the FRONT of the identification.
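For illustration, a minimal sketch reusing a couple of the patterns from above:
[php]# Unanchored: matches "DA" anywhere in the User-Agent string (case-insensitive)
SetEnvIfNoCase User-Agent DA download_acc
# Anchored: only matches when the User-Agent STARTS with "DA"
SetEnvIfNoCase User-Agent ^DA download_acc
# Quotes just let the pattern contain a space
SetEnvIfNoCase User-Agent "Star Downloader" download_acc[/php]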
Cheers,
Ex
Comments
Make all your SetEnvIfNoCase directives fall into the same group, say bad_bots, then just have one Deny statement with env=bad_bots.
Check out: http://evolt.org/article/Using_Apache_to_stop_bad_robots/18/15126/?format=print&rating=true&comments=false
and: http://www.webmasterworld.com/forum92/205-2-15.htm
Still happens. I mean it happens with people who don't fall into any of the categories, and randomly too.
The file is now:
[php]# Email extraction bots
SetEnvIfNoCase User-Agent "ExtractorPro" bad_bots
SetEnvIfNoCase User-Agent "EmailSiphon" bad_bots
SetEnvIfNoCase User-Agent "EmailWolf" bad_bots
SetEnvIfNoCase User-Agent "EmailCollector" bad_bots
SetEnvIfNoCase User-Agent "CherryPicker" bad_bots
SetEnvIfNoCase User-Agent "WebEMailExtractor" bad_bots
# Bandwidth leechers and pests
SetEnvIfNoCase User-Agent "LinkWalker" bad_bots
SetEnvIfNoCase User-Agent "WhizBang" bad_bots
SetEnvIfNoCase User-Agent "MIIxpc" bad_bots
SetEnvIfNoCase User-Agent "MFC_Tear" bad_bots
SetEnvIfNoCase User-Agent "DIIbot" bad_bots
SetEnvIfNoCase User-Agent "ia_archiver" bad_bots
# Download Accelerators
SetEnvIfNoCase User-Agent ^Wget bad_bots
SetEnvIfNoCase User-Agent Flashget bad_bots
SetEnvIfNoCase User-Agent Getright bad_bots
SetEnvIfNoCase User-Agent Gozilla bad_bots
SetEnvIfNoCase User-Agent Downloader bad_bots
SetEnvIfNoCase User-Agent ^Mass bad_bots
SetEnvIfNoCase User-Agent ^LeechGet bad_bots
SetEnvIfNoCase User-Agent ^MD bad_bots
SetEnvIfNoCase User-Agent ^DA bad_bots
SetEnvIfNoCase User-Agent ^MyGetRight bad_bots
SetEnvIfNoCase User-Agent "Star Downloader" bad_bots
<Limit GET POST HEADER>
Order Allow,Deny
Allow from all
Deny from env=bad_bots
</Limit>[/php]
And it still happens randomly. When it is enabled, random images don't load (maybe one, maybe a few), or the page gives an error entirely.
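For what it's worth, the log line above seems to be complaining about the <Limit> section itself (and note that HEADER isn't a standard HTTP method name; HEAD is). Order/Allow/Deny don't actually need a <Limit> wrapper; without one they apply to every request method, so one variant worth trying is roughly:
[php]# Same environment-based blocking, with no <Limit> wrapper
Order Allow,Deny
Allow from all
Deny from env=bad_bots[/php]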
[PHP]RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} "ExtractorPro" [OR]
RewriteCond %{HTTP_USER_AGENT} "EmailSiphon" [OR]
RewriteCond %{HTTP_USER_AGENT} "EmailWolf" [OR]
RewriteCond %{HTTP_USER_AGENT} "EmailCollector" [OR]
RewriteCond %{HTTP_USER_AGENT} "CherryPicker" [OR]
RewriteCond %{HTTP_USER_AGENT} "WebEMailExtractor" [OR]
RewriteCond %{HTTP_USER_AGENT} "WhizBang" [OR]
RewriteCond %{HTTP_USER_AGENT} "MIIxpc" [OR]
RewriteCond %{HTTP_USER_AGENT} "MFC_Tear" [OR]
RewriteCond %{HTTP_USER_AGENT} "DIIbot" [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} Leech [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^MD [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} "Star Downloader" [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} SurveyBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ [L,R][/PHP]
Is there any way of making it return a Forbidden rather than redirecting? Otherwise they end up hammering the server. Leaving the rule's target empty as shown above gives the vicious client an Internal Server Error, which is fine as it doesn't use up any bandwidth and the download accelerators don't hammer the server for the file.
Thanks for the help.
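In case it helps: mod_rewrite can return a proper 403 Forbidden instead of redirecting by using "-" (no substitution) together with the [F] flag on the final rule. A short sketch, showing only a couple of the conditions from the list above:
[PHP]RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} "Star Downloader"
# "-" means no substitution; [F] sends 403 Forbidden, [L] stops further rewriting
RewriteRule .* - [F,L][/PHP]
The stock 403 page is tiny, so it should cost next to no bandwidth either.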