
If you are finding “SlurpConfirm404”, “SlurpConfirm404.htm”, “SlurpConfirm404.html” or “SlurpConfirm404.php” in your log files, and can’t figure out why, you’re not alone. Here’s what that SlurpConfirm404 is all about.
First, Yahoo Slurp is what Yahoo calls their Yahoo Web Crawler – their website indexing engine that crawls around the world wide web, indexing (cataloguing) all of the websites, and all of the web pages on those websites.
Now, many search engines and other web indexers, including Google and Yahoo, want to know what happens when someone comes to your website and requests a page that doesn’t actually exist. What sort of error or message does your website return? Typically this is error number 404, page not found, and these search engines want to know how your site handles such requests. They want to see a proper “404 – not found” response, and want to make sure that your site is not returning some other response instead.
When the Yahoo Slurp web crawler wants to see what your site does with a request for a nonexistent page (which should return some sort of “404 – page not found” error), it asks for a page on your site called “SlurpConfirm404”, on the assumption that you won’t have such a page, and so it will be a good test of what your site returns for such a request. In other words, it’s Yahoo Slurp’s way of confirming a 404 response, hence “SlurpConfirm404”.
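If you want to see what your own site returns for a page like this, a small script can make the same kind of request. The following is only a minimal sketch of the idea, not how Yahoo's crawler actually works: the function names `classify_probe` and `probe_missing_page` are made up for this example, and it assumes your site really has no page at the probe path.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def classify_probe(status_code: int) -> str:
    """Label the status code a site returned for a page that should not exist."""
    if status_code == 404:
        return "proper 404"
    if 200 <= status_code < 400:
        # The missing page was reported as a success or a redirect.
        return "soft 404"
    return "other response"


def probe_missing_page(base_url: str, path: str = "/SlurpConfirm404.htm") -> str:
    """Request a page assumed not to exist and classify the response.

    The default path mirrors the name seen in the logs; any path that
    does not exist on the site would serve the same purpose.
    """
    request = Request(base_url.rstrip("/") + path, method="HEAD")
    try:
        with urlopen(request, timeout=10) as response:
            status = response.status
    except HTTPError as error:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        status = error.code
    return classify_probe(status)
```

Calling `probe_missing_page("https://example.com")` against your own domain would tell you whether a nonexistent page comes back as a real 404 or as something else.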
# Now in my .htaccess :
SetEnvIf Request_URI "/SlurpConfirm404$" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
# Bye-bye Yahoo_Slurp
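One thing worth checking with a rule like this is which request paths the regular expression actually covers. The sketch below uses Python's `re` module as a stand-in for Apache's regex matching, which behaves the same way for patterns this simple. The `broad` pattern is a hypothetical alternative added for comparison, not part of the rule above, and the sample paths are ones mentioned on this page.

```python
import re

# The pattern from the SetEnvIf line: the "$" anchors it to the end of the URI.
strict = re.compile(r"/SlurpConfirm404$")
# Hypothetical alternative: anything that starts with /SlurpConfirm404.
broad = re.compile(r"^/SlurpConfirm404")

paths = [
    "/SlurpConfirm404",
    "/SlurpConfirm404.htm",
    "/SlurpConfirm404/baystars.htm",
]


def matches(pattern: re.Pattern, path: str) -> bool:
    """Return True if the pattern is found anywhere in the path."""
    return pattern.search(path) is not None


# Map each path to (matched by strict, matched by broad).
results = {path: (matches(strict, path), matches(broad, path)) for path in paths}

for path, (is_strict, is_broad) in results.items():
    print(f"{path}: strict={is_strict} broad={is_broad}")
```

Because of the trailing `$`, the original pattern only catches requests that end in exactly `SlurpConfirm404`, so the `.htm` variant and the longer paths with subdirectories would pass through untouched.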
Well, it does hundreds of hits with different nonexistent pages… Does it expect to get a different 404 page? Doesn’t make much sense to me…
The most popular I see are:
/SlurpConfirm404/robocopwebring/NonFramesHome.htm
/SlurpConfirm404/baystars.htm
/SlurpConfirm404/animalprints/Vacation_Sick_Time/clpa.htm
/SlurpConfirm404/islam/circuses.htm
Islam circuses? Are you sure it is !Yahoo?
I’ve just added an entry to my Fail2Ban filters to block any address that 404’s on SlurpConfirm404.
Yahoo’s web crawler ignores my robots.txt so I’m going to block that crap.
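For anyone curious what a filter like that has to match, here is a rough sketch of the matching logic in Python. It is not Fail2Ban configuration. It assumes access log lines in Apache's common or combined format, the function name `offending_ips` is made up for this example, and the sample lines are invented using documentation-only IP addresses.

```python
import re

# Assumed layout: ip ident user [timestamp] "METHOD path protocol" status size
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def offending_ips(lines):
    """Collect client IPs that got a 404 for a path containing SlurpConfirm404."""
    ips = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match["status"] == "404" and "SlurpConfirm404" in match["path"]:
            ips.add(match["ip"])
    return ips


# Invented sample lines, for illustration only.
sample = [
    '192.0.2.10 - - [13/Jan/2021:10:00:00 +0000] "GET /SlurpConfirm404/baystars.htm HTTP/1.1" 404 512',
    '192.0.2.20 - - [13/Jan/2021:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '192.0.2.30 - - [13/Jan/2021:10:00:09 +0000] "GET /missing.html HTTP/1.1" 404 512',
]

print(offending_ips(sample))
```

Only the first sample line is flagged: the second is a normal page view, and the third is an ordinary 404 that has nothing to do with SlurpConfirm404.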
Big Thanks, I was very concerned. Slurp sounds more like a slug than a web crawler. So I should instead be pleased that Yahoo is having a good look around my website.
Regards The Pink Bin Lady
Gah, so Yahoo likes doing what seems to be just another wave of spam traffic to my site.
The IP does belong to Yahoo (in my logs).
Thanks Yahoo, for screwing with my logs!
If this is true, why isn’t it just a single hit? Why do I have hundreds of attempts to obtain /slurpconfirm/randomwebpagehere.html?
Thanks for adding wasted effort to my server, yahoo.