Newsgroups : Alt : alt.internet.search-engines : 2006 May : Spidering home page - and nothing else

www.cryer.info
Managed Newsgroup Archive

Spidering home page - and nothing else

Subject:Spidering home page - and nothing else
Posted by:"Phil Payne" (ph..@isham-research.co.uk)
Date:15 May 2006 02:12:28 -0700

Going through this month's log I've found lots of search engine bot
visits that have downloaded robots.txt and index.html - and nothing
else.
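
For anyone wanting to check their own logs for the same pattern, here is a minimal sketch. It assumes the server writes Apache-style Combined Log Format, which the post does not state. The sample lines, the bot names (ExampleBot, OtherBot) and the function name shallow_agents are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed format (Combined Log Format):
# host ident user [time] "METHOD path proto" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Paths treated as "robots.txt or the home page"; adjust for your own site.
SHALLOW_PATHS = {"/robots.txt", "/", "/index.html"}


def shallow_agents(lines):
    """Return the user agents that requested only robots.txt or the home page."""
    paths_by_agent = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not parse
            paths_by_agent[match["agent"]].add(match["path"])
    return sorted(
        agent for agent, paths in paths_by_agent.items()
        if paths <= SHALLOW_PATHS
    )


# Hypothetical log lines, not taken from any real server.
SAMPLE = [
    '192.0.2.1 - - [15/May/2006:02:00:00 -0700] "GET /robots.txt HTTP/1.1" 200 120 "-" "ExampleBot/1.0"',
    '192.0.2.1 - - [15/May/2006:02:00:01 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.2 - - [15/May/2006:02:05:00 -0700] "GET /robots.txt HTTP/1.1" 200 120 "-" "OtherBot/2.0"',
    '192.0.2.2 - - [15/May/2006:02:05:02 -0700] "GET /pages/news.html HTTP/1.1" 200 2048 "-" "OtherBot/2.0"',
]

if __name__ == "__main__":
    for agent in shallow_agents(SAMPLE):
        print(agent)
```

Grouping by user-agent string is a simplification: the same bot may crawl from several addresses or change its agent string, so the result is a starting point for reading the log rather than a verdict.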

index.html hadn't changed for over a month.  I made some changes this
morning just to see what happens.

But I can't understand this behaviour.  The pages that have changed
haven't been touched.  I've seen only one HEAD request in two weeks,
and that, perversely, was Googlebot just after it had spidered the
page anyway.  The others just download the home page blindly, even
though it hasn't changed since the last time, and they ignore the
other pages that have.
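
One way to see how "blindly" each bot fetches the home page is to tally the request method and response status per user agent. A GET answered with 200 means the full page was sent, a 304 means the bot made a conditional request and was told nothing had changed, and a HEAD returns headers only. The sketch below again assumes Combined Log Format. The sample lines, bot names and the function name fetch_profile are hypothetical.

```python
import re
from collections import Counter

# Matches the request, status, size, referer and user-agent fields of a
# Combined Log Format line (assumed format, not confirmed by the post).
REQUEST = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def fetch_profile(lines, path="/index.html"):
    """Count (agent, method, status) combinations for requests to one path."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match["path"] == path:
            counts[(match["agent"], match["method"], match["status"])] += 1
    return counts


# Hypothetical log lines, not taken from any real server.
PROFILE_SAMPLE = [
    '192.0.2.1 - - [01/May/2006:02:00:00 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.1 - - [08/May/2006:02:00:00 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.2 - - [09/May/2006:03:00:00 -0700] "HEAD /index.html HTTP/1.1" 200 0 "-" "OtherBot/2.0"',
    '192.0.2.3 - - [10/May/2006:04:00:00 -0700] "GET /index.html HTTP/1.1" 304 0 "-" "ThirdBot/0.1"',
    '192.0.2.3 - - [10/May/2006:04:00:05 -0700] "GET /pages/news.html HTTP/1.1" 200 2048 "-" "ThirdBot/0.1"',
]

if __name__ == "__main__":
    for (agent, method, status), n in sorted(fetch_profile(PROFILE_SAMPLE).items()):
        print(f"{agent:15} {method:4} {status} x{n}")
```

A bot that shows only repeated GET/200 entries for an unchanged page is re-downloading it in full each time, which is the behaviour described above.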

Is it necessary to make a pointless change to the home page so that
MSN, Yahoo, Excite, Ask, et al, will crawl deeper?

I've just done that anyway, to see what happens.
