I totally rewrote one of my websites a few months ago, going with a more
dynamic design. All the code is now in a single ASP.NET page and the
page changes depending on query strings. I can control every aspect of
the pages from the query strings. While in the middle of development,
somehow either robots or link generators got hold of my site, as my
logs show siteurl?hop=somestringhere all over the place. At first I
did not think much of this, but now, somehow, the main home page of
the site is no longer in Google's index, and instead some odd ?hop
query page is all that Google will show.
Obviously I'd like to fix this. I'm really only interested in robots
indexing my main page. I know I can create a robots.txt file to stop
indexing of anything with a ? query string. But I'm not sure if I'd be
hurting myself by stopping robots from attempting to index every query
string combination that might point customers to me. For now, I've
opted to make the main page dynamically set the meta robots tag to
noindex,follow when the URL has a query string of any kind. I suspect
this will at least allow robots to follow and index my default page.
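To make the workaround concrete, here is a rough sketch of the rule,
written in Python for brevity rather than as the actual ASP.NET code.
The function name robots_meta_for and the example.com URLs are made up
for illustration. It returns a noindex,follow tag whenever the URL
carries any query string, and index,follow otherwise.

```python
from urllib.parse import urlsplit


def robots_meta_for(url: str) -> str:
    """Return the robots meta tag for a URL (hypothetical helper).

    Any non-empty query string yields noindex,follow; a clean URL
    yields index,follow.
    """
    has_query = bool(urlsplit(url).query)
    content = "noindex,follow" if has_query else "index,follow"
    return f'<meta name="robots" content="{content}">'


# Clean home page URL versus a URL with an unexpected ?hop= parameter.
print(robots_meta_for("http://example.com/"))
print(robots_meta_for("http://example.com/?hop=somestringhere"))
```

In this sketch a bare trailing ? with nothing after it counts as no
query string.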
Some questions.
- Does Google NOT count a link if the robots tag tells it not to index
the linked page?
- Is it common practice for webmasters to block indexing of pages with
unexpected query strings?
- Will Google apply a duplicate content penalty if a page is linked
with a query string, keeping in mind that anybody can link to
anybody's page with a query string, and often the page renders as the
exact same page?
- I suspect these ?hop= links are from ClickBank customers. Will they
help my Google ranking in any way?
Thank you for any help or information.