Load balancing is a process used in the Internet world to help "ease the burden" on the servers of highly-trafficked sites. The idea is that when a visitor clicks on a URL, he or she may be redirected to a ww1 or ww2 (identical) version of the website. As a surfer, you may have noticed this happening before and wondered why. So if you use load balancing, will it affect your search engine optimization efforts?
I recently started a thread on this subject at the Search Engine Watch Forums and have received some decent answers so far. Drawing on the research and knowledge (and responses) of one of our developers, I asked fellow SEWans whether Google Sitemaps might be the answer to the following problem with a load-balanced website:
...duplicate content that has been indexed. We have both ww1 and ww2 pages indexed, although mysteriously the ww2 has more pages in the index (almost all of them).
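For context, the Sitemaps idea is simply to hand Google a list containing only the canonical www URLs, so the ww1/ww2 mirrors never appear in it. Below is a minimal, hypothetical sketch in Python; the URLs and filename are invented for illustration, and the XML follows the current sitemaps.org schema (the original Google Sitemaps program used an earlier namespace). Keep in mind a Sitemap is a hint, not a directive, so on its own it may not push already-indexed ww2 pages out of the index.

    # Hypothetical sketch: write a sitemap listing only canonical www URLs.
    # The page list and output filename are invented for illustration.
    CANONICAL_PAGES = [
        "http://www.example.com/",
        "http://www.example.com/products.html",
        "http://www.example.com/contact.html",
    ]

    def write_sitemap(pages, path="sitemap.xml"):
        lines = ['<?xml version="1.0" encoding="UTF-8"?>',
                 '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
        for url in pages:
            # Only the www host is listed, never the ww1/ww2 mirrors.
            lines.append(f"  <url><loc>{url}</loc></url>")
        lines.append("</urlset>")
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(lines))

    write_sitemap(CANONICAL_PAGES)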
There have been some useful responses and solid suggestions so far, including using "a reverse-proxy with url rewrite" and also this one:
Is it a scripted site? (If so) Modify the script so that it checks the requested URL...If the URL is www then add nothing....If the URL is www1 or www2 then add the [meta name="robots" content="noindex"] tag to the page...Eventually you will only have one set of URLs indexed.
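To make that suggestion concrete, here is a minimal sketch of the idea as a small Python WSGI app. It is not the poster's actual code; the hostnames ww1.example.com and ww2.example.com and the page body are hypothetical stand-ins.

    from wsgiref.simple_server import make_server

    # Hypothetical mirror hostnames the load balancer hands out; anything
    # not listed here (i.e. the plain www host) is treated as canonical.
    MIRROR_HOSTS = {"ww1.example.com", "ww2.example.com"}

    def app(environ, start_response):
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()
        # On a mirror hostname, tell robots not to index this copy;
        # on the canonical www host, add nothing.
        robots = ('<meta name="robots" content="noindex">'
                  if host in MIRROR_HOSTS else "")
        body = ("<html><head>%s</head>"
                "<body>page content</body></html>" % robots).encode("utf-8")
        start_response("200 OK",
                       [("Content-Type", "text/html; charset=utf-8"),
                        ("Content-Length", str(len(body)))])
        return [body]

    if __name__ == "__main__":
        make_server("", 8000, app).serve_forever()  # local test only

The reverse-proxy suggestion amounts to the same canonicalization done one layer up: instead of serving a noindex copy, the proxy would answer requests for ww1 or ww2 with a 301 redirect to the www hostname, so only one set of URLs is ever crawled in the first place.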
I would love to get some more ideas from others who have dealt with this problem, especially anyone who has tried Sitemaps. Please join the discussion at the Search Engine Watch Forums, and feel free to link in the comments to any other discussion or article about using Sitemaps to address the SEO issues load balancing can cause. Another thread at SEW also discusses load balancing.