Representatives from major crawler-based search engines cover how to submit and feed them content, with plenty of Q&A time to address issues related to ranking well and being indexed.
Danny Sullivan, the conference Co-Chair, is moderating, with Peter Linsley of Ask.com, Evan Roseman of Google, Eytan Seidman of Microsoft and Sean Suchter of Yahoo! Search as panelists.
Eytan is up first for a short presentation. He talks about their Live Webmaster
Portal which includes features on how Microsoft will crawl your site. They
support site map submissions and you can also see statistics specific to your
web site.
They have multiple crawlers, with user agents that always begin with "MSNBot":
- web search
- news
- academic
- multimedia
Next he points out that they support the "NOCACHE" and "NOODP" tags.
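A minimal sketch of how such directives typically appear in a page's head, assuming the standard robots meta-tag form (the combination shown is illustrative, not Microsoft's exact documented wording):

```html
<head>
  <!-- ask the engine not to substitute the DMOZ/ODP description for this page -->
  <meta name="robots" content="noodp">
  <!-- ask the engine not to show a cached copy of this page -->
  <meta name="robots" content="nocache">
</head>
```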
Sean is up next for a short presentation on some updates to the Yahoo! crawler. One is dynamic URL rewriting via Site Explorer. Another is the "robots-nocontent" tag, which allows you to flag portions of a web page that the crawler should not treat as content. They have also implemented crawler load improvements (reduction and targeting); the new crawler has lower volume with better targeting.
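A minimal sketch of how this is typically marked up; the surrounding div and its text are illustrative:

```html
<!-- Yahoo!'s crawler is told to ignore this block when indexing the page -->
<div class="robots-nocontent">
  Repeated footer links, ads or other boilerplate not meant to count as page content.
</div>
```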
Evan is up next and to start things off, he highlights Webmaster Central and
explains some of its features. He suggests that you take advantage of it to
submit a site map so that Google can index all your content. He also points out
the Google Help Center in which they feature answers to some of the most common
questions.
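For context, the site map being discussed is the sitemaps.org-style XML file submitted through Webmaster Central; a minimal sketch with a placeholder URL and date:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <!-- placeholder values -->
    <lastmod>2007-01-01</lastmod>
    <changefreq>daily</changefreq>
  </url>
</urlset>
```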
Finally, Peter is up. He talks about catering to the search engine robot, since in catering to the actual human visitor, the robot is often forgotten. Some problems include requiring cookies. He points out that Ask does accept site map submissions, but that they would rather be able to crawl sites naturally.
Peter uses the Adobe site to demonstrate some issues that they may have with
multiple domains and duplicate content. He then uses the Mormon.org site and shows that they are disallowing crawlers from the root page, which creates problems with crawling.
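For illustration only (this is not the site's actual file), the kind of robots.txt rule that keeps crawlers off a site's entry page looks like this, and it also keeps engines from discovering anything linked from it:

```
# Hypothetical example -- blocks the root path (and everything beneath it)
# for all crawlers
User-agent: *
Disallow: /
```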
Now begins the Q&A portion of the session.
Q: First question is for the Google rep. Wants to know whether they will
allow users to see supplemental results within Webmaster Central now that they
are no longer tagging them in search results.
A: Evan stated that being in supplemental is not a penalty but did not provide a definite answer as to whether they would allow users to discover whether or not results are supplemental.
Danny interjects that all engines have a two-tier system and Eytan, Sean and
Peter confirmed that. So... they all have supplemental indices but people only
seem to be concerned with Google's, most likely because they used to identify
them as such in the regular search results.
Q: What, if anything, can a competitor actually do to hurt your site?
A: Evan says there is a possibility that a competitor could hurt your site, but that it is extremely difficult. Hacking and domain hijacking are some of the things that can occur.
Q: Question relates to the scenario where you re-publish content to places such as eBay but the sites you re-publish to rank better than the original. How can a webmaster identify the original source of the information?
A: Peter answers that one could try to get the places where content is republished to use robots.txt to block spidering of that content. Another thing to do is have a link back to the original site. However, on a site such as eBay, that is not always possible. The response to that is to create unique content for the sites this person is re-publishing on.
Q: Robert Carlton asks if all engines are moving towards having things like
Webmaster Centrals. Also asks how they treat 404s and 410s.
A: As for 404s and 410s, Ask, Google and Yahoo! treat them the same. Robert points out that they should treat them differently, as a 410 indicates the file is permanently gone whereas a 404 simply means it was not found.
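For anyone who wants to send the distinction anyway, Apache's mod_alias can return a 410 explicitly rather than letting a removed page fall through to a generic 404 (the path below is hypothetical):

```
# Return "410 Gone" instead of "404 Not Found" for a permanently removed page
Redirect gone /old-page.html
```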
Q: Question regarding getting content crawled more frequently.
A: Evan suggests using the Site Map feature in Webmaster Central and keeping it up to date. He also suggests promoting the content by placing a link to it on the home page of the site.
Q: How can one use site maps more effectively for very large sites that have information changing on a regular basis? Also inquired how to get more pages indexed when only a portion are being indexed.
A: Submitting a site map with Google is not going to cause other URLs to not be crawled. Evan also points out that they are not going to be able to crawl and include ALL the pages that are out there. Again he suggests that webmasters promote pages, such as by linking to them from the home page. However, when dealing with hundreds of thousands of pages, that is not always feasible.
Q: How do engines interpret things like AJAX, JavaScript, etc.?
A: Eytan answered that if a webmaster wants content interpreted, they are going to have to represent it in a format the engine can understand; AJAX and JavaScript are currently not among them.
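One common way to do that, sketched below with a hypothetical loadSection() function, is to expose the same content through an ordinary crawlable link so engines that don't execute JavaScript can still reach it:

```html
<!-- The href gives crawlers a plain HTML version of the section;
     JavaScript-capable browsers load it in place instead -->
<a href="/products/widgets.html" onclick="loadSection('widgets'); return false;">
  Widgets
</a>
```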
Q: Question regarding rankings in Yahoo! disappearing for three weeks but then coming back. Is this due to an update?
A: Sean answers that it certainly could be and suggests using Site
Explorer to see if there is some kind of issue.
Q: How many links will engines actually crawl per page? How much is too
much?
A: Peter says there is no hard and fast rule but keep the end user in
mind. Evan echoes the same feeling.
Q: Do the engines use meta descriptions?
A: All engines read them and may use them if the algorithm feels they are relevant.
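For reference, the tag in question is the standard description meta tag (the wording here is a placeholder):

```html
<meta name="description" content="Placeholder summary of the page that an engine may show as the result snippet.">
```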
Q: For sites that are designed completely in Flash, can you use content
in a "noscript" tag or would that be considered as some type of cloaking?
A: Sean said IP delivery is a no-no, but if the content is the same as the Flash, he'd rather see content in noscript than traditional cloaking. Evan suggests not building sites completely in Flash but rather using Flash components.
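A rough sketch of the noscript approach being discussed, assuming the Flash movie is inserted by a script (file and script names are placeholders); the key point is that the fallback mirrors what the Flash actually shows:

```html
<script src="embed-intro-flash.js"></script>
<noscript>
  <!-- same headline and copy that the Flash movie displays, in plain HTML -->
  <h1>Welcome to Example Widgets</h1>
  <p>Plain-text version of the content inside the Flash movie.</p>
</noscript>
```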
Q: Is the meta keywords tag still relevant?
A: Microsoft - no, Yahoo! - not really, Google - not really, and Ask -
not really. All read it, but it has very little bearing. For a really obscure keyword that appears only in the keyword tag and nowhere else on the web, Yahoo! and Ask are the only ones that will show a search result based on it.
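For completeness, the tag being discussed looks like this (the terms are placeholders):

```html
<meta name="keywords" content="obscure-made-up-term, another-phrase">
```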
Q: How do engines view automated submission/ranking software?
A: Evan - don't use them.
I asked Peter Linsley a question after the session regarding whether Ask is working to make their index fresher. In other words, are they working to re-index content as fast as the other engines do, as it typically takes 6 months or more for changes made to pages to show up in the Ask index.
He said they are working on it but could not give me any definite timeframe as to when that might be rolled out.
I also asked if they prioritize sites such as CNN or Amazon, in that changes to those sites are updated in the index more frequently than for a mom-and-pop brochure-type site, and he confirmed that was true.
David Wallace - CEO and Founder SearchRank