Louis Monier: Past, Present & Future Of Search - Cuill Vice President of Products Monier (formerly with Google, eBay and AltaVista) will describe efforts at Cuill to set new standards in search, in addition to providing perspective on developments in the field.
The primary thing is that people need to find information on the web. Let's look at the past. All human activity linked to one network: that has never been achieved before. This is on par with writing, a total turning point.
The Internet at the time went through the first phase that every new technology has to go through: rejection. People were writing dismissive, anti-Internet pieces.
"The Web is only as good as its index." At the time, something was needed to navigate the web. Only a full text search will do. Search was so fundamental to the web that nobody rejected it. There was such a need for it. Query, relevance, keywords = terms that became daily words.
The first search engines were slow and covered a tiny part of the web. In 1995, AltaVista launched. It was good because it had a whopping 16 million documents and was able to respond within a fraction of a second. It had features that nobody had dreamed of: you could find the people linking to your homepage. This was the feature that essentially started search marketing. It was the first modern search engine by many people's definition, and within months it became the strongest search brand on the web. It dominated for a while.
But around that time, other brands showed more pages. Because a SERP is much more expensive to produce than a news page, search engines tried to keep people on the site as long as possible. However, this was incompatible with the mission of a search engine, which is to send you someplace else. In '98, spam and aggressive marketing almost killed web search. Most queries would return page after page of results irrelevant to the query. Search engines were very naive at the time; they didn't know how to deal with these crappy results.
One university experiment had a new idea for relevance: link analysis. That's what became Google. Link analysis resisted keyword stuffing. This edge made people switch to Google. The mix of this secret weapon and discreet, targeted text ads is Google. More power to them.
For several years, Google alone was working on search and got a head start. Today, there's a big product space: one dominant player and a few out-of-breath competitors.
For some queries, there's a best answer. It's so easy that there's one button (I'm Feeling Lucky). But for other queries, there is no single best answer. For most queries (the long tail), answering has not progressed in the last 10-12 years. What are the obstacles? Phrasing the proper query is often one.
Now what? We can use help in deciding what to do next. We often need help figuring out what we mean; sometimes we don't even know what we need. The analogy is e-commerce sites: the first result may not satisfy your query. What's the web equivalent? From a certain point of view, it's still 1995; we're still using the engines the same way. Besides the massive capital investment, we saw the same 3 ideas:
1. Vintage-1960s information retrieval. Keyword density may work in this area. (Do a search for "cool stuff" and count how many occurrences of those words each document contains.)
2. Pay attention to links and anchors. This was made famous by PageRank.
3. Put your users to work: which result gets the most clicks? Move that one up.
There are also user interface issues: basically 10 links. This is the search engine of 10 years ago.
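The link-analysis idea can be sketched as a toy PageRank iteration. This is a minimal illustration, not AltaVista's or Google's actual implementation; the tiny link graph and damping factor below are invented assumptions.

```python
# Toy PageRank: rank pages by the rank of their inbound links.
# The link graph and damping factor are made-up for illustration.

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        # Each page distributes its rank evenly across its outlinks.
        for page, outgoing in links.items():
            if not outgoing:
                continue
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank

# "home" receives links from every other page, so it ends up on top;
# keyword stuffing inside a page can't change its inbound links.
graph = {
    "home": ["about"],
    "about": ["home"],
    "blog": ["home"],
    "spam": ["home"],
}
ranks = pagerank(graph)
```

The point of the sketch is the resistance to keyword stuffing the talk mentions: rank flows along links between pages, so nothing a spammer writes inside their own page changes the score.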
What exactly has changed? What's new? Mostly it's scale. We're talking billions, not millions, of pages now. The web also moves much faster now.
Does size matter? Search engines can have all the pages or just the best pages. Size clearly shouldn't matter: you already get too many results. But think users; think search marketing. Access to a few well-known websites should be enough, as long as it covers all your interests. To be fair, you should also consider the interests of your friends, and their friends, and so on. So yes, size matters: it is impossible to predict in advance what will be interesting to you and everyone else.
There is some good content off the beaten path that search engines don't show you. How do you know about those pages? It's an interesting thing to think about.
I don't want a list of just the most popular pages. Search, for me, is not about looking for my keys. It's about research, exploring, and finding what I need, with insight.
The future: will we get this? There are many contenders, in no particular order:
- Human-powered search. We have directories, and they do a great job: high-quality content, but the coverage is tiny. It's not scalable.
- Personalization. Track users and we can fill in the blanks. The inescapable example is diamond: diamond (jewelry) vs. a baseball diamond. It will only achieve so much, though. If I search for a present for my wife, they might get the wrong idea.
- Social search: my friends already have the answer. I don't see how my friends' search history will contribute to my question. How many people would it take to intercept my query? Too many. I don't think it's a very serious contender.
- Vertical search: a specialized search engine can take a narrow slice of interest and do better than a broad search engine. But nobody wants to manage 10,000 bookmarks, which is the challenge of vertical search. We need a search engine for bookmarks! For things we really care about, people will probably go to a specialized store, but sometimes you go to the supermarket because one place has it all. The same reasoning applies to search: one size fits all typically wins because people don't want to manage this whole ocean of knowledge. This will limit the reach of vertical sites.
- Natural language processing: search engines should try to understand as much as possible in documents and queries. The problem is, how much good language is there on the web? Not enough! There is also the ambiguity wall, e.g. Java (not coffee, not the island, but the programming language!).
- Semantic search and the Semantic Web: the idea is that webmasters will say exactly what things mean. But what's their motivation to put in this tremendous amount of effort?
- Artificial intelligence...? I think it's hard to go from the Flintstones to the Jetsons. We need incremental progress; that's more realistic.
Back in the early days of AltaVista, a friend surprised me with a new use for a search engine: using it as a spell checker. That's a definition of research! (e.g. blogology vs. bloggology; the latter has fewer matches.) This was the first time I realized that search engines can be used for more than just navigation. How many results do you get for this query? That by itself was informative.
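The spell-checker trick can be sketched as a simple hit-count comparison, the way one might compare result counts for two spellings. The corpus and candidate spellings below are invented for illustration; a real engine would be counting matching documents, not list entries.

```python
# Sketch: pick the likelier spelling by comparing match counts,
# mimicking the "fewer search results = probable misspelling" trick.
# The corpus and candidates are made-up for illustration.

def pick_spelling(candidates, corpus):
    """Return the candidate with the most occurrences in the corpus."""
    counts = {word: corpus.count(word) for word in candidates}
    return max(counts, key=counts.get)

# A toy "web": the common spelling vastly outnumbers the rare one.
corpus = ["blogology"] * 40 + ["bloggology"] * 3
best = pick_spelling(["blogology", "bloggology"], corpus)
```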
At AltaVista, we extracted interesting words from the results and showed them to you in an interesting way. That's how you can surface, and learn, terms related to your query. You can use aggregate information from the results. Surprisingly, this has worked very well in other domains too.
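The result-mining idea above can be sketched as term-frequency aggregation over result snippets: count the words that recur across results but aren't in the query. The snippets and stopword list below are invented for illustration, not AltaVista's actual method.

```python
# Sketch: surface query-related terms by aggregating word counts
# across result snippets. Snippets and stopwords are invented.
from collections import Counter

STOPWORDS = {"the", "a", "of", "for", "and", "to", "in", "is"}

def related_terms(query, snippets, top_n=3):
    """Return the most frequent non-query, non-stopword terms."""
    query_words = set(query.lower().split())
    counts = Counter()
    for snippet in snippets:
        for word in snippet.lower().split():
            if word not in STOPWORDS and word not in query_words:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]

# Terms recurring across results ("big", "cat") rise to the top.
snippets = [
    "jaguar speed in the wild savanna",
    "the jaguar is a big cat of the americas",
    "big cat conservation and jaguar habitat",
]
terms = related_terms("jaguar", snippets)
```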
Research assistant: surfacing different opinions on a query. This is a lot more than navigation; it's analyzing what's on the web.
Conclusion: Search engines are the only game in town - the only way to discover things on the web. Size matters.