In my interview with Danny Sullivan, the Google Search Liaison, I tried to get more details on Navboost and how it plays a role in core updates and general rankings. I was pretty much shot down (I am a good loser). That being said, here are some old posts from ex-Googlers (from what I can tell) on the topic of Navboost from the Hacker News forums.
This first quote is from Greg (gregw134) on Hacker News, a quote you probably have seen before:
Ex-Google search engineer here (2019-2023). I know a lot of the veteran engineers were upset when Ben Gomes got shunted off. Probably the bigger change, from what I've heard, was losing Amit Singhal who led Search until 2016. Amit fought against creeping complexity. There is a semi-famous internal document he wrote where he argued against the other search leads that Google should use less machine-learning, or at least contain it as much as possible, so that ranking stays debuggable and understandable by human search engineers. My impression is that since he left complexity exploded, with every team launching as many deep learning projects as they can (just like every other large tech company has). The problem though, is the older systems had obvious problems, while the newer systems have hidden bugs and conceptual issues which often don't show up in the metrics, and which compound over time as more complexity is layered on. For example: I found an off by 1 error deep in a formula from an old launch that has been reordering top results for 15% of queries since 2015. I handed it off when I left but have no idea whether anyone actually fixed it or not. I wrote up all of the search bugs I was aware of in an internal document called "second page navboost", so if anyone working on search at Google reads this and needs a launch go check it out.
The second I don't think you've seen before. It is from Kevin Lacker, now the CTO at Parse, who was a Google Search engineer in Google's Search Quality department, developing search algorithms from January 2005 to November 2009. He posted this 9 months ago on Hacker News:
Bah, kids these days. It's not revisionism. I worked on search quality during the hand-coded heuristic era, 2005-2009. I spent some of that time working on the "navboost" team which used click data to alter the search results. Even by 2007, click data was quite valuable, arguably the single most important component of the algorithm. The algorithm was hand-coded, sure. You don't need machine learning in order to use click data. You just need a few people to have searched for that particular query before. When someone clicks on a result and stays there a while, it's a "long click" and you boost that search result for that query.
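To make the "long click" idea in that quote a bit more concrete, here is a minimal Python sketch of what a hand-coded click boost could look like. This is not Google's code and not a reconstruction of Navboost. The 30-second threshold, the boost weight, and every function and variable name below are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical values, chosen only for illustration.
LONG_CLICK_SECONDS = 30.0    # dwell time that counts as a "long click"
BOOST_PER_LONG_CLICK = 0.1   # score added per long click on a (query, url) pair


def count_long_clicks(click_log):
    """Count long clicks per (query, url) from (query, url, dwell_seconds) records."""
    counts = defaultdict(int)
    for query, url, dwell_seconds in click_log:
        if dwell_seconds >= LONG_CLICK_SECONDS:
            counts[(query, url)] += 1
    return counts


def rerank(query, base_scores, long_clicks):
    """Return URLs ordered by base score plus a per-query long-click boost."""
    boosted = {
        url: score + BOOST_PER_LONG_CLICK * long_clicks.get((query, url), 0)
        for url, score in base_scores.items()
    }
    return sorted(boosted, key=boosted.get, reverse=True)


# Toy data: b.example gets two long clicks for this query, a.example gets a short one.
click_log = [
    ("navboost", "a.example", 5.0),
    ("navboost", "b.example", 60.0),
    ("navboost", "b.example", 45.0),
]
base_scores = {"a.example": 1.0, "b.example": 0.9}

long_clicks = count_long_clicks(click_log)
print(rerank("navboost", base_scores, long_clicks))
```

In this toy data, the two long clicks are enough to lift b.example above a.example for that one query, while any other query keeps the base order. That matches the point in the quote: no machine learning is needed, only a few earlier searchers for the same query.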
Then I found these interesting references from people on Hacker News, posted years before the DOJ documents on Navboost were revealed. I am not sure if these are from Googlers or not.
Google doesn't use page rank anymore. They have a signal called navboost which is the strongest signal in ranking and retrieval. And that idea originates from Yahoo paper in which they revealed the gems of search engine algorithms (PDF here)
That PDF is a Yahoo patent.
Sundar's claim to fame is Google Toolbar for MSIE. It was a big deal at that time. It locked Google search on MSIE and ultimately enable navboost in search quality. Then he was the product lead on the Chrome team. I don't know if it was him who lobbied Page to create Chrome.
quantumofalpha on Hacker News:
wat? Google used clicks ("navboost") since approximately forever, it's one of strongest signals for all major web search engines. I guess the better question is why they don't optimize ranking directly on clicks, why still bother with all that expensive human raters business... well optimizing for clicks helps but only until a certain point beyond which it starts hurting relevancy by overpromoting old popular results and clickbait. But as an ingredient in overall ranking in one way or another they and everyone else definitely use click data.
We recorded a video on this topic this morning, and we kinda go off on Google about it at the 2:53 mark:
Take what you want from these quotes but I did find them interesting.
Forum discussion at Hacker News.
Update: Nice tidbit here:
This is the edition I took the quote from. It's a super interesting book, it also talks about Google spam detection algorithms, how Google started doing NLP to extract facts from webpages, etc: https://t.co/YRDTr9ckgB
— Juan González Villa (@seostrategaEN) September 13, 2024