Google has posted a new video on how BERT helps Google Search understand language. Google has been using BERT in Search since 2018; we only learned about it in 2019. In short, the video says BERT is about Google understanding the little words in your queries better.
Here is the video:
Here is the transcript if you don't want to listen:
If a pancake recipe told you to "mix the batter with the banana," you probably wouldn't think to use the banana as a mixing spoon. But what's obvious to humans — things like context, tone, and intention — are actually very difficult for computers to pick up on. At its core, a Google Search is about understanding language. In order to return the right information, Google doesn't just need to know the definition of the words... it needs to know what they all mean when strung together in a specific order. And that includes the smaller words like "for" and "to." And when you think about how many different meanings a single word can have... you start to see how writing a computer program that takes all these nuances into account is pretty tough. See? Case in point. "Pretty" here doesn't mean beautiful, it means "very."

More and more, people talk to Google the way they think and speak. And, more and more — Google is getting better at understanding what they mean. One of the biggest leaps forward in the history of Search came about with the introduction of "Bidirectional Encoder Representations from Transformers" or as we like to call it, BERT. BERT is a machine-learning model architecture that helps Google process language and understand the context in which it appears.

Search used to process a query by pulling out the words it thought were most important. For example, if you said, "can you get medicine for someone pharmacy" you would have gotten general results about pharmacies and prescriptions because it would have essentially ignored the word "for." But with BERT, the LITTLE words are taken into account and it changes things. Search now understands you want to know if you can pick up medicine... prescribed to someone else.

But how do you train a language model to pick up context? There's a big difference between knowing words and understanding meaning. The model learns context by applying the same fill-in-the-blank principles it takes to complete a Mad Libs. So we take a phrase. We hide about 20% of the input words. And then we make the computer guess the words that are missing. Over time, the model begins to understand different words have different meanings depending on what's around them. And the order in which they appear in that text really matters. So when you search something complex like, "Fly fishing bait to use for trout in september montana" Search knows all the little words are important and because it now takes them all into account, Google can tell you the perfect bait for that time of year.

BERT isn't foolproof, but since implementing it in 2019, it's improved a lot of searches. We should always be able to learn about whatever we're curious about. And that's why Search will always be working to understand exactly what you're truly asking.
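The fill-in-the-blank training the video describes is what the original BERT paper calls masked language modeling; the paper masks roughly 15% of input tokens, which the video rounds to about 20%. Here is a minimal sketch of that idea using the open-source bert-base-uncased checkpoint and the Hugging Face transformers fill-mask pipeline. These are illustrative choices on my part, not the model or tooling Google actually runs in Search:

```python
# Minimal sketch of BERT's fill-in-the-blank (masked language modeling) idea,
# using the open-source bert-base-uncased checkpoint via Hugging Face transformers.
# Illustrative only; this is not Google's production Search model.
from transformers import pipeline

# The fill-mask pipeline hides a token ([MASK]) and asks BERT to guess it
# from the words on both sides of the blank (that is the "bidirectional" part).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

predictions = fill_mask("If a pancake recipe tells you to mix the batter with a [MASK].")

# Each prediction includes the guessed word and a confidence score.
for p in predictions:
    print(f"{p['token_str']:>10}  score={p['score']:.3f}")
```

During pretraining this guessing game is run over huge amounts of text with many tokens hidden at once; the snippet above just shows the single-blank version of the exercise at inference time.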
Good video from Google about BERT: "The little words are taken into account, and that changes things." And... "There's a big difference between knowing words and understanding meaning." https://t.co/boqZaUlYht
— Glenn Gabe (@glenngabe) February 17, 2022
I am surprised Google did not release this video when we wrote about how Google uses AI in search.
Forum discussion at Twitter.