Sean Brattin asked Google's John Mueller a couple of months ago if Google uses Cloud Vision AI within Google Image Search. The answer was no, but John did say that Google has talked "about doing that in the past" but ended up not doing it.
This came up in an SEO hangout video back in February, at the 7:16 mark. Sean Brattin asked, "So are you able to comment on this? I imagine that Cloud Vision has something to do with that, trying to match similarities with machine learning to the entities. Am I on the right track here?"
John Mueller responded "I don't know how far we would use something like that. I do think that, at least as far as I understand, we've talked about doing that in the past, specifically for image search. But it's something where, just purely based on the contents of the image alone, it's sometimes really hard to determine how the relevance should be for a specific query."
John then gave this example, "So for example, you might have, I don't know, a picture of a beach. And we could recognize, oh, it's a beach. There's water here. Things like that. But if someone is searching for a hotel, is a picture of the beach the relevant thing to show? Or is that, I don't know, a couple of miles away from the hotel? It's really hard to judge just based on the content of the image alone."
John added that Google does use techniques to understand images, but likely not Cloud Vision AI. John said, "so I imagine if or when we do use kind of machine learning to understand the contents of an image, it's something auxiliary to the other factors that we have. So it's not that it would completely override everything else."
Forum discussion at YouTube Community.