The other day Gianluca Fiorelli posted on Twitter that he felt Google was taking content from third-party sites and placing it in Google Search knowledge panels while crediting itself, Google, as the author instead of the site it came from. Well, Google said that is not true, that it is the other way around.
Google said that the content was written by Google employees and that the sites you thought wrote it originally actually stole it from Google and put it on their own sites. I did dig into some examples and noticed that many of those sites didn't have that content until recently, a few months ago, whereas Google likely had it in its knowledge panels for years. It is very hard to prove either way, but Google says it has been writing its own knowledge panel descriptions since at least 2015, so it is more likely that these sites stole the content from Google, word for word, than that Google stole the content from those third-party sites without citing them.
Here are the allegations from Gianluca Fiorelli posted on Twitter:
... and discovered how practically all of them are copy-paste of Italian websites.
— Gianluca Fiorelli (@gfiorelli1) April 26, 2022
I repeated the experiment, and I can confirm the same result.
If Google is using AI, it is trained for stealing the content that better fits the K. Panel design.
Follow to see a few examples.
Piemonte. The content is 100% copied from this page: https://t.co/4usIWl10Xg pic.twitter.com/PHfFNIxvFW
— Gianluca Fiorelli (@gfiorelli1) April 26, 2022
So... Is Google the author of these Knowledge Panel texts? Clearly, it does not seem so!
— Gianluca Fiorelli (@gfiorelli1) April 26, 2022
Is it using AI? Hard to say. If it uses it, is for understanding which all the indexed content fits better in the KG Panel.
Is the most authoritative? Sincerely not. Correct, but not really..
There is a bit of a gap in the middle of the thread as shown here, but you can click on the first tweet to read all of them.
Then Danny Sullivan of Google replies:
This is why in 2020, we added the - Google credit to indicate we authored the description. See also: https://t.co/55xDApET0S
— Danny Sullivan (@dannysullivan) April 26, 2022
He then went further in follow-up replies:
The first line of the description in that YouTube video matches what you'll find in this news article from the same year: https://t.co/2DEEum6zHe
— Danny Sullivan (@dannysullivan) April 26, 2022
Did they also pick it up from the same YouTube video? It doesn't make a lot of sense...
I'll see if I can track down when we first stared authoring descriptions, but it was likely before 2018. It was only in 2018 that we got asked about it (as I said in my earlier reply).
— Danny Sullivan (@dannysullivan) April 26, 2022
Following up, we were authoring descriptions from at least 2015. In 2016, we had a blog post that talked about some of this work with destinations; you can see one of our descriptions in an embedded YouTube video in that post (34 seconds in): https://t.co/aVBjCAwhos
— Danny Sullivan (@dannysullivan) April 28, 2022
Even before Danny replied to any of this, Glenn Gabe also felt that Google was being copied from, not the other way around:
I'll check again, but I think at least one of those pages hasn't existed for very long. I didn't have time to dig in too much, but worth noting.
— Glenn Gabe (@glenngabe) April 26, 2022
Brodie Clark as well:
This has come up in the past. When investigating, I found that it was the opposite (other sites taking Google’s content). Though it can be tricky to track down the original source when this started happening 4+ years ago.. A more detailed explanation from Google would be helpful. https://t.co/TQ0vlRbQ6q
— Brodie Clark (@brodieseo) April 27, 2022
In any event, this is why we have the Wayback Machine. The only issue is that it is super hard to know when Google first posted these descriptions in the search results. So you can believe whomever you want. I do trust Danny Sullivan; he would 100% not lie to me, despite what you all think. Google links out to sources most of the time, so I don't see why Google would steal the content rather than link to its source. But maybe I am naive?
Forum discussion at Twitter.