Google's John Mueller said on Twitter that hard-to-understand, overly complex URL structures can sometimes lead to pages being removed from the Google search index. Specifically, he pointed to "many URLs leading to the same content, making our systems assume that a part of the URL is irrelevant."
The complaint was that an e-commerce site recently had over 50,000 pages removed from the Google index. After a bit of digging around, Mueller responded that the site's URL structure may be to blame.
Here is the chain of tweets:
Does the site have product variations at their own URL? Could be a page variation that’s considered to be more dominant than the actual canonical, even if it’s canonicalized?
— Kyle Faber (@regal_kyle) December 27, 2018
I've seen that happen when a site has a hard-to-understand URL structure, with many URLs leading to the same content, making our systems assume that a part of the URL is irrelevant.
— 🍌 John 🍌 (@JohnMu) December 28, 2018
Keep your site simple.
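If you want to check whether your own site has "many URLs leading to the same content," one rough approach is to group crawled URLs by a hash of their page body. The Python sketch below is only an illustration for site owners and says nothing about how Google's systems actually work. The function name `group_urls_by_content`, the `pages` dictionary, and the example.com URLs are all hypothetical, and the comparison only catches byte-for-byte identical bodies, so near-duplicates would need a fuzzier check.

```python
import hashlib
from collections import defaultdict


def group_urls_by_content(pages):
    """Group URLs whose page bodies are exactly identical.

    `pages` maps URL -> body text (hypothetical crawl output).
    Returns only the groups that contain more than one URL.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        # Identical bodies produce identical digests.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical example data, not taken from the site discussed above.
pages = {
    "https://example.com/shirt?color=red": "<h1>Shirt</h1>",
    "https://example.com/shirt?color=blue": "<h1>Shirt</h1>",
    "https://example.com/shirt": "<h1>Shirt</h1>",
    "https://example.com/hat": "<h1>Hat</h1>",
}

duplicates = group_urls_by_content(pages)
for urls in duplicates:
    print(f"{len(urls)} URLs serve the same content:")
    for url in urls:
        print("  " + url)
```

In this made-up data, the three shirt URLs land in one group because the `color` parameter does not change the page. That is the kind of pattern where part of the URL could look irrelevant.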
Forum discussion at Twitter.