John Mueller of Google posted a thoughtful and useful response on Reddit, mostly about where to start with hreflang. "It's easy to dig into endless pits of complexity with hreflang," he said. "My recommendation would be first to limit the number of pages you create to those that are absolutely critical & valuable," he added.
Once you have gone through that thought process, he said to "focus first on pages where you're seeing wrong-language traffic."
So it does not always make sense to deploy hreflang everywhere just because that is easier than thinking it through. "Sometimes the balance between 'save effort by thinking' and 'just do it everywhere' is not that straightforward to determine," John Mueller said.
Here is his full response:
You definitely shouldn't block / disallow these in robots.txt -- if they're disallowed from crawling, we wouldn't be able to canonicalize them at all, or see any of the metadata on them.
It's easy to dig into endless pits of complexity with hreflang. "Let's create all languages! Let's make pages for all countries! What if someone in Japan wants to read it in Swahili? Let's make even more pages!" My guess is most of these "pages created because you can" get very little traffic, add very little value, and they add a significant overhead (crawling, indexing, canonicalization, ranking, maintenance, hreflang, structured data, etc.).
My recommendation would be first to limit the number of pages you create to those that are absolutely critical & valuable -- maybe that already cuts the pages you're thinking about. Think big here; if you're talking about individual pages within a medium-sized site, it's probably a non-issue. On the other hand, if you're considering copying your whole site into 20 languages x 10 countries, that's something else.
Past that, for hreflang, I'd focus first on pages where you're seeing wrong-language traffic -- often these are pages that get a lot of global, branded queries, where it's hard to determine which language content they want. A search for "google" can match a lot of language pages, hreflang can help to differentiate. On the other hand, a search for "search engine" is pretty clear & matches pages where you write about "search engine" already, so pages like that don't need as much help being language-targeted. That said, sometimes the balance between "save effort by thinking" and "just do it everywhere" is not that straightforward to determine :).
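To illustrate Mueller's advice about starting with a few critical pages, here is a minimal Python sketch of what the annotations for one such page might look like. It is not from his response: the `example.com` URLs and the helper name `hreflang_links` are hypothetical, and the sketch only covers the HTML `<link rel="alternate">` form of hreflang.

```python
from html import escape


def hreflang_links(variants, default_url=None):
    """Build <link rel="alternate"> elements for one page's language variants.

    variants    -- dict mapping a language code (e.g. "en") to that variant's URL
    default_url -- optional fallback URL, emitted as hreflang="x-default"
    """
    links = []
    # Sort by language code so the output is stable between runs.
    for lang, url in sorted(variants.items()):
        links.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        )
    if default_url:
        links.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(default_url)}" />'
        )
    return links


# Hypothetical page that exists in English and German only.
variants = {
    "en": "https://example.com/en/",
    "de": "https://example.com/de/",
}
for line in hreflang_links(variants, default_url="https://example.com/"):
    print(line)
```

The same set of elements would go in the `<head>` of every variant, since each language version is expected to list itself and all of its alternates. Keeping the `variants` mapping small mirrors the advice above: annotate the pages that actually receive wrong-language traffic before annotating everything.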
Forum discussion at Reddit.