I found it super interesting to listen to a dialog between a new German startup and Google's John Mueller. The conversation leads John Mueller to explain that new sites can use internal linking to funnel Google through their most trusted, highest-quality pages, earn trust on those pages first, and then get Google to crawl more and more of the site over time as that trust is earned.
In short, the German startup created a site that ranks realtors based on how far above value they close properties and how quickly they close them. They do this by mashing up data they find on the web in a different way than other sites do. To me, that sounds like a super useful site, but to John, it did not. John basically said that taking data from across the web and mashing it up differently isn't, on its own, enough for Google to consider the site high quality.
John said at the end of his explanation, "you have to create a more funneled web structure in the beginning and then over time as you see that Google is doing more and more on your website then expand that and expand that step by step until you've reached kind of your end situation where Google is actually actively indexing all of the content on your website."
The question was asked by Omar Dahroug at the 18:25 mark in the video. The exchange goes on for several minutes, so you can read the full transcript below:
Here is the lead question:
We are a very young startup that just started here in Germany. Basically what we do is crawl leading real estate platforms and try to assess which realtor is actually achieving higher prices and which realtor is achieving faster sales durations. For each postal code in Germany we have one specific page, which works out to about 28,000 pages, which is not that huge compared to other websites. But we're facing challenges in indexing those pages. Our goal is that someone typing in [realtor munich] will see our page with the ranking of the best agents in that region pop up. But currently we see that 90 percent of our pages are excluded from the index. I've read a lot about it, and you also talk about how the content might be similar or how the content might not be super interesting for users. My question to you is how to solve this, how to get ranked when someone types in, hey, I want to search for a realtor in that location, because it's super crucial for him but the bots might not understand that.
Here is John's answer:
Yeah. I think there are a few things that come into play. On the one hand, from the quality side, that's something that I think you almost need to figure out before you push too far. In the sense that it's sometimes easy to create these back ends that analyze a lot of data and spit out some metrics for individual locations, and maybe make some lists based on existing things. But you really need to make sure that these are also useful pages for people, so it's not just a recompilation of data that's already out there. I don't know, for example, if you take a city and they have 10 realtors, and it's essentially the same 10 that are in the phone book, then it's hard to say that your compilation is providing something of unique value. So really making sure that the content that you're publishing is something that is really high quality and useful, I think that's almost the first step, because that kind of helps to grow your website over time.
Question:
But how would you indicate that? Sorry to interrupt you, I mean, how do you... because we are pretty sure that this is pretty unique content all over Germany, it might be unique all over the world. But we have a hard time knowing...
Answer:
Yeah, I think that's something where you almost need to do something like user studies to figure out what the best UX for these kinds of pages is, what kind of information people need, and what ways you can provide assurance that the content is correct, that you're not skewed by, I don't know, payments or whatever, that it's really trustworthy content. From our point of view that's almost a requirement for the next steps, because it is possible to get a lot of pages indexed using different ways. But if we recognize that you have a large website and we think that the content overall is low quality, then you almost have a bigger problem in telling Google, we improved the quality of our pages. Whereas if you start off with really high quality content, then it's more a matter of the challenge of, well, how do I get more of my pages indexed over time. So that's the first thing I would do.

And with regards to indexing more pages over time, that's usually something that happens automatically as we recognize that your website is really valuable and really important. And that's something that just takes time. One of the strategies I try to, I don't know, encourage people to think about is, on the one hand, you can decide what kind of pages you think are important within your website and which ones you want Google to focus on through internal linking. And kind of working to make sure that, if you say 90 percent are currently not indexed and 10 percent are indexed, at least the 10 percent that are indexed are really good pages and really important pages for you. So that you get a mass of people already going to these pages and saying this is fantastic content, maybe they recommend it to other people, maybe they link to it from other places. But at least that 10 percent that you start with is something that you can grow with.

And then over time what you'll see is Google crawls more and more of your website as it recognizes that it's really good and important. That can result in, on the one hand, crawling the existing pages more frequently, and on the other hand, crawling deeper within your website, kind of more layers within the internal linking, digging in a little bit deeper. So that's something that essentially happens over time algorithmically. And it's sometimes tricky, especially with a website like I'm guessing yours is, to create an internal linking structure in the beginning that focuses on the things that you find important. Because it's very easy to say, well, I will just list all my postal codes in numerical order and Google will crawl and index all of these pages, but maybe the first 10 of those pages that you have linked there are irrelevant for your business or very low value for your website.
So it's almost like you have to create a more funneled web structure in the beginning and then over time as you see that Google is doing more and more on your website then expand that and expand that step by step until you've reached kind of your end situation where Google is actually actively indexing all of the content on your website.
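To make that "funneled" structure more concrete, here is a minimal sketch in Python of how a site might stage its internal links: link only the strongest pages at launch, then widen the linked set in batches as Google picks up the earlier ones. Everything here is hypothetical (the `Page` type, the in-house `score` metric, the batch sizes); it is one possible reading of John's advice, not a Google API or an official recipe.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    score: float  # hypothetical in-house importance/quality metric


def staged_link_batches(pages, initial_size=10, growth_factor=2):
    """Split pages into expanding batches, best pages first.

    Batch 0 is the small set of strongest pages the site links
    prominently at launch; each later batch is only added to the
    internal linking once the previous batch is being crawled
    and indexed.
    """
    ranked = sorted(pages, key=lambda p: p.score, reverse=True)
    batches, start, size = [], 0, initial_size
    while start < len(ranked):
        batches.append(ranked[start:start + size])
        start += size
        size *= growth_factor
    return batches


def hub_page_html(linked_pages):
    """Render a simple hub list that links only to the pages
    currently 'released' into the internal link structure."""
    items = "\n".join(
        f'  <li><a href="{p.url}">{p.url}</a></li>' for p in linked_pages
    )
    return f"<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    # Hypothetical postal-code pages with made-up scores.
    pages = [Page(f"/realtors/{80000 + i}", score=1000 - i) for i in range(50)]
    batches = staged_link_batches(pages, initial_size=10)
    # At launch, the hub links only to batch 0: the 10 strongest pages.
    print(hub_page_html(batches[0]))
```

In John's framing, the trigger for releasing the next batch would be observing that Google is actually crawling and indexing the current set (for example, in Search Console's index coverage report), not a fixed timer.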
This might give some of you more insight into what Google sees as quality, from an algorithmic perspective.
Forum discussion at YouTube Community.