Why doesn’t Google index my content properly?
Before Google can rank your content, it needs to discover it, be permitted and able to evaluate it, and index it. If any of these processes go wrong, you might find that your pages don’t show up in the search results.
Most of the time, you can rely on Google to correctly index your content, all by itself. After all, this process is one of the foundational parts of what Google is, and does.
Simply putting your content online isn’t always enough, however.
If you have technical issues, low-quality content, or incorrect indexing controls, those processes of discovery, evaluation and indexing might go wrong.
Discovery
In order to index a page, Google has to be able to find it. That means that another page has to link to it – whether that’s an indexed page on the same site, or a page on another site.
Depending on the relevance and quality of the places it’s linked from, it might take a little time for Google to schedule following those links and finding your pages.
That also means that the page can’t be ‘hidden’ – which, for example, might mean content being password protected, blocked via robots.txt, or only available to users in certain countries.
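If you suspect that robots.txt is hiding a page, you can test the rules yourself. Here's a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the URLs, and the `is_crawl_allowed` helper are made up for illustration, and the standard library's matching is simpler than Google's own parser, so treat the result as a rough indication only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/example-page/",
        "https://www.example.com/private/page/",
    ):
        print(url, is_crawl_allowed(EXAMPLE_ROBOTS_TXT, url))
```

To check a live site, you could call `set_url()` and `read()` on the parser instead of `parse()`, which fetches the real robots.txt file.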
Evaluation
When Google has discovered the page, it will digest the content (including the HTML code and related assets) to assess the quality and relevance.
During this process, there are a number of things which can result in Google choosing not to index a page. They include:
- When it determines that the content of the page is ‘low quality’. E.g., if there’s a very low word count, or if the content is a close/direct duplicate of another page. Particularly ‘over-optimized’ or ‘spammy’ pages may also be ignored.
- When it discovers specific indexing instructions on the page (such as a meta robots tag, or a canonical URL tag pointing at a different page). Google will make a judgment call in cases like this whether it should honor the instructions, but chances are, it’ll choose not to include the page.
- When it can’t see/access the content. For websites which rely heavily on JavaScript, or those which include content in complex or non-standard ways, Google might not be able to consume the page content. It may be that, as far as Google is concerned, it’s an empty (or low-quality) page.
- When it has to process heavy JavaScript, Google might schedule a ‘follow-up’ crawl to dig deeper, before deciding what/whether to index or not. The time this takes can vary considerably, based on Google’s resourcing and their prioritization of your pages.
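The indexing instructions mentioned in the list above live in a page's HTML, so you can look for them yourself. Here's a rough sketch using Python's built-in `html.parser`. The `IndexingSignalParser` class, the `find_indexing_signals` helper and the sample HTML are hypothetical names and data invented for this example. It only compares the canonical URL as a plain string and ignores HTTP headers such as `X-Robots-Tag`, so it won't catch everything.

```python
from html.parser import HTMLParser


class IndexingSignalParser(HTMLParser):
    """Collects meta robots directives and the canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots_directives = []
        self.canonical_url = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            # The content attribute holds a comma-separated list of directives.
            content = attributes.get("content") or ""
            self.robots_directives += [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]
        elif tag == "link" and "canonical" in (attributes.get("rel") or "").lower().split():
            self.canonical_url = attributes.get("href")


def find_indexing_signals(html: str, page_url: str) -> dict:
    """Report whether the HTML asks search engines not to index page_url."""
    parser = IndexingSignalParser()
    parser.feed(html)
    return {
        "noindex": "noindex" in parser.robots_directives,
        # Simple string comparison; real URLs may need normalizing first.
        "canonical_elsewhere": parser.canonical_url is not None
        and parser.canonical_url != page_url,
    }


if __name__ == "__main__":
    sample_html = """
    <html><head>
      <meta name="robots" content="noindex, follow">
      <link rel="canonical" href="https://www.example.com/other-page/">
    </head><body>Hello</body></html>
    """
    print(find_indexing_signals(sample_html, "https://www.example.com/example-page/"))
```

If either value comes back as `True` for a page you want indexed, that's a good place to start investigating.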
Indexing
If you’ve passed all of those tests, then your content should be successfully indexed and should turn up when you search for it.
HINT: try doing a ‘site’ search on Google (e.g., site:https://www.example.com/example-page/) to see if a specific URL has been included in the index.
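If you have a lot of URLs to check, you could generate the search links rather than typing them out. This is a small sketch, assuming the usual `https://www.google.com/search?q=...` URL format; the `build_site_search_url` helper is a made-up name. It only builds the link for you to open in a browser, and it doesn't query Google itself.

```python
from urllib.parse import urlencode


def build_site_search_url(page_url: str) -> str:
    """Build a Google search link that runs a 'site:' query for page_url."""
    # urlencode percent-encodes the colon and slashes in the query value.
    query = urlencode({"q": f"site:{page_url}"})
    return f"https://www.google.com/search?{query}"


if __name__ == "__main__":
    print(build_site_search_url("https://www.example.com/example-page/"))
```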
Bear in mind that, once a page is in the index, that doesn’t mean it’ll stay there forever! Google repeatedly crawls and re-evaluates content – so if your quality drops, or if you accidentally prevent Google from evaluating the content, then your page might get dropped out of the index.
Any questions? Let us know in the comments!
Read more: What is crawlability? »
Thanks for sharing this valuable article. robots.txt plays an important role in Google indexing and now I know very much of it.
I was really seeking the answer to this question since very long. Thanks for providing insights on all the possible problems that arise in the website. I will implement all the necessary changes to stop the hindrance for Google to index my website’s content properly.
I recently noticed that Google indexes pretty fast, as long as your content is decent and is getting a couple of social media shares.
But this experience only applies to Germany.
I think in the US it’s harder to get indexed, because there is much more content being released!