Why am I getting different values for indexed pages in the Google search, the GSC and SISTRIX?
From: SISTRIX Team, 23.05.2022

Sometimes the numbers you get from a Google site:-query, the Google Search Console (GSC) and the SISTRIX Toolbox do not match.

Contents
Comparing the indexed pages: Google site:-query and the SISTRIX data
The number of indexed pages in the SISTRIX Toolbox is an average
Values that fluctuate strongly should be examined
Example using red-simon.com

You cannot directly compare the numbers from a site:-query on Google with those in the Google Search Console, as the latter are calculated separately by Google. This is why the two sources give different results, which are also published at different times.

Comparing the indexed pages: Google site:-query and the SISTRIX data
[Figure: Google site:-query for zalando.co.uk]
[Figure: SISTRIX data for the domain zalando.co.uk. Last data point from May 23rd, 2022]

When you evaluate two sets of data, you should always take into account the date on which each was measured. In the above example, the data from the Google site:-query is slightly more recent.

The number of indexed pages in the SISTRIX Toolbox is an average
According to statements by Google, the number of indexed pages is only a rough estimate once a site has more than 1,000 pages (note the word “about” in front of the result count). To eliminate the biggest outliers, we collect the SISTRIX data multiple times per week and then calculate the average value. To do so, we run site:-queries on Google, so our values come straight from Google; the only processing we apply is averaging over the week's data. If we show that the number of indexed pages has gone up (or down), these are the numbers we got from Google at the time of our site:-queries. We also only add a new data point to the history when we notice a change in the average.

Values that fluctuate strongly should be examined
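Before examining a fluctuation, it can help to picture the averaging described in the previous section. The following is a minimal sketch, not the SISTRIX implementation: the function names, the week labels and the result counts are all hypothetical, and "a change in the average" is assumed here to mean that the rounded weekly average differs from the last stored value.

```python
from statistics import mean


def weekly_average(samples):
    """Average the result counts collected within one week, rounded to whole pages."""
    return round(mean(samples))


def update_history(history, week_label, samples):
    """Append a data point only if the weekly average differs from the last stored one.

    Assumption: any difference in the rounded average counts as a change.
    """
    avg = weekly_average(samples)
    if not history or history[-1][1] != avg:
        history.append((week_label, avg))
    return history


# Hypothetical "about N results" counts from several site:-queries per week.
history = []
update_history(history, "week 1", [120_000, 131_000, 118_000])
update_history(history, "week 2", [125_000, 121_000, 123_000])  # same average as week 1
update_history(history, "week 3", [150_000, 148_000, 155_000])
```

In this made-up data, week 2 has the same average as week 1, so no new data point is stored for it, which is why a history built this way can show gaps between data points.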
If your number of indexed pages varies noticeably, you should still look into the cause. In many cases, the cause is duplicate content or content that Google rates as less important. Google indexes these pages at first (the number of indexed pages goes up) and then filters out the duplicates and less important pages again (the number of indexed pages falls). This also applies to print versions of pages, session IDs, affiliate links and similar URL variants.

Example using red-simon.com
To give you an example, let us look at the site:-query for the domain red-simon.com in 2013. Towards the back (results page 10 in our example), we can see the reason for a noticeable increase in the number of indexed pages:

[Figure: Google site:-query for red-simon.com in 2013]

For red-simon.com, the search results contain a lot of dynamic URLs with numerous parameters (for example red-simon.com/data/cmsv2.asp?mid=41&sid=1&pid=533). This content can probably be reached through a number of different URLs and is therefore duplicated. Some of these pages were also redirected using a 302 redirect, which signals only a temporary move, so Google may keep the old URL in its index. Use a 301 redirect for permanent moves. The website would probably benefit from replacing the dynamic URLs with static ones; mod_rewrite could be one way to handle this.

More evaluations for strongly fluctuating numbers of indexed pages: Why does the amount of indexed pages fluctuate so much?
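To check whether your own site exposes the same content under several parameterised URLs, as suspected for red-simon.com, you could group a list of crawled URLs by a normalised form. The sketch below is only an illustration: the parameter names in `IGNORED_PARAMS` (`sessionid`, `print`, `ref`) and the example.com URLs are hypothetical and need to be replaced with whatever your site actually uses for session IDs, print versions and affiliate links.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change the page content.
IGNORED_PARAMS = {"sessionid", "print", "ref"}


def canonical_form(url):
    """Drop ignored parameters, sort the rest and remove the fragment."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_duplicates(urls):
    """Return only those canonical forms that are reachable via more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_form(url)].append(url)
    return {canon: members for canon, members in groups.items() if len(members) > 1}


# Hypothetical crawl output.
urls = [
    "https://example.com/article.asp?pid=533&sessionid=abc",
    "https://example.com/article.asp?pid=533&print=1",
    "https://example.com/article.asp?pid=533",
    "https://example.com/article.asp?pid=534",
]
duplicates = group_duplicates(urls)
```

Each key in `duplicates` is a candidate for the single URL that the other variants should point to, for example through a 301 redirect.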