Thursday, 11 September 2014

Minor update to the 2000 page limit

A short time ago we increased the sitemap page limit from 1000 to 2000. This limit applies to the number of pages in a sitemap.

On larger websites we sometimes find a great many errors, in some cases tens of thousands of links that return an error such as a 404 Not Found. In these cases our spider was often crawling tens of thousands of pages before it hit the 2000 sitemap page limit.

Whilst these sites are not too common, when we do encounter them they can cause a logjam during busy times, which blocks other sitemaps from processing.

To address this, we have now implemented a change so that the maximum number of pages spidered is 2000. Therefore, if your site has tens of thousands of URLs, the spider will stop once we have processed 2000 pages, even if those pages returned errors.
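
For the curious, here is a rough sketch in Python of what a cap on pages spidered looks like. It is only an illustration and not our actual spider code: the names crawl, fetch and max_pages are made up for this example, and fetch is assumed to return a status code plus the links found on the page. The point is that the counter goes up for every page processed, including pages that error, so a site full of broken links can no longer keep the spider busy past the limit.

```python
from collections import deque

MAX_PAGES = 2000  # the limit described in this post


def crawl(start_urls, fetch, max_pages=MAX_PAGES):
    """Breadth-first crawl that stops after max_pages pages are processed.

    fetch is a hypothetical callable: fetch(url) -> (status_code, links).
    Returns (sitemap_urls, pages_processed).
    """
    queue = deque(start_urls)
    seen = set(start_urls)
    sitemap = []
    processed = 0

    while queue and processed < max_pages:
        url = queue.popleft()
        status, links = fetch(url)
        processed += 1  # every page counts, even one that errored

        if status == 200:
            sitemap.append(url)  # only good pages go into the sitemap

        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)

    return sitemap, processed
```

With a fetch that always returns a 404, the sitemap stays empty but the crawl still ends once max_pages pages have been processed.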

For the majority of sites this will make little difference; we just thought we would let you know.