Monday, 30 May 2016

Website diagnostic tool

We are sometimes contacted by people asking why their sitemap wasn't generated as expected, in some cases containing very few pages or none at all.

In most cases the answers are to do with the structure or format of their website and how it responds to our spider.

To help users understand and address these problems, we've started to automate some of the standard checks we would normally run manually.

The diagnostic tool runs through a series of tests designed to mimic the behaviour of our spider.

We test things like:
  • Accessing your server
  • The server response codes
  • Parsing of the HTML content
  • Important tags, such as
    • titles and headers
    • canonical URLs
    • http-equiv refresh meta tags
  • Number of URLs found
The tool returns a list of results that can help you understand how our sitemap generator interprets your website.
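
As a rough illustration of the kind of checks listed above, here is a minimal Python sketch using only the standard library. It is not the tool's actual implementation: the names `DiagnosticParser` and `run_diagnostics` are hypothetical, and it assumes the page has already been fetched, so the status code and HTML are passed in directly.

```python
from html.parser import HTMLParser

HEADER_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class DiagnosticParser(HTMLParser):
    """Hypothetical helper: collects the tags a sitemap spider typically reads."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headers = []         # text of each h1..h6 element
        self.canonical = None     # href of <link rel="canonical">
        self.meta_refresh = None  # content of <meta http-equiv="refresh">
        self.links = []           # href values of <a> tags
        self._capture = None      # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "title":
            self._capture = tag
        elif tag in HEADER_TAGS:
            self._capture = tag
            self.headers.append("")
        elif tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "meta" and attrs.get("http-equiv", "").lower() == "refresh":
            self.meta_refresh = attrs.get("content")
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == self._capture:
            self._capture = None

    def handle_data(self, data):
        if self._capture == "title":
            self.title += data
        elif self._capture is not None:
            self.headers[-1] += data


def run_diagnostics(status_code, html):
    """Return a dict of simple checks for one already-fetched page."""
    parser = DiagnosticParser()
    parser.feed(html)
    parser.close()
    return {
        "status_ok": status_code == 200,
        "title": parser.title.strip(),
        "headers": [h.strip() for h in parser.headers],
        "canonical": parser.canonical,
        "meta_refresh": parser.meta_refresh,
        "url_count": len(parser.links),
    }
```

Calling `run_diagnostics(200, page_html)` on a page returns a dictionary you can inspect by key, for example `url_count` to see how many links were found on that page.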

Example diagnostics output

You can access the diagnostic tool when you download your sitemap and from the help section.

In this release we also fixed a couple of bugs.
  • Fix: processing of the http-equiv refresh meta tag.
  • Fix: noindex / nofollow rules, to ensure that pages that were not indexed were still followed where there wasn't a nofollow value.
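
The noindex / nofollow behaviour described in the second fix can be sketched as follows. This is an illustrative Python example, not our spider's code: the function name `robots_directives` is hypothetical, and it assumes the input is the `content` value of a robots meta tag, with indexing and following treated as independent decisions.

```python
def robots_directives(content):
    """Return (index, follow) for the content value of a robots meta tag.

    A page marked only "noindex" is left out of the sitemap, but its links
    are still followed unless "nofollow" (or "none") is also present.
    """
    tokens = {token.strip().lower() for token in content.split(",")}
    if "none" in tokens:  # "none" is shorthand for "noindex, nofollow"
        return (False, False)
    index = "noindex" not in tokens
    follow = "nofollow" not in tokens
    return (index, follow)
```

For example, `robots_directives("noindex")` gives `(False, True)`: the page itself is not indexed, but the spider keeps following its links.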