Tuesday, 1 July 2014

Sitemaps and multiple domains

We are sometimes asked why our sitemap generator doesn't find all the pages in a website. There can be a number of reasons, but one we have found is references to multiple domains within the same website structure, in particular homepages that link to a different domain from the one the user specified for their sitemap.

We recommend being consistent with your domains. Pick a primary domain and stick to it. If you have secondary domains, by all means use them, but make sure your website structure uses the primary domain. This will make it clearer to our spider and to search engines where your pages are and how your site is structured.

A concrete example: if you have two domains pointing to the same website and use full absolute links in your pages, avoid mixing the domains and, where possible, just use relative paths.

e.g. if you have mysite1980.org.uk and mysite1980.org pointing to the same site, avoid doing this:

<a href="http://www.mysite1980.org.uk">Home Link 1</a>
<a href="http://www.mysite1980.org/aboutus">About</a>
<a href="http://www.mysite1980.org.uk/features">Features</a>
<a href="http://www.mysite1980.org.uk/contact">Contact us</a>
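
By contrast, a consistent version of the same menu might look something like the sketch below. It assumes the pages live at the same paths as in the links above and simply drops the domain, so every link resolves against whichever domain served the page.

```html
<!-- Relative paths: no domain is named, so the two domains can't be mixed -->
<a href="/">Home Link 1</a>
<a href="/aboutus">About</a>
<a href="/features">Features</a>
<a href="/contact">Contact us</a>
```

If you do need absolute links, the same principle applies: use the one primary domain in every link.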


We also see some websites framing another site. We assume people do this to present the site under a different address. The best way to achieve this is a DNS CNAME or an HTTP 301 redirect, depending on your circumstances and needs.

If you frame one domain in another, our spider won't recognize that the two domains are the same website.

Remember, it is perfectly acceptable to have more than one sitemap, one for each domain or website. Where the domains all point back to the same website, however, make sure you have a good HTTP redirect strategy or make use of canonical URLs, so that users end up in the correct place and search engines don't penalise you for duplicate content.
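
As a rough illustration of a canonical URL, the sketch below shows a canonical link element for the About page from the earlier example. It assumes, purely for illustration, that www.mysite1980.org.uk is the domain you have chosen as primary; the same element would be served whichever domain the page was requested on.

```html
<head>
  <!-- Assumption: www.mysite1980.org.uk is the chosen primary domain -->
  <link rel="canonical" href="http://www.mysite1980.org.uk/aboutus">
</head>
```

Each page would carry its own canonical URL, pointing at that page's address on the primary domain.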

And of course, if you don't supply the correct address to the sitemap generator and your canonical URLs and redirects aren't in place, you risk it not being able to find some or all of your pages.