{
"$type": "site.standard.document",
"canonicalUrl": "https://www.simoncox.com/post/2017-11-29-how-i-add-canonicals-into-perch-cms-sites/",
"description": "Canonical links tell search engines where the original page is. Google favors the oldest version and flags others as duplicates.",
"path": "/post/2017-11-29-how-i-add-canonicals-into-perch-cms-sites/",
"publishedAt": "2017-11-29T09:00:00.000Z",
"site": "at://did:plc:tki7vwlanxbwrz2er67eaeqa/site.standard.publication/3mp4h4md7zv2y",
"tags": "Web",
"textContent": "{loading=\"eager\"}\n\nCanonicals can trip up your sites SEO\n\nOriginally conceived for situations where articles were duplicated they would reference the original. Google tends to choose the oldest version of a page that it can find (but not the only method it uses) and any other pages with the same or very similar content are considered duplicates and will not do a well on the Search Engine Results Pages - SERPs and we want our pages to do well there for the traffic.\n\nIn most content management systems, developers tend to take the quick option and reference the URL the page is on. To an extent, this works very well but duplicate pages can occur by accident / non-design. For example, if you are using Perch and you decide to prettify your URLs by removing the .php you will have set up .htaccess rules to remove them. But did you decide your URLs should end in a / or not? Search Engines index URLs with and without the / as different pages — hence you can suffer from duplication.\n\n1. http://www.example.com/index.php\n2. http://www.example.com/index\n3. http://www.example.com/\n4. http://www.example.com\n5. http://example.com/index.php\n6. http://example.com/index\n7. http://example.com/\n8. http://example.com\n9. https://www.example.com/index.php\n10. https://www.example.com/index\n11. https://www.example.com/\n12. https://www.example.com\n13. https://example.com/index.php\n14. https://example.com/index\n15. https://example.com/\n16. https://example.com\n\nAll the above are essentially the same page of content — a home page and the search engines have to work out which one is the original. They are getting much better at this but that’s not a reason to help them understand your website.\n\nAll the above are essentially the same page of content — a home page and the search engines have to work out which one is the original. 
They are getting much better at this but that’s not a reason to help them understand your website.\n\nFor subpages, canonicals are more critical as the search engines are less likely to be tolerant and often they will find your site through links to a subpage rather than down through the home page. Having the canonical automatically generated means that any URLs that resolve that you actually do not want on the site will include the incorrect canonical. If you remove the .php from the URLs, as I tend to do, then you may have situations where Perch is outputting links with the .php — the canonical would then include the .php and cause duplicate content issues. Footer menus are an example of where this may happen.\n\nI like to manually add the Canonical so that I know I am in control but this can lead to issues if an editor mistypes the URL so the technique I use grabs the list of pages from within Perch as a dropdown list for the editor to choose from.\n\nPerch field type — Pagelist\n\nYou will need to add the Perch field type into /perch/addons/fieldtypes/— drop the folder and its php file in there and you are good to go.\n\nThe Perch 2 field type Page list is available from the Perch CMS site. At the time of writing, there is no Perch 3 version but the archived Perch 2 version seems to work ok.\n\nPerch template code\n\nThe following code goes into perch/templates/pages/attributes/seo.html\n\n<link rel=\"canonical\" href=\"<perch:pages id=\"domain\" /><perch:pages id=\"canonical\" type=\"pagelist\" output=\"pageurl\" replace=\".php|,/index|\" label=\"Canonical page\" help=\"Please select the page you wish to have as the canonical URL for this page (normaly just choose this page)\" required=\"true\" />\">\n\nreplace=”.php|” removes the .php from the URL. \ntype=“pagelist” provides the list of pages on your site\n\nOn each page in the CMS appears a drop-down box with the pages you have on your site. 
The editor can select from this list, thus avoiding manual errors, though they could still choose the wrong page, so that’s worth checking!\n\nExample of dropdown list used in the Perch content management system\n\nThe output code in the head:\n\n<link rel=\"canonical\" href=\"https://example.com/my-new-page\">\n\nAnd there is more...\n\nPagination\nClive Walker asked me how I deal with pagination. Generally, I don’t, as pagination is the work of the devil and advertisers. There are so many sites that make you click through a series of pages to read an article. This is just to sell advertising, not to make it easy for you to read, as usually the whole article could easily go on one page and you would scroll down to read it.\n\nThere is, however, a situation where pagination is very useful: lists of article entries, categories, topics and tags. In these situations, it is recommended that there is a view-all page and that the paginated pages are canonicalised to it. With huge lists, though, a view-all page is impractical (it will take days to load, etc.), and then the paginated pages can be self-canonicalised. If you want to know more, head over to Deep Crawl’s information on canonicalisation and pagination.\n\n18 December 2017 Update for home page\nI have also updated the Perch code I used, as there was an issue. The home page was outputting ‘/index’, which canonicalised the home page to a URL that didn’t exist, and that is a bad thing! I have added ‘/index’ to the replace statement to fix this. Apologies to anyone who had used the code prior to today.",
"title": "How I add canonicals into Perch CMS sites"
}