It can be convenient to enable GitHub Pages deployment in forks for testing purposes. However, this inadvertently creates duplicate copies of the documentation on the public internet, where search engines can find them.
This risks polluting search results with content that is likely outdated. I believe it also risks harming the SEO of the official documentation website. (I'm no SEO expert, but my understanding is that Google in particular harshly penalizes websites that duplicate other websites.)
We should generate a robots.txt and/or add the appropriate meta tags to non-canonical copies of the docs website.
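For reference, the robots.txt side of this is tiny. A non-canonical copy would only need something like the following at its site root to ask crawlers to stay away (just a sketch of the standard directives, not tied to any particular deployment):

```
# Hypothetical robots.txt for a fork / test deployment of the docs.
# Asks all crawlers to skip the entire site.
User-agent: *
Disallow: /
```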
One thing I didn't really think about when writing this is that the main website's robots.txt is what actually matters since the docs repo is nested in a subdirectory.
(Similarly for forks, the robots.txt in the GitHub Pages website of the user or the organization associated with the fork is what actually matters.)
This means we should probably just go the route of adding <meta name="robots" content="noindex, nofollow"> tags to the <head> of every page instead.
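For concreteness, here's a rough sketch of what a post-build step for that could look like: it stamps the tag into every generated page, but only when the build does not come from the canonical repository. The site/ output directory and the example-org/docs slug are placeholders; the only real dependency is GITHUB_REPOSITORY, which GitHub Actions sets to the owner/repo that ran the workflow:

```python
#!/usr/bin/env python3
"""Mark non-canonical docs builds as noindex (rough sketch).

Assumes the built HTML lives in ./site and that the canonical
repository slug is "example-org/docs" -- both are placeholders.
"""
import os
from pathlib import Path

CANONICAL_REPO = "example-org/docs"  # placeholder; replace with the real slug
NOINDEX_TAG = '<meta name="robots" content="noindex, nofollow">'


def is_canonical_build() -> bool:
    # GitHub Actions sets GITHUB_REPOSITORY to e.g. "some-user/docs" in a fork.
    return os.environ.get("GITHUB_REPOSITORY", "") == CANONICAL_REPO


def inject_noindex(site_dir: Path) -> None:
    for page in site_dir.rglob("*.html"):
        html = page.read_text(encoding="utf-8")
        if NOINDEX_TAG in html:
            continue  # already tagged
        # Naive injection right after the opening <head> tag; assumes the
        # generator emits a plain "<head>" with no attributes.
        page.write_text(html.replace("<head>", f"<head>\n  {NOINDEX_TAG}", 1),
                        encoding="utf-8")


if __name__ == "__main__":
    if not is_canonical_build():
        inject_noindex(Path("site"))
```

Doing it as a post-build step would keep the docs generator's own config untouched, so forks wouldn't need to carry any diff.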
As a semi-related aside (since you specify it in the robots.txt), we should also enable sitemap.xml generation. Looks like it just needs to be turned on.
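For anyone unfamiliar with it, the generated sitemap.xml would just follow the standard sitemaps.org format, roughly this shape (URLs below are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per docs page; loc values here are placeholders. -->
  <url>
    <loc>https://example.org/docs/</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
  <url>
    <loc>https://example.org/docs/getting-started/</loc>
    <lastmod>2024-01-01</lastmod>
  </url>
</urlset>
```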