Create dynamic sitemap, humans.txt and robots.txt files
Most of the method described in this tutorial has been ported into a new module for CMS Made Simple named SitemapMgr! Read all about it in this tutorial: How to use SitemapMgr »
To help search engines like Bing and Google index your website you can use sitemaps and a robots.txt file. A module like SiteMapMadeSimple is a great way to create a static sitemap. A disadvantage of the module is that it recreates the sitemap file after every content change, which can take a while on very large websites...
I had that problem on one of my own websites: a few hundred pages, a few thousand Products entries, dozens of news articles and a few dozen Company Directory entries. Working in the admin wasn't fun anymore because the admin was very slowwww. In the CMSMS forum I found some posts from SjG, Kermit and Arnoud talking about dynamic sitemaps. I liked the idea and worked it out for myself, resulting in good SEO-friendly sitemaps and an admin panel running at normal speed.
Shortcuts within this tutorial:
- Required preparations
- Sitemap for regular content pages
- Sitemap for the CGBlog module
- Sitemap for the Products module
- Sitemap for the News module
- Sitemap for the Company Directory module
- Sitemap for the CGCalendar module
- Sitemap Index file
- Split up large sitemaps
- robots.txt file
- humans.txt file
How to use
Required preparations
1. Create a new User Defined Tag named "content_type"
if ($content_type != '') { cmsms()->set_content_type($content_type); }
2. Create a new Core::Page template named "blank", that only contains:
{content}
3. For each sitemap you want to create, you need to make a regular content page. To keep the listcontent page a bit tidy I put them all under a dummy page or section header.
- Parent page or section header: SEO
- Humans.txt
- Robots.txt
- Sitemap Index
- Sitemap Pages
- Sitemap News
- Sitemap CGBlog
- Etc.
In the options tab of the page editor, set all pages to non-searchable with WYSIWYG switched off. Also keep them out of the menu, except for the sitemap pages that should be listed in the Sitemap Index file.
I will show you some example sitemaps, a dynamic robots.txt file and a dynamic humans.txt file.
4. Permissive Smarty
In recent Smarty releases, PHP functions aren't available in templates by default because of security settings. If you do want to use PHP functions, you have to enable them by adding this line to your CMSMS config.php file:
$config['permissive_smarty'] = 1;
This config variable loosens some of the security configuration for Smarty templates. In particular, enabling this option allows the use of any PHP function as a Smarty plugin. Do not use this option if you allow content from untrusted sources to be submitted for display on your website!
If your sitemaps work without this line, do not add it!
Sitemap for content pages
Template
Create a new Navigator template named "sitemap_pages" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{function name=Nav_sitemap}
{foreach $data as $node}
{page_attr key=searchable page=$node->id assign=isSearchable}
{if $node->type=='content' && !empty($isSearchable)}
<url>
<loc>{$node->url}</loc>
<lastmod>{$node->modified|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=$node->modified equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>{$level=$node->hierarchy|substr_count:'.'}{if $node->url|substr:0:-1 == {root_url}}1{elseif $level == '0'}0.8{elseif $level == '1'}0.6{elseif $level == '2'}0.4{else}0.2{/if}</priority>
</url>
{/if}
{if isset($node->children)}{Nav_sitemap data=$node->children}{/if}
{/foreach}
{/function}
{if isset($nodes)}{Nav_sitemap data=$nodes}{/if}
</urlset>
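The changefreq and priority expressions in this template are quite dense, so here is a minimal Python sketch of the same bucket logic, meant only as an illustration. The function names `changefreq` and `priority` and the example URLs are assumptions made for this sketch; the thresholds are the ones used in the template above.

```python
SECONDS_PER_DAY = 86400


def changefreq(now: float, modified: float) -> str:
    """Map the age of a page (in days) to a sitemap changefreq value."""
    days = (now - modified) / SECONDS_PER_DAY
    if days < 2:
        return "hourly"
    if days < 14:
        return "daily"
    if days < 61:
        return "weekly"
    if days < 365:
        return "monthly"
    return "yearly"


def priority(hierarchy: str, url: str, root_url: str) -> str:
    """Map a page's depth in the hierarchy (e.g. "1.2.3") to a priority."""
    # The template drops the last character of the URL (the trailing slash)
    # and compares the result with the root URL to detect the home page.
    if url[:-1] == root_url:
        return "1"
    level = hierarchy.count(".")  # "1" -> 0, "1.2" -> 1, "1.2.3" -> 2
    return {0: "0.8", 1: "0.6", 2: "0.4"}.get(level, "0.2")


if __name__ == "__main__":
    now = 100 * SECONDS_PER_DAY
    print(changefreq(now, now - 10 * SECONDS_PER_DAY))
    print(priority("1.2", "https://www.website.com/a/b/", "https://www.website.com"))
```

In words: a recently edited page is advertised as changing often, and the deeper a page sits in the page tree, the lower its priority.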
Page
Create a new content page "Sitemap Pages" with the page content:
{Navigator template='sitemap_pages'}
Only content pages that are included in the menu are shown in the sitemap; other pages are left out.
To include those as well, add the show_all parameter to the Navigator call:
{Navigator template='sitemap_pages' show_all=1}
Set in options tab Page URL: sitemap-pages.xml
You can test your sitemap at www.website.com/sitemap-pages.xml
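Besides opening the URL in a browser, you could check the generated XML with a small script. The following is a minimal sketch in Python, assuming you have saved or downloaded the sitemap output as text. The function name `check_sitemap` and the sample data are hypothetical; the namespace is the one defined by the sitemap protocol, and the date check matches the `%Y-%m-%d` format used in the templates.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Hypothetical sample of what a generated sitemap could look like.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.website.com/about/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
"""


def check_sitemap(xml_text: str) -> list:
    """Return a list of human-readable problems found in a sitemap."""
    problems = []
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")
    for index, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if not loc.startswith(("http://", "https://")):
            problems.append(f"entry {index}: <loc> is missing or not absolute")
        lastmod = url.findtext(f"{{{SITEMAP_NS}}}lastmod")
        if lastmod is not None and not DATE_RE.match(lastmod.strip()):
            problems.append(f"entry {index}: <lastmod> is not YYYY-MM-DD")
    return problems


if __name__ == "__main__":
    print(check_sitemap(SAMPLE) or "no problems found")
```

An empty list means the sketch found nothing to complain about; it does not replace the validation done by the search engines themselves.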
Sitemap for CGBlog module (release 1.11+)
Template
Create a new CGBlog summary template named "sitemap_blog" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->detail_url}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=strtotime($entry->modified_date) equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>0.6</priority>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Blog" with the page content:
{CGBlog summarytemplate='sitemap_blog' number=1000}
Set in options tab Page URL: sitemap-blog.xml
You can test your sitemap at www.website.com/sitemap-blog.xml
Sitemap for Products module
Template
Create a new Products summary template named "sitemap_products" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->detail_url}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Products" with the page content:
{Products summarytemplate='sitemap_products'}
Set in options tab Page URL: sitemap-products.xml
You can test your sitemap at www.website.com/sitemap-products.xml
Split up large (Products) sitemaps
Template
We use the template "sitemap_products" created above.
Page
Create a new content page "Sitemap Products 1" with the page content:
{cge_module_hint module='Products' page=1}
{Products summarytemplate='sitemap_products' sortby='id' pagelimit=500}
Set in options tab Page URL: sitemap-products-1.xml
You can test your sitemap at www.website.com/sitemap-products-1.xml
Next, do the same for pages 2, 3, 4, etc., changing the page number in both the tag and the Page URL.
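To work out how many of these pages you need, divide the number of items by the page limit and round up. Here is a minimal Python sketch of that arithmetic, assuming the `pagelimit=500` and the URL pattern used above; the function name `sitemap_pages` is made up for this example.

```python
import math


def sitemap_pages(total_items: int, pagelimit: int = 500) -> list:
    """Return (page number, Page URL) pairs needed to cover all items."""
    count = math.ceil(total_items / pagelimit)
    return [(page, f"sitemap-products-{page}.xml") for page in range(1, count + 1)]


if __name__ == "__main__":
    # Example: 1200 products with 500 items per sitemap.
    for page, url in sitemap_pages(1200):
        print(page, url)
```

For 1200 products this gives three pages, the last one holding the remaining 200 items. The sitemap protocol allows up to 50,000 URLs per file, so a limit of 500 is mainly about keeping each request fast.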
Sitemap for News module
Template
Create a new News summary template named "sitemap_news" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->moreurl}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=strtotime($entry->modified_date) equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>0.6</priority>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap News" with the page content:
{News summarytemplate='sitemap_news'}
Set in options tab Page URL: sitemap-news.xml
You can test your sitemap at www.website.com/sitemap-news.xml
Sitemap for Company Directory module
Template
Create a new Company Directory summary template named "sitemap_companydirectory" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->detail_url}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Company Directory" with the page content:
{CompanyDirectory summarytemplate='sitemap_companydirectory'}
Set in options tab Page URL: sitemap-companies.xml
You can test your sitemap at www.website.com/sitemap-companies.xml
Sitemap for CGCalendar module
Template
Create a new CGCalendar upcominglist template named "sitemap_cgcalendar" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$events key=key item=event}
<url>
<loc>{$event.url}</loc>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Calendar" with the page content:
{CGCalendar display='upcominglist'}
Set in options tab Page URL: sitemap-calendar.xml
You can test your sitemap at www.website.com/sitemap-calendar.xml
Sitemap Index file
If you have multiple sitemaps you can create a sitemap index file, call it a sitemap for sitemaps...
You only need to submit *this* sitemap to Google Webmaster Tools!
Template
Create a new Navigator template named "sitemap_index" with the content:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach $nodes as $node}
{if $node->type == 'content'}
<sitemap><loc>{$node->url}</loc></sitemap>
{/if}
{/foreach}
</sitemapindex>
Page
Create a new content page "Sitemap Index" with the page content:
{Navigator template='sitemap_index' childrenof='seo'}
Set in options tab Page URL: sitemap.xml
You can test your sitemap at www.website.com/sitemap.xml
Note: The Navigator call only lists pages that are included in the menu, so every sitemap page that should appear in the Sitemap Index file must be set to be included in the menu!
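To confirm that the index really lists every sitemap you expect, you could parse its output and print the child URLs. This is a minimal Python sketch, assuming the index output has been saved as text; the function name `child_sitemaps` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample of what the Sitemap Index page could output.
SAMPLE_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://www.website.com/sitemap-pages.xml</loc></sitemap>
  <sitemap><loc>https://www.website.com/sitemap-blog.xml</loc></sitemap>
</sitemapindex>
"""


def child_sitemaps(index_xml: str) -> list:
    """Return the <loc> value of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml.encode("utf-8"))
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in child_sitemaps(SAMPLE_INDEX):
        print(url)
```

If a sitemap is missing from the list, the most likely cause is that its page is not included in the menu.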
Robots.txt
Page
Create a new content page "Robots.txt" with the page content:
User-agent: *
Sitemap: {root_url}/sitemap.xml
Disallow: /doc/
Disallow: /install/
Disallow: /lib/
Disallow: /modules/
Disallow: /module_custom/
Disallow: /plugins/
Disallow: /scripts/
Disallow: /tmp/
Allow: /tmp/cache/
Set in options tab Page URL: robots.txt
You can test your file at www.website.com/robots.txt
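If you want to see how a crawler might read these rules, Python's standard library ships a simple robots.txt parser. The sketch below feeds it the rules from this page, with `{root_url}` replaced by an example domain. The helper name `load_rules` and the user agent name `ExampleBot` are assumptions for this example, and real search engines may interpret overlapping Allow and Disallow rules differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The rules from the page above, with {root_url} filled in for the example.
ROBOTS_TXT = """\
User-agent: *
Sitemap: https://www.website.com/sitemap.xml
Disallow: /doc/
Disallow: /install/
Disallow: /lib/
Disallow: /modules/
Disallow: /module_custom/
Disallow: /plugins/
Disallow: /scripts/
Disallow: /tmp/
Allow: /tmp/cache/
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content that is already available as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    rules = load_rules(ROBOTS_TXT)
    print(rules.can_fetch("ExampleBot", "https://www.website.com/lib/misc.php"))
    print(rules.can_fetch("ExampleBot", "https://www.website.com/about/"))
    print(rules.site_maps())  # available in Python 3.8 and later
```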
Humans.txt
Humans.txt? Say whaaat?? You can read more about the use of it here: humans.txt
Page
Create a new content page "Humans.txt" with the page content:
/* TEAM */
Name: Your name
E-mail: you@website.com
Twitter: @yourtwitter
Location: City, Country
Name: Your colleague's name
E-mail: colleague@website.com
Twitter: @hisorhertwitter
Location: City, Country
/* THANKS */
CMS Can Be Simple - For all those great CMSMS tips and tricks :)
http://cmscanbesimple.org
/* SITE */
Standards: HTML5, CSS3, etc.
Components: Modernizr, jQuery, etc.
Software: CMS Made Simple, what else?!
Set in options tab Page URL: humans.txt
You can test your file at www.website.com/humans.txt
You can add to your <head> area:
<link type="text/plain" rel="author" href="{root_url}/humans.txt" />
Working example
Check the following links from this website:
- humans.txt
- robots.txt
- sitemap.xml (sitemap index)
- sitemap-blog.xml
- sitemap-pages.xml
- Sitemap with 500 Products module items