Create dynamic sitemap, humans.txt and robots.txt files
Most of the method described in this tutorial has been ported into a new module for CMS Made Simple named SitemapMgr! Read all about it in this tutorial: How to use SitemapMgr »
To help search engines like Bing and Google index your website, you can use sitemaps and a robots.txt file. A module like SiteMapMadeSimple is a great solution for creating a static sitemap. A disadvantage of the module is that the sitemap file is recreated after every content change, which can take a while on very large websites...
I had that problem on one of my own websites: a few hundred pages, a few thousand Products entries, dozens of news articles and a few dozen Company Directory entries. Working in the admin wasn't fun anymore because the admin was very slowwww. In the CMSMS forum I found some posts from SjG, Kermit and Arnoud about dynamic sitemaps. I liked the idea and worked it out for myself, resulting in good SEO-friendly sitemaps and an admin panel running at normal speed.
Shortcuts within this tutorial:
- Required preparations
- Sitemap for regular content pages
- Sitemap for the CGBlog module
- Sitemap for the Products module
- Sitemap for the News module
- Sitemap for the Company Directory module
- Sitemap for the CGCalendar module
- Sitemap Index file
- Split up large sitemaps
- robots.txt file
- humans.txt file
How to use
Required preparations
1. Create a new User Defined Tag named "content_type"
if ($content_type != '') { cmsms()->set_content_type($content_type); }
2. Create a new Core::Page template named "blank" that contains only the content tag:
{content}
3. For each sitemap you want to create, you need to make a regular content page. To keep the content list in the admin a bit tidy, I put them all under a dummy page or section header:
- Parent page or section header: SEO
- Humans.txt
- Robots.txt
- Sitemap Index
- Sitemap Pages
- Sitemap News
- Sitemap CGBlog
- Etc.
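In the options tab of the page editor, set every page to non-searchable and switch the WYSIWYG editor off. Also leave the pages out of the menu, except for the sitemap pages you want listed in the Sitemap Index: those must stay included in the menu (see the note in the Sitemap Index section).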
I will show you some example sitemaps, a dynamic robots.txt file and a dynamic humans.txt file.
4. Permissive Smarty
Recent Smarty releases disable PHP functions in templates by default for security reasons. If you do want to use PHP functions (some templates below call strtotime), you have to enable them by adding this line to your CMSMS config.php file:
$config['permissive_smarty'] = 1;
This config variable loosens some of the security configuration for Smarty templates. In particular, enabling this option allows the use of any PHP function as a Smarty plugin. You had better not use this option if you allow content from untrusted sources to be displayed on your website!
If the sitemaps work without this line, you had better not add it!
Sitemap for content pages
Template
Create a new Navigator template named "sitemap_pages" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{function name=Nav_sitemap}
{foreach $data as $node}
{page_attr key=searchable page=$node->id assign=isSearchable}
{if $node->type=='content' && !empty($isSearchable)}
<url>
<loc>{$node->url}</loc>
<lastmod>{$node->modified|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=$node->modified equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>{$level=$node->hierarchy|substr_count:'.'}{if $node->url|substr:0:-1 == {root_url}}1{elseif $level == '0'}0.8{elseif $level == '1'}0.6{elseif $level == '2'}0.4{else}0.2{/if}</priority>
</url>
{/if}
{if isset($node->children)}{Nav_sitemap data=$node->children}{/if}
{/foreach}
{/function}
{if isset($nodes)}{Nav_sitemap data=$nodes}{/if}
</urlset>
Page
Create a new content page "Sitemap Pages" with the page content:
{Navigator template='sitemap_pages'}
All content pages that are included in the menu will be shown in the sitemap; other pages are hidden.
If you also want the hidden pages included, add the show_all parameter to the Navigator call:
{Navigator template='sitemap_pages' show_all=1}
Set in options tab Page URL: sitemap-pages.xml
You can test your sitemap at www.website.com/sitemap-pages.xml
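To make the change frequency and priority rules of the sitemap_pages template easier to check, here is a minimal Python sketch of the same thresholds. It is not part of CMSMS: the function names changefreq and priority are made up for illustration, the page age is passed in as a number of days, and the home page test is reduced to a boolean flag.

```python
def changefreq(days_since_modified):
    """Map the page age in days to a <changefreq> value,
    using the same thresholds as the template."""
    if days_since_modified < 2:
        return "hourly"
    if days_since_modified < 14:
        return "daily"
    if days_since_modified < 61:
        return "weekly"
    if days_since_modified < 365:
        return "monthly"
    return "yearly"


def priority(hierarchy, is_home=False):
    """Map a hierarchy position such as '2.1.3' to a <priority> value.
    The level is the number of dots, as in the template."""
    if is_home:
        return "1"
    level = hierarchy.count(".")
    return {0: "0.8", 1: "0.6", 2: "0.4"}.get(level, "0.2")
```

A top-level page edited yesterday would get "hourly" and 0.8; a third-level page untouched for two years would get "yearly" and 0.4.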
Sitemap for CGBlog module (release 1.11+)
Template
Create a new CGBlog summary template named "sitemap_blog" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->detail_url}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=strtotime($entry->modified_date) equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>0.6</priority>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Blog" with the page content:
{CGBlog summarytemplate='sitemap_blog' number=1000}
Set in options tab Page URL: sitemap-blog.xml
You can test your sitemap at www.website.com/sitemap-blog.xml
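Besides opening the URL in a browser, you could run a quick offline check on the generated XML. The following is only a sketch using Python's standard library: check_sitemap is a hypothetical helper name, and it tests just three things (the sitemaps.org namespace on the root element, a non-empty loc per url, and the YYYY-MM-DD lastmod format the templates produce).

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_text):
    """Return a list of problems found in a <urlset> sitemap document."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}urlset" % SITEMAP_NS:
        problems.append("root element is not a namespaced <urlset>")
    for url in root.findall("{%s}url" % SITEMAP_NS):
        # findtext returns None (missing) or "" (empty element)
        loc = url.findtext("{%s}loc" % SITEMAP_NS)
        if not loc:
            problems.append("<url> without <loc>")
        lastmod = url.findtext("{%s}lastmod" % SITEMAP_NS)
        if lastmod and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", lastmod):
            problems.append("unexpected <lastmod> format: " + lastmod)
    return problems
```

An empty list means none of these three checks failed; it does not replace the validation a search engine performs when you submit the sitemap.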
Sitemap for Products module
Template
Create a new Products summary template named "sitemap_products" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->detail_url}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Products" with the page content:
{Products summarytemplate='sitemap_products'}
Set in options tab Page URL: sitemap-products.xml
You can test your sitemap at www.website.com/sitemap-products.xml
Split up large (Products) sitemaps
Template
We reuse the "sitemap_products" template created above.
Page
Create a new content page "Sitemap Products 1" with the page content:
{cge_module_hint module='Products' page=1}
{Products summarytemplate='sitemap_products' sortby='id' pagelimit=500}
Set in options tab Page URL: sitemap-products-1.xml
You can test your sitemap at www.website.com/sitemap-products-1.xml
Next, do the same for pages 2, 3, 4, etc., changing the page number in both the module hint and the Page URL.
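To work out how many of these pages you need, divide the number of items by the page limit and round up. A small Python sketch of that arithmetic follows; sitemap_page_urls is a hypothetical helper, and the URL pattern simply follows the naming used above.

```python
import math


def sitemap_page_urls(total_items, page_limit=500,
                      pattern="sitemap-products-{n}.xml"):
    """List the Page URLs needed to cover total_items entries
    when each sitemap page holds at most page_limit entries."""
    pages = math.ceil(total_items / page_limit)
    return [pattern.format(n=n) for n in range(1, pages + 1)]
```

With 1200 products and pagelimit=500 this gives three pages: sitemap-products-1.xml up to sitemap-products-3.xml.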
Sitemap for News module
Template
Create a new News summary template named "sitemap_news" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->moreurl}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=strtotime($entry->modified_date) equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>0.6</priority>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap News" with the page content:
{News summarytemplate='sitemap_news'}
Set in options tab Page URL: sitemap-news.xml
You can test your sitemap at www.website.com/sitemap-news.xml
Sitemap for Company Directory module
Template
Create a new Company Directory summary template named "sitemap_companydirectory" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$items item=entry}
<url>
<loc>{$entry->detail_url}</loc>
<lastmod>{$entry->modified_date|date_format:'%Y-%m-%d'}</lastmod>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Company Directory" with the page content:
{CompanyDirectory summarytemplate='sitemap_companydirectory'}
Set in options tab Page URL: sitemap-companies.xml
You can test your sitemap at www.website.com/sitemap-companies.xml
Sitemap for CGCalendar module
Template
Create a new CGCalendar upcominglist template named "sitemap_cgcalendar" with the content:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach from=$events key=key item=event}
<url>
<loc>{$event.url}</loc>
</url>
{/foreach}
</urlset>
Page
Create a new content page "Sitemap Calendar" with the page content:
{CGCalendar display='upcominglist' upcominglisttemplate='sitemap_cgcalendar'}
Set in options tab Page URL: sitemap-calendar.xml
You can test your sitemap at www.website.com/sitemap-calendar.xml
Sitemap Index file
If you have multiple sitemaps you can create a sitemap index file; call it a sitemap for sitemaps...
You only need to submit *this* sitemap to Google Webmaster Tools!
Template
Create a new Navigator template named "sitemap_index" with the content:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{foreach $nodes as $node}
{if $node->type == 'content'}
<sitemap><loc>{$node->url}</loc></sitemap>
{/if}
{/foreach}
</sitemapindex>
Page
Create a new content page "Sitemap Index" with the page content:
{Navigator template='sitemap_index' childrenof='seo'}
Set in options tab Page URL: sitemap.xml
You can test your sitemap at www.website.com/sitemap.xml
Note: All pages/sitemaps that should be included in the Sitemap Index file need to be set included in menu!
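For reference, the document a sitemap index is expected to contain can be sketched in a few lines of Python. This only illustrates the target XML structure, not how CMSMS renders it; build_sitemap_index is a made-up helper name and the URLs in the usage note are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing the given sitemap URLs."""
    # Writing xmlns as a plain attribute keeps the output free of prefixes.
    root = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    return ET.tostring(root, encoding="unicode")
```

Calling build_sitemap_index(["https://www.website.com/sitemap-pages.xml", "https://www.website.com/sitemap-news.xml"]) yields one sitemap element with a loc child per URL, which is the structure the sitemap_index template should produce.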
Robots.txt
Page
Create a new content page "Robots.txt" with the page content:
User-agent: *
Sitemap: {root_url}/sitemap.xml
Disallow: /doc/
Disallow: /install/
Disallow: /lib/
Disallow: /modules/
Disallow: /module_custom/
Disallow: /plugins/
Disallow: /scripts/
Disallow: /tmp/
Allow: /tmp/cache/
Set in options tab Page URL: robots.txt
You can test your file at www.website.com/robots.txt
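If you want to check the rules without waiting for a crawler, Python's standard urllib.robotparser can parse the text offline. The sketch below uses a shortened copy of the file with {root_url} already replaced by the placeholder domain; load_rules is a hypothetical helper name, and site_maps() needs Python 3.8 or later. The overlapping Allow: /tmp/cache/ rule is left out on purpose, because parsers differ in how they resolve it against Disallow: /tmp/.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the robots.txt above, with a placeholder domain.
ROBOTS_TXT = """\
User-agent: *
Sitemap: https://www.website.com/sitemap.xml
Disallow: /doc/
Disallow: /install/
Disallow: /lib/
Disallow: /modules/
Disallow: /tmp/
"""


def load_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser without network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
```

After this, rules.can_fetch("*", url) reports whether a generic crawler may fetch a URL, and rules.site_maps() lists the sitemap URLs the file advertises.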
Humans.txt
Humans.txt? Say whaaat?? You can read more about the use of it here: humans.txt
Page
Create a new content page "Humans.txt" with the page content:
/* TEAM */
Name: Your name
E-mail: you@website.com
Twitter: @yourtwitter
Location: City, Country
Name: Your colleague's name
E-mail: colleague@website.com
Twitter: @hisorhertwitter
Location: City, Country
/* THANKS */
CMS Can Be Simple - For all those great CMSMS tips and tricks :)
http://cmscanbesimple.org
/* SITE */
Standards: HTML5, CSS3, etc.
Components: Modernizr, jQuery, etc.
Software: CMS Made Simple, what else?!
Set in options tab Page URL: humans.txt
You can test your file at www.website.com/humans.txt
You can add an author link to your <head> area:
<link type="text/plain" rel="author" href="{root_url}/humans.txt" />
Working example
Check the following links from this website:
- humans.txt
- robots.txt
- sitemap.xml (sitemap index)
- sitemap-blog.xml
- sitemap-pages.xml
- Sitemap with 500 Products module items
44 Comments
Based on this tutorial I created a new module: SitemapMgr.
It is a module that creates humans.txt, robots.txt, sitemap index and sitemap files.
The templates are stored in the Design Manager.
Available in the Module Manager of your CMSMS website, and at http://dev.cmsmadesimple.org/projects/sitemapmgr
Have fun!
hi,
I have an improvement to this article:
Use {content wysiwyg=false} in the 'blank' template. That way you don't have to deal with the WYSIWYG editor.
Thanks to Paul Baker for pointing me to the humans.txt file! I added it to the tutorial
Website Live Launch Checklist: http://www.maidbloke.co.uk/2016/10/website-live-launch-checklist
I am not the developer of the Sitemap Made Simple module, so I can't do anything about that. But I think it can't get much simpler than what I described above. No need for a user interface/module. I never use the sitemap module anymore... But that is my opinion of course.
CMSMS must be simple...
resurrect Please!
http://dev.cmsmadesimple.org/projects/sitemapms
I updated the page sitemap!
The News sitemap works for me...
@Chris Taylor Thanks for your contribution
Hi,
Chris Taylor's input is working smoothly for the content pages. Thanks, Chris. Additionally, I ran into a problem with the "strtotime" function in the News module code. Now everything is working for me in CMSMS v 2.1.2.
Here the corrected code as I use it:
NEWS-Module:
{foreach from=$items item=entry}
<url>
<loc>{$entry->moreurl}</loc>
<lastmod>{$entry->modified_date|date_format:"%Y-%m-%d"}</lastmod>
<changefreq>{math now=$smarty.now modified=$entry->modified_date|strtotime equation="(now-modified)/86400" assign="days"}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>0.6</priority>
</url>
{/foreach}
Content-Pages:
{function name=Nav_sitemap}
{foreach $data as $node}
{page_attr key=searchable page=$node->id assign=isSearchable}
{if $node->type=='content' && $isSearchable}
<url>
<loc>{$node->url}</loc>
<lastmod>{$node->modified|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=$node->modified equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>{$level=$node->hierarchy|substr_count:'.'}{if $node->url|substr:0:-1 == {root_url}}1{elseif $level == '0'}0.8{elseif $level == '1'}0.6{elseif $level == '2'}0.4{else}0.2{/if}</priority>
</url>
{/if}
{if isset($node->children)}
{Nav_sitemap data=$node->children}
{/if}
{/foreach}
{/function}
{if isset($nodes)}
{Nav_sitemap data=$nodes}
{/if}
Hi Rolf,
Please ignore my previous post here. I have finally worked out that although the normal URL of a page with the Page URL 'sitemap.xml' is 'sitemap.xml.html', clever CMSMS also returns it for the requested URL 'sitemap.xml'. :)
I just had to tweak the 'sitemap_index' template to remove the trailing .html:
{$node->url|replace:'.xml.html':'.xml'}
I also modified the Navigator sitemap_pages template to include all levels of menus and exclude any pages that are not set to 'searchable':
{function name=Nav_sitemap}
{foreach $data as $node}
{page_attr key=searchable page=$node->id assign=isSearchable}
{if $node->type=='content' && $isSearchable}
<url>
<loc>{$node->url}</loc>
<lastmod>{$node->modified|date_format:'%Y-%m-%d'}</lastmod>
<changefreq>{math now=$smarty.now modified=$node->modified equation='(now-modified)/86400' assign='days'}{if $days < 2}hourly{elseif $days < 14}daily{elseif $days < 61}weekly{elseif $days < 365}monthly{else}yearly{/if}</changefreq>
<priority>{$level=$node->hierarchy|substr_count:'.'}{if $node->url|substr:0:-1 == {root_url}}1{elseif $level == '0'}0.8{elseif $level == '1'}0.6{elseif $level == '2'}0.4{else}0.2{/if}</priority>
</url>
{/if}
{if isset($node->children)}{Nav_sitemap data=$node->children}{/if}
{/foreach}
{/function}
{if isset($nodes)}
{Nav_sitemap data=$nodes}
{/if}
Hi,
not working for: "pages sitemap". --> it displays only first level pages, it should display all pages included in the menu, right?
Hi Rolf,
It's now working great in CMSMS 2.1.x!
But there is just one thing about the pages sitemap: it displays only first-level pages. I think it should display all pages included in the menu, right?
Another thing: for CGCalendar, there is a missing parameter in the page when calling the upcominglist template : upcominglisttemplate='sitemap_cgcalendar' must be added in the tag like {CGCalendar display='upcominglist' upcominglisttemplate='sitemap_cgcalendar'}
BTW, thanks for all your tips & tricks!
I updated the article for CMS Made Simple 2.1.1 using the Navigator module!
I updated the article so it also works with the new Smarty Scope that was introduced in CMSMS 1.12 and 2.0
Yes, it should work in Core 2.0.
In the pages sitemap however you might need to change the menu manager tag:
{menu childrenof='seo' assign=dump}
into:
{Navigator childrenof='seo' assign=dump}
Grtz. Rolf
Hi Rolf,
First, thanx for the tip!
Is it supposed to work in CMSMS 2.0? The only content I get in sitemap-xxx.xml pages is:
Many thanks for the tips
Sitemap Made Simple is a little buggy and causes a 500 error when editing pages...
With this tip, 1 module less
Many thanks again Rolf - really handy tutorial, it's now the default sitemap generator for any new sites I build.
@Rolf
in the sitemap pages add this
{if $node->type == 'content' || $node->type == 'advanced_content'}
instead of
{if $node->type == 'content'}
Now it's also working for the AdvancedContent module
Cheers
Link to the sitemap??
Unfortunately the sitemap doesn't seem to work. When I check with http://seositecheckup.com/ the result is: Your site lacks a sitemap file.
I followed steps 1 and 2 and "Sitemap for content pages". What do I do next?
@Leo, nope!
Do I have to change sitemaps.org in the URL to my own website?
@Rolf
Damn, you were right!
Thanks a lot for this page and your kind help.
Regards
blast
@blast
You probably added "sitemap.xml" in the PAGE ALIAS field instead of the PAGE URL field...
Ok solved by modifying these files as following:
htaccess:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ index.php?page=$1 [QSA]
config.php
$config['url_rewriting'] = 'mod_rewrite';
#$config['page_extension'] = '.htm';
$config['query_var'] = 'page';
Thanks a lot
blast
I can't reach any of the .xml files, maybe because of my .htaccess configuration.
Here my .htaccess:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+).htm$ index.php?page=$1 [QSA]
Here my config.php file:
$config['url_rewriting'] = 'mod_rewrite';
$config['page_extension'] = '.htm';
$config['query_var'] = 'page';
If I write:
http://www.mysite.eu/sitemap.xml.htm
it works
also
http://www.mysite.eu/index.php?page=sitemap.xml
works
but if I write:
http://www.mysite.eu/sitemap.xml
it doesn't work (404)
Any hints?
regards
Hi,
unfortunately "sitemap-pages.xml" does not work for me.
I have the website with the 1.11.11, with Mlecms module in 3 languages.
http://www.arabictranslators.eu/sitemap-pages.xml
Thank you
Far
You use:
Menu manager template
Create a new menu template in the Menu Manager module:
sitemap
{if $count > 0}
{foreach from=$nodelist item=node}
{if $node->type == 'content'}
.
.
.
But what if I have Advanced Content pages that I would like to include in the sitemap?
What has to go here: {if $node->type == '_______________'}
Added the Sitemap Index file (sitemap.xml) to the article
@Thijs
As you can see in the examples, it works perfectly in latest CMSMS version.
What error do you get?
Very useful tutorial, thanks! The thing is that the last modified date and change frequency don't seem to work anymore in CMSMS 1.11.10(+). The Menu Manager template (content pages variant) generates an error in the XML. Is this a known issue?
@Fprm67
You have to give more info if you need some help.
Now I just can guess...
Do you have an url to the sitemap?
Hi,
No, unfortunately it does not work. Between <urlset> and </urlset> I do not see any pages; it is empty.
Thank you
@Fprm67
Oh, now I get it. The message you see isn't an error. When you look at my examples above you will see the same! It isn't a problem; just add the sitemap to Google and it will be accepted for sure.
Grtz. Rolf
Thank you for your reply.
I also tried with other browsers, but without success.
I receive this:
"This XML file does not appear to have any style information associated with it. The document tree is shown below.
".
Thank you
@Fprm67
Try another browser or computer.
What message does Google give when you submit the sitemap?
Hi Rolf,
Thank you for sharing this post.
"Sitemap News" and "Sitemap Company Directory" work for me, but I have a problem with sitemap-pages.xml. I receive this error:
"This XML file does not appear to have any style information associated with it. The document tree is shown below.
"
How can I solve this problem?
Thank you
Love the simplicity of this solution Rolf. Another great technique!
Thanks for the kind words, James!
To all, I updated the blog.
Added automatic calculation of the change frequency and the priority of the page or blog article.
The next release (1.11+) of the CGBlog module also supports the modified string, I already added it to the sitemap template above. But at this point the module isn't released yet...
Have fun!
This is a brilliant article. So much easier to customize the content of your sitemap.xml especially when you build custom modules.
Great work!
@kneep
Have you set the Page URL in the options tab of the page editor: sitemap.xml?
If I want to look up the sitemap.xml page, I get a 404 error page. Is this because the .htaccess file checks if a file exists?
Hello Arsène, you might check if the date_format parameter is correct for your server/language settings...
I use the sitemap for the CMSMS News module here: http://www.smakelijketenzonderzout.nl/sitemap-news.xml and as you can see it works :-) Do you have an URL of yours?
Hi Rolf and thank you for you website. It's full of good tutorials.
For the page "Sitemap News", I was obliged to comment out the date line before getting my sitemap accepted by Google Webmaster Tools: {*{$entry->modified_date|date_format:'%F'} Not yet possible*}
Thanks again.
Arsène
Dear Rolf,
I never knew i could use multiple sitemaps but apparently there are a couple of valid reasons to use them indeed! (mostly it's about different re-indexing frequencies for different types of content)
You can even create a sitemap for all your sitemaps, lol!
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=71453
Greetings,
Manuel