Watch that trailing slash
The trailing slash on URLs referencing directories has always bugged me because it doesn't look good. I am a bit of a URL fetishist: I like URLs to be simple and clear and without a trailing slash. But that is not the correct way of doing things.
Back in the early days, a website was essentially a collection of directories with index.html files in them and a number of other .html files around (who remembers the difference between home.html and index.html?). The directory's name appeared plainly in the URL, and that wasn't a problem as long as it was followed by a trailing slash (or a filename).
If the slash is missing, the server coughs: if you say /somedir/foo instead of /somedir/foo/, the server searches for a file named foo, and because foo is actually a directory, it complains and tries to fix the request by itself. A List Apart has a good article on trailing slashes.
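To make the mechanism concrete, here is a minimal sketch in Python of how a static file server might treat a request with and without the slash. It is a toy model, not any real server's code: the DIRECTORIES and FILES sets and the resolve function are hypothetical names invented for illustration, and the 301 status is what Apache's default directory handling typically sends.

```python
# Hypothetical document tree: these names are illustrative only.
DIRECTORIES = {"/somedir", "/somedir/foo"}
FILES = {"/somedir/foo/index.html", "/somedir/bar.html"}

def resolve(path):
    """Return (status, target) for a requested path, roughly as a static file server would."""
    if path in FILES:
        return (200, path)
    if path.endswith("/"):
        # A directory URL: serve its index file if there is one.
        index = path + "index.html"
        if index in FILES:
            return (200, index)
        return (404, None)
    if path in DIRECTORIES:
        # The name matched a directory, not a file: send the browser
        # back with the slash added, which costs an extra round trip.
        return (301, path + "/")
    return (404, None)

print(resolve("/somedir/foo"))   # (301, '/somedir/foo/')
print(resolve("/somedir/foo/"))  # (200, '/somedir/foo/index.html')
```

The slashless request needs two trips to reach the same index file, which is the slowdown in question.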
Notwithstanding that this generates an extra disk access and slows the whole process down for your user, it usually works out. Except in a few cases, typically when the last portion of the URL is actually a directory on the server's filesystem and you're rewriting the URL to remove the trailing slash for cosmetic reasons. Take /weblog, where 'weblog' is an actual directory but is rewritten into /sub_index.php?div=blo&sec=hom.
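RewriteEngine on
# Strip the trailing slash with an external redirect...
RewriteRule ^weblog/$ /weblog [R]
# ...then map the slashless URL onto the PHP script
RewriteRule ^weblog$ /sub_index.php?div=blo&sec=hom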
Why would you need to do this, you might be wondering? Well, in this particular case there is a CMS that drives the weblog of a specific site and generates content include files in the 'weblog' directory, which are displayed by a PHP script located at the root of the site (or in any other directory for that matter). The same script displays different sections of the site, including the weblog's main page.
Now, initially, I wanted the URL without a trailing slash, and used a rewrite rule that mapped it onto the PHP script. Unfortunately, that doesn't work: if you omit the trailing slash in the URL, /weblog/?div=blo&sec=hom shows up in your browser. The server apparently still treats 'weblog' as a directory and adds the slash by itself, dragging the rewritten query string along.
The correct rewrite is to add the slash if missing:
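RewriteEngine on
# Add the missing slash with an external redirect...
RewriteRule ^weblog$ /weblog/ [R]
# ...then map the directory URL onto the PHP script
RewriteRule ^weblog/$ /sub_index.php?div=blo&sec=hom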
Use trailing slashes for directories because that is the correct way of doing things.
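If you want to check the order of operations without touching a server, the two rules of the correct rewrite can be traced with a small Python sketch. This is only a toy model of the rule matching, not a reimplementation of mod_rewrite: the RULES table and the apply_rules function are hypothetical names, and the only things taken from the rules themselves are the two patterns and their targets.

```python
import re

# Ordered rules mirroring the two rewrite lines of the correct version:
# (pattern, target, is_external_redirect)
RULES = [
    (r"^weblog$", "/weblog/", True),
    (r"^weblog/$", "/sub_index.php?div=blo&sec=hom", False),
]

def apply_rules(url_path):
    """Follow redirects until a rule rewrites internally or nothing matches.

    Returns the steps taken as a list of (kind, target) tuples.
    """
    steps = []
    path = url_path
    for _ in range(5):  # guard against redirect loops
        relative = path.lstrip("/")  # per-directory patterns see no leading slash
        for pattern, target, redirect in RULES:
            if re.search(pattern, relative):
                steps.append(("redirect" if redirect else "rewrite", target))
                path = target
                break
        else:
            return steps  # no rule matched
        if steps[-1][0] == "rewrite":
            return steps  # an internal rewrite ends the trace
    return steps

print(apply_rules("/weblog"))
print(apply_rules("/weblog/"))
```

The slashless URL takes two steps (a redirect to /weblog/, then the internal rewrite), while the URL with the slash is rewritten straight away, so the browser only ever displays /weblog/.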
A direct consequence of all this, for me, was to use directory and file names that don't appear in the URL per se. This approach turned out to have several advantages:
- The URL doesn't give away anything about how the file system is organised (security);
- The URLs can be permanent and independent of the server's file system and technology (expandability);
- It forces you to use a directory/file naming scheme that prevents accidental overwriting of files.
Who hasn't at least once accidentally overwritten the wrong file while uploading an amended copy with the same name (e.g. index.html) to the server?
I like to organise the sections of a website in directories, each with its own three-letter code: sec_pro for the products section, sec_con for the contacts section, and so on. In each directory, the index file's name is prefixed with the three-letter code: pro_index.html, con_index.html, etc. This way you ensure that no two files on the server bear the same name.
This scheme can be pushed further to subsection organisation: con_sales.html, con_support.html, con_corporate.html.
The content lives in separate files (con_sales_content_intro_inc.html, con_sales_content_main_inc.html, con_sales_content_extra_inc.html, etc.) located in their own directory, con_content. The file system looks like this:
/sec_con/
/sec_con/con_index.html
/sec_con/con_sales.html
/sec_con/con_support.html
/sec_con/con_corporate.html
/sec_con/con_content/
/sec_con/con_content/con_sales_content_intro_inc
/sec_con/con_content/con_sales_content_main_inc
/sec_con/con_content/con_sales_content_extra_inc
[..]
This might seem a little overkill at first, but it proves to be extremely flexible and efficient in the long run. The content files can be CMS-driven or not, depending on the context, the users and their level of expertise.
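The naming scheme is mechanical enough to generate. The Python sketch below builds the paths for a section from its three-letter code and checks that no file name occurs twice. The section_layout and duplicate_names functions are hypothetical helpers written for this illustration, and the content files are given the .html extension used in the prose description.

```python
from collections import Counter
from pathlib import PurePosixPath

def section_layout(code, pages, content_parts):
    """Build the file paths for one section from its three-letter code.

    pages: page names such as "index" or "sales".
    content_parts: maps a page name to its content parts,
                   e.g. {"sales": ["intro", "main"]}.
    """
    root = PurePosixPath("/") / f"sec_{code}"
    paths = [root / f"{code}_{page}.html" for page in pages]
    content_dir = root / f"{code}_content"
    for page, parts in content_parts.items():
        for part in parts:
            paths.append(content_dir / f"{code}_{page}_content_{part}_inc.html")
    return paths

def duplicate_names(paths):
    """Return the file names that occur more than once across all paths."""
    counts = Counter(p.name for p in paths)
    return sorted(name for name, n in counts.items() if n > 1)

site = (
    section_layout("con", ["index", "sales", "support", "corporate"],
                   {"sales": ["intro", "main", "extra"]})
    + section_layout("pro", ["index"], {})
)
for path in site:
    print(path)
print(duplicate_names(site))  # [] : every file name is unique
```

Because the section code is baked into every file name, two sections can each have an index page without ever producing two files called index.html.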
Call me a file system maniac if you like, but keeping to a neat, concise, and structured directory layout benefits everyone, and ensures that your site will not break when management decides to install the latest content management system developed in yet another emerging web-based development environment.
Comments and responses
29 Dec 2009
I am lost, I noticed google webmaster tools has me indexed with the trailing slash half of the time and because I am using wordpress on a windows server I am having a hard time fixing the issue. So I am trying to figure out if I should (for SEO purposes) use the slash even though none of my backlinks point that way.
Any help on this would be much appreciated.
thanks