About Your .htaccess File

What an .htaccess File Can Do

Your .htaccess file is your friend. But, as with any friend, the more you know about what that friend can do for you, the more valuable the friendship is, so this document is of some length. Unless you are already thoroughly familiar with the tricks an .htaccess file can do for you, I recommend that you read it through. I will first discuss .htaccess generally, then focus on the particular tricks we want and need it to do for this package.

This is not a deeply expert or comprehensive discussion: its purpose, except for the details associated with this particular package, is to give you an idea what you're working with.

First off, the actual Apache documentation on .htaccess files is available on the web. There is also an extended tutorial available, and I much recommend that tutorial. It explains in some detail the commonest uses for .htaccess files, which include, in summary form:

  • using .htaccess files for custom error documents:

    for example, if your users enter an incorrect file name, what do they get? If you have taken no special measures, they will see whatever the server uses in such cases, typically some bare, ugly, generic message like:


        Not Found

        The requested URL /junque.html was not found on this server.


    But with an appropriate entry in your .htaccess file, you can instead have users get a custom error page of your own design, with a polite, good-looking message and some helpful hints (such as a site directory). And there are errors besides "page not found" for which you can supply custom pages; examples might include, depending on your set-up and tastes, "500 - Internal Server Error" (typically a defective script), "401 - Authorization Required" (if a user tries to reach a password-protected part of your site without valid credentials), "403 - Forbidden" (if a user tries to go someplace on your site that is off limits to them), or "400 - Bad Request" (one of those generic errors for when visitors try strange stuff with your pages or scripts).

  • password protection:

    while there are numerous schemes for password-protecting files and directories, using the .htaccess file is at once simple and powerful. For many people, this feature is their first introduction to, and their main use of, the .htaccess file.

  • enabling SSI:

    if your host does not have server-side includes functional by default, you can turn them on in .htaccess (if you don't yet know what those are, they are an extremely useful feature with many uses--Google it up). Moreover, you have the ability to make your server treat all your ordinary html files as if they were shtml--SSI-enabled--files. (Mind, that can bring a performance hit if you have only a few files using SSI, but at least you have the option available if you want to avoid making plain that you're using includes.)

  • denying user access by IP:

    it's just what you think--you can deny access to your site to whatever IP addresses you choose, whether by particular address or by IP blocks. That can be handy if there are known miscreants you want or need to keep out.

  • changing your default directory page:

    if, for any reason, you prefer that the page a visitor sees if they just use a directory URL (such as your main site URL) be named something other than the traditional index.html and suchlike, you can arrange that--and it can be anything you like, and it can differ from directory to directory. This is not a big feature, but it further illustrates the power of the .htaccess file.

  • preventing viewing of .htaccess itself by others:

    you may have good reason to keep the contents of your .htaccess file private (for example, if you are using it for password-access control); you can easily stop the file from being read by visitors. You can also block off other files in the same way (anything that is in your site directories but that visitors need not or should not be able to read).

  • adding mime types:

    some servers may not be set up to deliver certain newer mime types; if yours is one, and you need a type, you can easily add it through your .htaccess file. (You can also add a setting that will force files of a given mime type to always prompt a download request rather than simply be displayed; obscure, but interesting.)

  • preventing "hot linking" of your files:

    this can be very useful. "Hot linking" is the name for other sites linking to something--typically an image, but maybe a multimedia file--on your site rather than bothering to download it and put it on their own site. If you have any significant number of popular images, "hot links" can steal a substantial amount of your bandwidth, which can mean real money out of your pocket for a popular site. But your .htaccess file is, again, your friend. You can simply block access to your images by anything except your own web pages--or, you can go farther, and deliver instead a nasty special image.

  • preventing directory listing:

    normally, if a visitor uses the URL of one of your directories but doesn't supply a filename, and there is no "index.html" or like file in that directory, the visitor will get a display of the contents of that directory. Often that is the way you want it, but sometimes it is not. Your .htaccess file allows you to specify, by directory, which one or ones will not return a directory-contents listing to visitors.
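
As a rough illustration of several items in the list above, the sketch below shows what such entries might look like. It assumes Apache 2.4 syntax and a host whose AllowOverride setting permits these directives in .htaccess; every path, file name, and the blocked address range is a made-up placeholder, not a value taken from this package.

```apache
# Custom error pages (the /errors/ paths are hypothetical)
ErrorDocument 404 /errors/not-found.html
ErrorDocument 500 /errors/server-error.html

# Serve home.html, when present, as the default page of a directory
DirectoryIndex home.html index.html

# Do not show a contents listing for directories that lack an index page
Options -Indexes

# Keep visitors from reading the .htaccess file itself
<Files ".htaccess">
    Require all denied
</Files>

# Deny one block of IP addresses (192.0.2.0/24 is a documentation-only range)
<RequireAll>
    Require all granted
    Require not ip 192.0.2.0/24
</RequireAll>

# Teach the server a MIME type it may not already know
AddType image/svg+xml .svg

# Refuse image requests referred from other sites ("hot linking")
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?1st-beginners-golf-swing-tips\.com/ [NC]
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]
```

On older Apache 2.2 servers the access-control lines are written differently (with Order and Deny directives rather than Require), so treat this as a starting point to check against your host's documentation rather than something to paste in blindly.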

Yes, you can do all those things--and a lot more--with your .htaccess file. But the one I didn't mention is doing redirects, because that is not only one of the most powerful, but is our main business here and will now be discussed at length.


Redirection

In its simplest form, a redirect in effect translates a call to some URL on your site into a call for some other URL. Suppose, for example, you have long had a file named ACREF.HTM on your site, which we will assume, for discussion, is at--

http://www.1st-beginners-golf-swing-tips.com

--in a directory named MDRX. The file has become tolerably popular, and there are several, perhaps many, links to it out in the great wide world. Now you decide that for any or all of several reasons (perhaps associated with optimizing your site for search engines) you would prefer that the file be named acid-reflux.shtml and that its directory be named medical-matters. OK, fine, you can easily rename them: but now all those nice links point to a dead end. You can put a dummy "redirect" page up at the file's old URL, but that is a very tedious and wasteful way to go.

You can, instead, put one line in your .htaccess file and not only will calls to--

http://www.1st-beginners-golf-swing-tips.com/MDRX/ACREF.HTM

--automatically be redirected to--

http://www.1st-beginners-golf-swing-tips.com/medical-matters/acid-reflux.shtml

--but your server can also be told, by that same one .htaccess line, to issue a 301 - Moved Permanently header to the caller. For your individual users with browsers, that is not vital; but search-engine robots will now know that a backlink to the old filespec is actually a backlink to the new one. So your backlink totals are unaffected by renaming the target file.
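
A minimal sketch of such a line, assuming the .htaccess file sits in your site's root directory and that your host has Apache's mod_alias module available (it usually is):

```apache
# Answer requests for the old file with a 301 pointing at the new one
Redirect 301 /MDRX/ACREF.HTM http://www.1st-beginners-golf-swing-tips.com/medical-matters/acid-reflux.shtml

# Alternative using the rewrite engine instead (use one form, not both):
# RewriteEngine On
# RewriteRule ^MDRX/ACREF\.HTM$ /medical-matters/acid-reflux.shtml [R=301,L]
```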

(It has recently come to light that Yahoo, by deliberate choice, will not honor 301 redirects; this has caused a mix of shock and horselaughs around the SEO community. Perhaps when Yahoo smartens up, to the level of a slow-witted six-year-old--which would be a big jump up for them--they will act more wisely. For now, nobody cares.)

But that's only the beginning of redirection. We were dealing with one particular file, or some small set of files, by explicit name. The redirection engine, however, can accept wildcards. In fact, it uses the complicated but powerful scheme called "regular expressions" (Google it if you're not up on the matter) to accept and translate filespecs.

One simple use for that might be to redirect calls for all files from some one directory to like-named ones in another directory. Or we can go farther and also universally change, for example, their extensions. Thus, if you moved all your .html files from old-dir to new-dir and also renamed each to .shtml in the process--no sweat: your friend, .htaccess, will take care of it for you (and, again, keep all your backlinks duly associated with the right pages) with one line.
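
The one line for that old-dir to new-dir move might look like the sketch below. It assumes the .htaccess file is in the site root, which is why the pattern has no leading slash; the directory names are simply the ones used in the paragraph above.

```apache
RewriteEngine On

# old-dir/anything.html  ->  new-dir/anything.shtml, sent as a 301 redirect.
# (.+) captures the file name without its extension; $1 puts it back.
RewriteRule ^old-dir/(.+)\.html$ /new-dir/$1.shtml [R=301,L]
```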

The special power of the .htaccess rewrite engine that we will be making use of is its ability to translate a call to what looks to the caller like a real, physical file on your server into a call to a php script with some parameters. Here's an example:

Your visitor (which may be a search-engine robot) finds a link on your pages to--

http://www.1st-beginners-golf-swing-tips.com/books-plain/0123456789.html

--which sure looks just like a routine URL. Your visitor follows that link. But your server, when it receives the request for that URL, lets your .htaccess file silently translate that call (the visitor is never told, and never sees the new address) into a call that might look like this:

http://www.1st-beginners-golf-swing-tips.com/golf-books/free1.php?asin=0123456789

What is returned by that call is, of course, an html page generated on the fly ("dynamically") by that php script, based on the parameters (in this case, an "ASIN" code). Again: to the visitor--to the search-engine robot, if that's the visitor--that page is absolutely, positively indistinguishable from a real, physical ("static") page on your site.

(That is not "cheating" or "search-engine spamming": the delivered page really is "a page on your site" served to all your visitors, and fully deserves to be counted, and any links from it to be counted, by search engines.)
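
The translation described above might be written roughly as follows. This is only a sketch: the rules this package actually uses are the ones in its generated SAMPLE.htaccess file, and the pattern here (ten letters or digits standing in for an ASIN) is my own assumption for illustration.

```apache
RewriteEngine On
RewriteBase /

# books-plain/0123456789.html is answered by the script with asin=0123456789.
# There is no R flag, so this is an internal rewrite: the visitor's browser
# still shows the .html address and never learns of the php script.
RewriteRule ^books-plain/([0-9A-Za-z]{10})\.html$ /golf-books/free1.php?asin=$1 [L]
```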


Installing or Modifying Your Own .htaccess File

OK, so after these hundreds of lines of document text, we get to what you need to do to your .htaccess file to make this package work. The simplicity of that will vary sharply, depending on these things:

  1. Do you have an .htaccess file at all yet?

  2. If you have one, is it already doing any redirects?

  3. Are you using Microsoft's "FrontPage" software system?

Questions 1 and 2 are both pretty simple and easy to handle, and I'll deal with them for you in a moment, unless the answer to #3 is "Yes", which complicates the matter. In fact, that complication is enough that I have put its discussion into a separate document file, one named FrontPageUsers. If you are such a user, you need to go read that file now; after that, you can return to this point of this file.

If you know, definitely, the answers to Questions 1 and 2 above, you can skip the next two numbered paragraphs.


1. How do you find out if you already have an .htaccess file? Well, the obvious answer is "you look": it would be in your site's root directory (the directory that holds your front page, which is probably index.html or index.shtml). But there's a snag: many pieces of software, including file managers and ftp, will default to NOT displaying files whose names begin with a bare dot. So you need to be sure that whatever you're using to search for a .htaccess file is set to display files with such names.

The finstall.php script in this package tries to find a .htaccess file if you have one, and should have reported to you. If it said yes and showed you the file's contents, well, there you are. If it did not tell you that you don't have one, but didn't display any actual content, you very probably have one that is set "unreadable". But that's "unreadable" to outsiders: you should be able to find it, look into it, download a copy of it, and upload a revised copy of it.

2. If you do have one, how do you determine if it's already using the rewrite engine? You look in it for a line containing the magic words RewriteEngine On. If those do not appear anywhere in the file, you are not presently doing any redirecting; if they do, you are.



Now let's consider the three possible cases. First, though, these two generic notes: always ftp-upload a new or changed .htaccess file in ASCII (or "text" or "plain text") ftp mode, not binary mode; and, if you are modifying an existing .htaccess file, obviously you should always keep a copy of the original as disaster insurance. Now:

a. You do not have an .htaccess file at all.

Simple: you rename the file generated by this package, presently called SAMPLE.htaccess, to simply .htaccess and upload it as-is to your site's root directory (the directory in which your front page and your robots.txt files reside). You're done, and can go back to the main documentation. But . . . you might consider learning some details about the other uses of .htaccess as described above and also putting some of them in your file. The Apache documentation and the extended tutorial mentioned near the top of this document are the places to start.


(One redirection that is nice, and important for search-engine optimization, is to set your site's main or "real" URL. Most servers will treat a call for--

     http://www.1st-beginners-golf-swing-tips.com

--and a call for--

     http://1st-beginners-golf-swing-tips.com

--as the same thing, and properly deliver up your site's front page. But--how do other sites link to you? You can suggest and recommend, but you cannot control. The killer is that to search engines, those are two distinct pages, and links to one do not get "credited" as links to the other. Your backlinks value can, in the worst case, be cut right in half. You don't need to chase around after other sites that have linked to whichever you consider the "wrong" form: you just drop a pair of lines into your .htaccess file and from then on calls to the "wrong" form generate a 301 Permanent Redirect to the "right" URL, so the search engines will tabulate both kinds of links as links to the "right" form. You can do that with this pair of lines:

     RewriteCond %{HTTP_HOST} !^www\.1st-beginners-golf-swing-tips\.com
     RewriteRule ^(.*) http://www.1st-beginners-golf-swing-tips.com/$1 [L,R=301]

Those lines consider the URL www.1st-beginners-golf-swing-tips.com to be the "right" form. If you want it the other way round, with 1st-beginners-golf-swing-tips.com as the "right" form, you'd use:

     RewriteCond %{HTTP_HOST} !^1st-beginners-golf-swing-tips\.com
     RewriteRule ^(.*) http://1st-beginners-golf-swing-tips.com/$1 [L,R=301]

Your .htaccess file is your friend.)



b. You do have an existing .htaccess file, but are not currently doing any redirects in it.

Still simple: download a copy of the file, append everything in this package's generated SAMPLE.htaccess file to what's in your existing .htaccess file, at the end (or bottom), then upload the modified copy over the old one.


c. You do have an existing .htaccess file and you are already using the Rewrite engine in it.

Pretty simple: download a copy, and insert the two lines from this package's generated SAMPLE.htaccess file that start with RewriteRule into your present rewrite rules, preferably at the top of the block of current RewriteRule directives (lest some of them rewrite the calls we want to work with into a form our Rule can't recognize).

You should be sure that the redirection block of your file contains the statement RewriteBase / somewhere above your new rewrite rules (it probably already does), or you might get weird results. There is more info on that statement at the Apache manual site.
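
Put together, the redirection block of your file might end up laid out like the sketch below. The existing rule shown is a made-up stand-in for whatever rules you already have, and the package's own two lines are represented only by a comment, since they must be copied from your generated SAMPLE.htaccess file.

```apache
RewriteEngine On
RewriteBase /

# 1. Paste the two RewriteRule lines from SAMPLE.htaccess here, unchanged,
#    so that they run before any of your older rules.

# 2. Your existing rules follow. This one is a hypothetical example.
RewriteRule ^old-dir/(.+)\.html$ /new-dir/$1.shtml [R=301,L]
```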


That's it! You're done with the file.

Now, let's return to the main Install instructions set.