  1. mod_rewrite (Apache)
  2. Optimizing MediaWiki
  3. Optimizing PHPBB
  4. Pretty URLs
  5. Other SEO Hacks
  6. Absolut Engine CMS + SEO
  7. Rewriting Parameters
  8. Rewriting Subdomains
  9. More Info

URL Rewriting

mod_rewrite (Apache)

mod_rewrite (http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html) is an Apache (http://www.apache.org) module that is very handy for rewriting ugly URLs into prettier ones. For example, the default installation of MediaWiki (http://www.mediawiki.org/) (the software this site uses) includes /index.php/ in every URL. To make the site more search engine friendly, we want to remove that string entirely.

See below for some example mod_rewrite rules.

Redirecting to www.domain.com

Many SEO experts agree that choosing one domain name and redirecting all the others to it with a 301 response is the best way to maximize your page ranking.

The plan is to take a domain like "http://organicseo.org" and redirect it to "http://www.organicseo.org". This consolidates all inbound links regardless of whether they use the "www." prefix. In theory, this should boost your Google PageRank, because the links count toward one domain instead of being split across two.

Here's a simple mod_rewrite rule you can add to .htaccess to accomplish this. (Note: put it above any other rewrite rules you are using, so that the other rules apply to the www. domain after the initial redirect.) It's recommended to do the rewriting in httpd.conf instead, but not everyone has root access, so .htaccess will suffice, although with somewhat of a performance hit.

Code listing:

			
RewriteEngine on

# Redirect http://organicseo.org to http://www.organicseo.org
# (dots escaped so they match literally; [NC] ignores case)
RewriteCond %{HTTP_HOST} ^organicseo\.org$ [NC]
RewriteRule ^(.*)$ http://www.organicseo.org/$1 [R=301,L]
			
			

Note: Some will tell you to set up only one domain, like www.organicseo.org, and not provide access via organicseo.org (no www.). This is not good practice, because people accessing your site directly will undoubtedly type it without the "www.". If your server isn't set up to serve those requests, they will simply get an error and may give up completely, thinking they have the wrong domain name.
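
The host pattern in a redirect rule like the one above is an ordinary regular expression, so an unescaped "." matches any character and a missing end anchor lets longer host names through. The following Python sketch is only an illustration of that matching behaviour, not of Apache itself; the function name canonical_redirect and the two pattern names are made up for this example.

```python
import re

# Loose pattern: unescaped dots match any character, and there is no end anchor.
LOOSE = re.compile(r"^organicseo.org")
# Strict pattern: literal dots, anchored at both ends, case-insensitive (like [NC]).
STRICT = re.compile(r"^organicseo\.org$", re.IGNORECASE)


def canonical_redirect(host, path, pattern=STRICT):
    """Return the 301 target for a bare-domain request, or None if no redirect applies."""
    if pattern.search(host):
        return "http://www.organicseo.org/" + path.lstrip("/")
    return None


print(canonical_redirect("organicseo.org", "/Main_Page"))
print(canonical_redirect("www.organicseo.org", "/Main_Page"))
# The loose pattern also fires for a host that merely starts with the name.
print(canonical_redirect("organicseo.org.example.com", "/", LOOSE))
print(canonical_redirect("organicseo.org.example.com", "/"))
```

Only the bare domain is redirected by the strict pattern; the www. host falls through, which is what keeps the redirect from looping.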

Optimizing MediaWiki

The following mod_rewrite rules can be used with MediaWiki.

Note: The hack below only works when your wiki is in the root ./htdocs directory. For a working example where your files are in a ./wiki subdirectory, use rofro's solution (http://meta.wikimedia.org/wiki/Talk:Rewrite_rules#rofro.27s_solution).

Change LocalSettings.php (This removes the /index.php/ in the url on all the links of the pages). From:

Code listing:

			
$wgArticlePath      = "$wgScript/$1";
			
		

To:

Code listing:

			
$wgArticlePath      = "$wgScriptPath/$1";
			
		

I suggest you comment out the original line (instead of deleting it) so you can switch back and forth when testing, or in case it breaks.

Then use .htaccess for mod_rewrite url rewriting:

Code listing:

			
RewriteEngine on

# If the request is for the site root (just "/"), rewrite to Main_Page

RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^(.*) /index.php?title=Main_Page [L]

# Don't rewrite requests for files in MediaWiki subdirectories,
# MediaWiki PHP files, HTTP error documents, favicon.ico, or robots.txt

RewriteCond %{REQUEST_URI} !^/stylesheets/
RewriteCond %{REQUEST_URI} !^/(redirect|texvc|index)\.php
RewriteCond %{REQUEST_URI} !^/error/(40(1|3|4)|500)\.html
RewriteCond %{REQUEST_URI} !^/favicon\.ico
RewriteCond %{REQUEST_URI} !^/robots\.txt
RewriteCond %{REQUEST_URI} !^/images/

# Make sure there is no query string (Unless user is making a search)
RewriteCond %{QUERY_STRING} ^$ [OR]
RewriteCond %{REQUEST_URI} ^/Special:Search

# Rewrite http://wiki.domain.tld/article properly, this is the main rule
RewriteRule ^(.*) /index.php/$1 [L]
			
		

Known bug: This does not work with wiki pages whose titles contain "?", such as "What Is SEO?". I think it conflicts with the rule using %{QUERY_STRING}, since "?" marks the start of a query string in a URL. The quick fix is to move the "What Is SEO?" page to "What Is SEO" and then change links to the page to something like this:

[[What Is SEO|What Is SEO?]]

Notice that the "?" appears in the anchor text only (after the "|"), not in the page name used for the URL (before the "|").
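
As a rough illustration of why a literal "?" in a page name is troublesome, the Python sketch below shows how a standard URL parser treats it. This only demonstrates generic URL splitting; it does not confirm which Apache or MediaWiki step actually drops the character, and the helper name split_request is made up for this example.

```python
from urllib.parse import quote, urlsplit


def split_request(url):
    """Return the (path, query) pair a URL parser sees for the given URL."""
    parts = urlsplit(url)
    return parts.path, parts.query


# A literal "?" ends the path, so the page name loses its last character.
print(split_request("http://wiki.domain.tld/What_Is_SEO?"))

# Percent-encoding the "?" keeps it inside the path instead.
encoded = quote("What_Is_SEO?", safe="")
print(encoded)
print(split_request("http://wiki.domain.tld/" + encoded))
```

Renaming the page, as suggested above, avoids the question entirely because the URL then contains no "?" at all.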

One last rule I applied was to add ".html" to the end of every page URL. (This is done on the backend, so please don't link directly to a ".html" page when using Wiki Syntax (http://meta.wikimedia.org/wiki/Help:Editing#Links.2C_URLs).)

Minor change in LocalSettings.php

Change:

Code listing:

			
$wgArticlePath      = "$wgScriptPath/$1"; #mod_rewrite enabled
			
		

To:

Code listing:

			
$wgArticlePath      = "$wgScriptPath/$1.html"; #mod_rewrite enabled
			
		

...and a slight change to the above mod_rewrite rule:

Code listing:

			
# Rewrite http://wiki.domain.tld/article properly, this is the main rule

RewriteRule ^(.*)\.html$ /index.php/$1 [L]
			
		

Optimizing PHPBB

PHPBB (http://www.phpbb.com) is an open source bulletin board (forum) application that allows users to post messages and engage in dialogue with each other.

Pretty URLs

One of the many tricks you can do is optimize your URLs. Instead of having the spiders see links such as "/index.php?forum=12", you can rewrite them using this nice little hack to get pretty URLs like "/some-topic-here_vf22.html".

One requirement, however, is that you use Apache as your web server (for the mod_rewrite rules).

You can download the latest SEO Mod from Webmedic (http://www.webmedic.net/released-phpbb-google-keyword-urls-220-seo-mod-vt25564.html), thanks to Brook Hyumphrey (mailto:bah@webmedic.net)!

Other SEO Hacks

With PHPBB, you should also take the thread or forum title from the template file, and put it inside an H1 tag styled with CSS, as well as add it to the main document title tag.

Absolut Engine CMS + SEO

Absolut Engine CMS (http://www.absolutengine.com) is an open source content management system and framework that provides easy tools for publishing articles, news, etc. Absolut Engine modules extend the functionality further. Clean URLs (based on mod_rewrite) allow sitewide friendly URLs, e.g. http://domain.com/nice-and-clean-URL. An SEO module is also available that lets you tweak the meta information (title, keywords and description) for each article (or any other content) posted.

To enable clean URLs, open the file admin/settings.php and set:

Code listing:

			
$cleanurls=2;
			
		

Rewriting Parameters

To enable basic rewrite:

Code listing:

			
# turn rewrite on:
RewriteEngine on

# apply the redirect below to domain.com only:
RewriteCond %{HTTP_HOST} ^domain\.com [NC]

# redirect any requests from the non-www version to the www version of the domain:
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

# only rewrite when the request is not an existing directory or file:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f

# map a nice URL to the old ugly one (any number of such rules can follow):
RewriteRule ^nice-new-URL[/]*$ /show.php?partID=1215&sectionID=54878&subsection=345344 [L]
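
To make the last rule above more concrete, here is a small Python sketch of the same idea: a table of patterns, each mapped to the internal URL that actually serves the page. It is an illustration only. The names RULES and rewrite are made up for this example, and the sketch does not model the file and directory existence checks.

```python
import re

# Hypothetical rule table mirroring the last RewriteRule above:
# (pattern for the nice URL, internal target that serves it).
RULES = [
    (re.compile(r"^nice-new-URL/*$"),
     "/show.php?partID=1215&sectionID=54878&subsection=345344"),
]


def rewrite(path):
    """Map a requested path (no leading slash, as in .htaccess) to its internal target."""
    for pattern, target in RULES:
        if pattern.search(path):
            return target
    return "/" + path  # no rule matched: the path is served unchanged


print(rewrite("nice-new-URL"))
print(rewrite("nice-new-URL/"))
print(rewrite("other-page"))
```

The visitor only ever sees the nice URL; the long query string stays internal to the server.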
			
		

Rewriting Subdomains

For this case, we want to rewrite foo.domain.com to www.domain.com/app.php?subdomain=foo.

First of all, you need to set up a wildcard DNS record so that anything.domain.com resolves to your web server's IP address. If you're using BIND 9 or later, you can do this by editing the zone file (such as /etc/named/pri/domain.com.zone):

Code listing:

			
$TTL 48h
@       IN      SOA     dns1.domain.com. username.domain.com.  (
                                      2002081601 ; Serial
                                      7200       ; Refresh
                                      7200       ; Retry
                                      604800     ; Expire - 1 week
                                      3600 )    ; Minimum

@                  IN      NS      dns1.domain.com.
@                  IN      NS      dns2.domain.com.
@                  IN      A       x.x.x.x   ; the wildcard CNAME below points here
www.domain.com.    IN      A       x.x.x.x
green.domain.com.  IN      A       x.x.x.x
*.domain.com.      IN      CNAME   domain.com.
			
			

where x.x.x.x refers to your server's IP address. Make sure you have all valid subdomains (green.domain.com.) declared above the wildcard declaration (*.domain.com.).

The next thing you need to do is configure Apache to answer for all requested subdomains, i.e. anything.domain.com:

Code listing:

			
<VirtualHost *:80>
ServerName green.domain.com
DocumentRoot /var/www/green.domain.com/htdocs
IndexOptions FancyIndexing
</VirtualHost>

### Since this is the catchall wildcard domain, we have to put it at the very end of all
### the other valid sub domains, like green.domain.com above

<VirtualHost *:80>
ServerName domain.com
ServerAlias www.domain.com
DocumentRoot /var/www/domain.com/htdocs
IndexOptions FancyIndexing
# wildcard catchall (Apache comments must be on their own line)
ServerAlias *.domain.com
RewriteLog "/var/log/apache2/rewrite_log"
RewriteLogLevel 9
</VirtualHost>
			
			

To be sure everything works, try accessing foo.domain.com in your browser. You should get the regular site that comes up for www.domain.com. If you are having problems, it's always good to check the Apache error logs.

Now comes the fun part, using mod_rewrite rules to change foo.domain.com to www.domain.com/app.php?subdomain=foo!

Make sure you have enabled mod_rewrite (http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html) in Apache 2.0.

For purposes of illustration, we will use a .htaccess file to hold the rules. However, this will slow your server down; if you have root access on the server, it is recommended to put them directly into the Apache configuration file (typically httpd.conf) to speed things up.

Here's what you need in your .htaccess file, typically inside your htdocs directory on the web server:

Code listing:

			
Options +FollowSymlinks

RewriteEngine On

# Skip www itself, then capture the subdomain label as %1
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteCond %{HTTP_HOST} ^([^.]+)\.domain\.com$ [NC]
RewriteRule ^ http://www.domain.com/app.php?subdomain=%1 [R=301,L]
			
			

Another example of rewriting subdomains would be if you want to keep the user at the foo.domain.com subdomain, instead of redirecting them to www.domain.com.

Code listing:

			
RewriteEngine On

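# Skip www itself, then capture the subdomain label as %1
RewriteCond %{HTTP_HOST} !^www\.domain\.com$ [NC]
RewriteCond %{HTTP_HOST} ^([^.]+)\.domain\.com$ [NC]
# Only the bare "/" request is redirected, so app.php itself does not loop
RewriteRule ^$ http://%1.domain.com/app.php?subdomain=%1 [R=301,L]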
			
			

This way, if someone accesses foo.domain.com, they get redirected to foo.domain.com/app.php?subdomain=foo. Because of the wildcards in DNS and Apache, that request is served by the same site as www.domain.com, but the visitor stays at the foo subdomain.
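
A caution about host patterns for subdomains: a character class such as [^\.w{3}] is sometimes written with the intent "anything except www", but inside square brackets it only means "any character other than '.', 'w', '{', '3' or '}'". The Python sketch below illustrates the difference under that reading; the names CHAR_CLASS, LABEL and subdomain_of are made up for this example, and Python's regex engine stands in for Apache's.

```python
import re

# Character-class form: rejects any label containing '.', 'w', '{', '3' or '}'.
CHAR_CLASS = re.compile(r"^[^\.w{3}]+\.domain\.com$")
# Alternative: capture any single label, then exclude "www" explicitly.
LABEL = re.compile(r"^([^.]+)\.domain\.com$", re.IGNORECASE)


def subdomain_of(host):
    """Return the subdomain label of host, or None for www and non-matching hosts."""
    match = LABEL.match(host)
    if match is None or match.group(1).lower() == "www":
        return None
    return match.group(1).lower()


for host in ("foo.domain.com", "news.domain.com", "www.domain.com"):
    print(host, bool(CHAR_CLASS.match(host)), subdomain_of(host))
```

With the character class, a subdomain such as news.domain.com is silently skipped because its label contains a "w", while the explicit www check handles it as expected.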

Remember, if you're having problems, check the log files. Enable mod_rewrite logging for your domain in the VirtualHost declaration as shown above (RewriteLog is only valid in the server or virtual host configuration, not in a .htaccess file):

Code listing:

			
RewriteLog "/var/log/apache2/rewrite_log"
RewriteLogLevel 9
			
			

Note: do NOT leave this logging in place when you go live; it will slow down your server when bombarded with traffic. Use it only for debugging. More info about RewriteLog (http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html).

See the mod_rewrite guide (http://httpd.apache.org/docs-2.0/misc/rewriteguide.html) for more info on rules.

More Info
