Friday, 29 July 2011

Avoiding duplicate content due to case-sensitive URLs

A point that's been raised a fair bit recently by SEO specialists I work with is duplicate page content. This happens when you're using URL rewriting to point a URL to a page, and there is more than one way to get to that content.

For example:

http://www.website.com/news
and
http://www.website.com/news/home

could both lead to the same place. Google crawls both of these URLs and treats them as separate pages with identical content. This can allegedly affect your ranking, so it's good to have the duplicate URL 301 redirect to the other.

The same problem arises when rewritten URLs differ only in case:

http://www.website.com/ThisURL
and
http://www.website.com/thisurl

Many regular-expression-based URL rewriters, such as URLRewriter.NET, will happily handle both URLs and point you to the same page content. To get around this, I've implemented some code in the global.asax file to convert URLs to all lower-case.

http://www.website.com/ThisURL

will be checked in the global.asax file, and a 301 redirect will send you to:

http://www.website.com/thisurl

There's one other thing to bear in mind: ASP.NET's built-in AJAX functionality relies on references to AXD files such as scriptresource.axd and webresource.axd. These files have case-sensitive querystrings, so if you don't add an exception for AXD files, they won't be reachable and your AJAX controls will no longer work.

Finally, you also have to ensure that you don't get in the way of postbacks. If you don't distinguish between a postback and a standard URL request, your postbacks will no longer work, and all your forms will break.

If you check out the code sample below, you'll see that I've ensured that only GET requests are checked, and that AXD files are excluded.
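
What follows is a minimal sketch of that kind of handler, not necessarily the exact code I run in production. It assumes a code-behind class named Global for global.asax, and it makes one extra assumption of its own: only the path is lower-cased, while the querystring is passed through untouched so that any case-sensitive values in it survive the redirect.

```csharp
using System;
using System.Web;

// Assumed code-behind class for global.asax; the class name is illustrative.
public class Global : HttpApplication
{
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        HttpRequest request = Context.Request;

        // Only check GET requests, so postbacks (sent as POST) are left alone.
        if (!string.Equals(request.HttpMethod, "GET", StringComparison.OrdinalIgnoreCase))
        {
            return;
        }

        // RawUrl is the path and querystring exactly as the client sent them.
        string rawUrl = request.RawUrl;
        int queryStart = rawUrl.IndexOf('?');
        string path = queryStart >= 0 ? rawUrl.Substring(0, queryStart) : rawUrl;
        string query = queryStart >= 0 ? rawUrl.Substring(queryStart) : string.Empty;

        // Exclude AXD handlers such as scriptresource.axd and webresource.axd.
        if (path.EndsWith(".axd", StringComparison.OrdinalIgnoreCase))
        {
            return;
        }

        // Nothing to do if the path is already lower-case.
        string lowerPath = path.ToLowerInvariant();
        if (string.Equals(path, lowerPath, StringComparison.Ordinal))
        {
            return;
        }

        // Issue a 301 to the lower-case path, keeping the querystring as-is.
        Response.StatusCode = 301;
        Response.AddHeader("Location", lowerPath + query);
        Response.End();
    }
}
```

With this in place, a GET for /ThisURL?id=AbC should be answered with a 301 pointing at /thisurl?id=AbC, while POST requests and anything ending in .axd pass straight through to the rest of the pipeline.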
