URL Normalization: Is a Trailing Slash the Same Page

Dec 28, 2004 - 3:00 pm

There is a very interesting thread brewing at Search Engine Watch Forums named "Is A Trailing / On A Directory Seen As A Different File By Google?" In this thread, a member gives an example of the same page reachable at two URLs that differ only by the trailing slash and that show different PageRank values. His example is:

http://www.avismauritius.com/en/locations/ PR=3
http://www.avismauritius.com/en/locations PR=0

In the thread, Orion, the resident search technology guru at SEW forums, discusses how search engines normalize URLs in order to give each URL a unique identifier. I hope that I explain this correctly. It is my understanding that the unique identifier is a hash, possibly 64 or 128 bits long. Before the identifier can be assigned, the URL needs to be stripped down and normalized. The process, as Orion stated it, goes something like this:

Removal of the protocol prefix (http://) if present
Removal of a :80 port number specification if present (however, non-standard port number specifications are retained)
Conversion of the server name to lower case
Removal of all trailing slashes ("/")
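The four steps above can be sketched in a few lines of Python. This is only an illustration of the list as quoted, not a description of what any particular search engine runs. The function names `normalize_url` and `url_id` are made up for this example, and the choice of BLAKE2b truncated to 128 bits is an assumption standing in for whatever hash an engine might actually use.

```python
import hashlib


def normalize_url(url: str) -> str:
    """Apply the four normalization steps quoted above, in order."""
    # Step 1: remove the protocol prefix (http://) if present.
    if url.lower().startswith("http://"):
        url = url[len("http://"):]

    # Separate "host[:port]" from the path that follows the first slash.
    host_port, slash, rest = url.partition("/")
    host, colon, port = host_port.partition(":")

    # Step 2: drop an explicit :80, but keep any non-standard port.
    if port == "80":
        colon, port = "", ""

    # Step 3: convert the server name to lower case (the path is left alone).
    host = host.lower()

    # Step 4: remove all trailing slashes.
    return (host + colon + port + slash + rest).rstrip("/")


def url_id(url: str) -> str:
    """Hypothetical 128-bit identifier: a hash of the normalized URL."""
    normalized = normalize_url(url).encode("utf-8")
    return hashlib.blake2b(normalized, digest_size=16).hexdigest()


with_slash = "http://www.avismauritius.com/en/locations/"
without_slash = "http://www.avismauritius.com/en/locations"
print(normalize_url(with_slash))
print(url_id(with_slash) == url_id(without_slash))
```

Under this sketch, the two URLs from the example collapse to the same normalized string and therefore to the same identifier. Whether a given engine applies every one of these steps is a separate question that the code cannot answer.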

However, this does not really tell us whether Google does all, some, or none of this. Moderator Chris_D referenced an old WebmasterWorld thread where GoogleGuy sheds some more light on the topic. He talks a lot about HTTP responses and URL requests, but the important line to take from the thread is "I would always recommend the trailing slash. If you know the exact right url, it's often best to give it directly and save everyone that extra redirect." You might also want to check out message #6 in that thread.
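To make the "extra redirect" concrete: many web servers (Apache's mod_dir is one example) answer a request for a directory that lacks the trailing slash with a redirect to the slash version. The following is a minimal sketch of that decision, not any server's real code. The function `respond` and its `directories` and `files` arguments are hypothetical, and directory paths are assumed to be stored without the trailing slash.

```python
from typing import Optional, Set, Tuple


def respond(path: str, directories: Set[str], files: Set[str]) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header or None) for a requested path."""
    # A plain file is served as-is.
    if path in files:
        return 200, None

    # A directory requested with its trailing slash is served directly.
    if path.endswith("/") and path.rstrip("/") in directories:
        return 200, None

    # A directory requested without the slash costs the client one redirect.
    if path in directories:
        return 301, path + "/"

    return 404, None


dirs = {"/en/locations"}
print(respond("/en/locations", dirs, set()))   # redirect to the slash version
print(respond("/en/locations/", dirs, set()))  # served without a redirect
```

Linking to the slash version in the first place skips the 301 round trip, which is the saving GoogleGuy describes.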

PageOneResults from the SEO Consultants Directory explains that this is more a matter of "content negotiation". He goes on to explain:

The W3C and other large website structures are now utilizing content negotiation. That means that this...

www.example.com/sub

...could be different than this...

www.example.com/sub/

With the use of content negotiation, there are no file extensions. Basically you are cleaning the URI of all underlying identifying technologies.

Bottom line: the same URL with and without a trailing slash can be, and is, considered two different URLs by most search engines. Most such duplicates are weeded out by duplicate content filters, and most sites never run into this problem because of the way the server handles these URL requests by default.

 
