Advanced SEO for Dynamic Website Structure
Dynamic website structure and SEO are a combination of topics that I've always had a particular interest in because of my background in software engineering. I have worked on or maintained over 150 corporate websites and have seen many of the things that can go wrong with a website, problems that can seriously affect a website's operation and search engine rankings.
Of the three pillars of SEO (Structure, Content, and Links), I find the structure of a website to be one of the most underrated, even among search engine optimization companies. The structure of a website consists of several interdependent elements. These include the code behind your website, how your website interlinks, and the technologies used in your website.
At this point I'm going to strongly recommend that you use Firefox with the Web Developer Toolbar installed. The Web Developer Toolbar gives you an easy way to validate your website and test your site at multiple screen resolutions, along with roughly 100 other functions.
Valid Markup and Cascading Style Sheets (CSS)
I have made it a practice to develop all my projects in XHTML 1.0 Transitional (my personal preference so I can use target="_blank" and rel="nofollow" attributes) or XHTML 1.0 Strict, together with CSS 1.0. XHTML is a reformulation of HTML 4 as an XML 1.0 application. It is a very clean and semantic markup language which will also force you to write cleaner code. Whether you choose XHTML or HTML 4, your code will be friendly to the search engines (stay away from 3rd party standards like IHTML).
As for Cascading Style Sheets (CSS), they give us the ability to abstract the design out of a webpage or site into a secondary document. This gives us a lot of advantages and very few disadvantages. By removing redundant design code from your website you place the content closer to the start of the document while reducing the ratio of markup to content. It also makes it easier and more cost effective to maintain your website, as you can implement simple design changes by editing only one file.
When converting a website from a table-based design to a pure CSS-based design there is generally around a 40% decrease in code. The reason is that most people who use tables end up placing tables within tables within tables, all with their own attributes (height, width, border, etc.). Now multiply all that redundant and unneeded markup by the number of pages on your site and you'll quickly see how Google (or any other search engine) will be able to index your website more efficiently.
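To make the reduction in markup described above concrete, here is a minimal Python sketch that compares how much of a page is visible text versus markup. The two HTML snippets, the function name text_to_markup_ratio, and the ratio definition itself are illustrative assumptions, not a standard metric used by any search engine.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of a page, ignoring tags and attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Note: in this simple sketch, script and style contents would
        # also be counted as text.
        self.chunks.append(data)


def text_to_markup_ratio(html: str) -> float:
    """Return text characters divided by total characters (whitespace ignored)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    visible = len("".join("".join(parser.chunks).split()))
    total = len("".join(html.split()))
    return visible / total if total else 0.0


# Hypothetical pages carrying the same content in two layouts.
table_layout = (
    '<table width="100%" border="0" cellpadding="0"><tr><td>'
    '<table width="600" border="0"><tr><td height="20">'
    '<font size="2">Welcome to our store</font>'
    "</td></tr></table></td></tr></table>"
)
css_layout = '<div id="content"><p>Welcome to our store</p></div>'

table_ratio = text_to_markup_ratio(table_layout)
css_ratio = text_to_markup_ratio(css_layout)
print(f"table layout: {table_ratio:.2f}, CSS layout: {css_ratio:.2f}")
```

The CSS version scores higher because the same sentence is wrapped in far less markup; on a real site you would run a measure like this over whole pages rather than fragments.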
In my research and experience I have concluded that using these two technologies in conjunction with each other is a part of guaranteeing your website's success, especially its compatibility with Google. If you do any research on this topic you will also find a recurring mantra among CSS fanatics: "tables are for tabular data, not design."
Now I'm going to start this section with a rant about Dreamweaver templates and how useless they are. As an SEO / Web Developer there is nothing I loathe more than seeing a Dreamweaver template. If you're going to template a site, use a technology like Server Side Includes, PHP includes, or ASP includes. The disadvantages of Dreamweaver templates are:
- Embedded comments in your code can wreak havoc on Keyword Density Tools.
- If you need a non-standard footer in an index file you will need to break it from the template, creating issues for future template updates.
- If you have a disagreement with your web developer / designer and part company, it will cost you if he doesn't supply you with the template.
When building websites I personally use PHP to implement Server Side Includes. PHP is a relatively easy language to learn for implementing simple things like includes. It is also one of the most popular Apache modules: as of April 2007 there were 20,917,850 domains and 1,224,183 IP addresses with it installed. PHP is also available for the Microsoft IIS (Windows Server) web server.
Search Engine Friendly URLs
One thing that I can't stress enough is to try to stay away from Dynamic URLs. These are URLs with variables and values following the "?" character. Google used to state that it had trouble indexing sites with dynamic URLs, and to a degree this still holds true. If you are going to use Dynamic URLs, always try to have fewer than 2 variables in your URL. I have seen sites with large numbers of products and URLs where Google, Live, and Yahoo each have a different number of pages cached.
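As a rough way to audit a list of URLs against the guideline above of keeping a dynamic URL under 2 variables, the following Python sketch counts the variables after the "?" character. The constant MAX_QUERY_VARIABLES, the function names, and the example.com URLs are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumption: "under 2 variables" is read as at most one variable.
MAX_QUERY_VARIABLES = 1


def query_variable_count(url: str) -> int:
    """Count the name=value pairs that follow the "?" in a URL."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True))


def is_within_guideline(url: str) -> bool:
    """True when the URL carries no more variables than the guideline allows."""
    return query_variable_count(url) <= MAX_QUERY_VARIABLES


# Hypothetical URLs to audit.
urls = [
    "http://www.example.com/Article?Id=52&Page=5",
    "http://www.example.com/Article?Id=52",
    "http://www.example.com/Article/ID/52/Page/5/",
]

for url in urls:
    status = "ok" if is_within_guideline(url) else "too many variables"
    print(f"{query_variable_count(url)} variable(s): {url} -> {status}")
```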
A better approach is to rewrite your URLs. On the Linux side Apache has Mod Rewrite, and on Windows you can use ISAPI Rewrite. When you implement a URL rewriting system you are essentially creating a hash lookup table of URLs for your site: when a request comes in, the server checks the table for a match and, if it finds one, serves the corresponding entry.
To put it in simple terms, what we strive to accomplish with URL rewrites is to mask our dynamic content by having it appear as a static URL. A URL like Article?Id=52&Page=5 could be rewritten to /Article/ID/52/Page/5/, which to a search engine appears to be a directory with an index.htm (or whatever default / index page your particular web server uses). To see an implementation of Mod Rewrite, check out Dr. Madcow's Web Portal in the Article Section and Link Archive.
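The mapping between the static-looking URL and the dynamic one can be sketched in a few lines of Python. This is not Mod Rewrite or ISAPI Rewrite syntax, only an illustration of the idea; the pattern REWRITE_RULE and the functions rewrite and friendly_url are hypothetical names, and a pattern match stands in for the table lookup.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Hypothetical rule: /Article/ID/<number>/Page/<number>/ is served by
# the dynamic script Article?Id=<number>&Page=<number>.
REWRITE_RULE = re.compile(r"^/Article/ID/(\d+)/Page/(\d+)/?$")


def rewrite(path: str) -> Optional[str]:
    """Translate a static-looking path into the internal dynamic URL.

    Returns None when the path does not match the rule, in which case a
    real server would fall through to its normal handling.
    """
    match = REWRITE_RULE.match(path)
    if match is None:
        return None
    article_id, page = match.groups()
    return "Article?" + urlencode({"Id": article_id, "Page": page})


def friendly_url(article_id: int, page: int) -> str:
    """Build the static-looking URL to print in links on the site."""
    return f"/Article/ID/{article_id}/Page/{page}/"


print(friendly_url(52, 5))
print(rewrite("/Article/ID/52/Page/5/"))
print(rewrite("/contact.html"))
```

The visitor and the spider only ever see the output of friendly_url; the server applies the reverse mapping internally, so the dynamic URL never has to appear in a link.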
Dynamic Websites and Duplicate Content
One recurring theme I see in a lot of dynamic websites on the internet is that they present the same content on multiple pages. An example of this is a website that allows you to "view a printer friendly version of this page". A better solution would be to develop a printer friendly Cascading Style Sheet.
Another goal is to avoid having any additional URLs on your site, such as links for changing currency with a redirect script, links to "Email to a friend" pages, or anything similar. Always use forms to POST data like this, so that the request is handled by the same page or by a single static page, which keeps the page count down. This issue seems to plague a lot of custom developed ecommerce systems and CMSes. I've actually seen CMSes that present up to 5 URLs / links for each page. In the long run the spiders got so confused indexing the catalog that some of the main content pages were not cached.
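One simple way to spot this kind of duplication is to fingerprint the body of each crawled page and group the URLs that share a fingerprint. The Python sketch below assumes you already have the page bodies in hand; the URLs, the page text, and the function names are hypothetical, and it only catches exact duplicates after whitespace and case are normalized.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def content_fingerprint(body: str) -> str:
    """Hash the page body after collapsing whitespace and lowercasing."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of URLs whose bodies have the same fingerprint."""
    groups = defaultdict(list)
    for url, body in pages.items():
        groups[content_fingerprint(body)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl results: URL -> extracted page text.
pages = {
    "/article.php?id=52": "Ten tips for cleaner markup.",
    "/article.php?id=52&print=1": "Ten tips for   cleaner markup.",
    "/article.php?id=53": "Why tables are for tabular data.",
}

for group in find_duplicates(pages):
    print("Same content served at:", ", ".join(group))
```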
Internal Site Navigation
If built properly, most websites will never have a need for an XML Sitemap, other than to get their new pages indexed that much quicker (Ecommerce & Enterprise being exceptions). I will however recommend that every website have a user accessible Sitemap linked from every page, both to aid your users and for internal linking.
Keep in mind the more internal links you have to a page, the more internal strength this page will be given. So when in doubt link it up.
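To see which pages are receiving the most internal links, you can tally inbound links from a crawl of your own site. The Python sketch below assumes the crawl is already available as a mapping from each page to the pages it links to; the site_links data and the function name inbound_link_counts are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def inbound_link_counts(site_links: Dict[str, List[str]]) -> Counter:
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = Counter({page: 0 for page in site_links})
    for source, targets in site_links.items():
        for target in set(targets):  # count each linking page once
            if target != source:
                counts[target] += 1
    return counts


# Hypothetical crawl: page -> pages it links to.
site_links = {
    "/": ["/products/", "/sitemap/"],
    "/products/": ["/", "/sitemap/", "/products/widget/"],
    "/products/widget/": ["/", "/sitemap/"],
    "/sitemap/": ["/", "/products/", "/products/widget/"],
}

counts = inbound_link_counts(site_links)
for page, total in counts.most_common():
    print(f"{total} inbound link(s): {page}")
```

Pages near the bottom of this list are the ones to link to more often, and a Sitemap linked from every page lifts all of them at once.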
Testing Your Site Structure
When it comes to deploying a website reliably, all I can say is "Test It, Test It, and then Test It Some More". When testing structure I rely on 3 different programs / Firefox extensions. The first is Xenu's Link Sleuth, a great tool to run on your website to figure out how many pages can be spidered and to find dead links. The second is the Web Developer Extension for Firefox; make sure you always validate your code when you make changes. The last is to consult Google and Yahoo to see how many pages are in their index compared to how many pages Xenu found. On Yahoo or Google, type site:www.yourdomain.com (don't use Live's site: function; it is useless).
After you've finished testing your code, if you need to debug it I strongly recommend the Firebug Firefox Extension and the IE7 Developer Toolbar.
When trying to maximize your organic rankings, your internal structure is paramount. Consider your site structure to be equivalent to the foundation of your house. If your foundation is not built adequately your house may be livable, but it may have long term issues. With websites, the long term issue will be a failure to maximize the ROI of your website, so practice safe and smart structure.