As the author of the WWW FAQ, I regularly answer questions about the workings of the Web. If a question is frequently asked, I simply add an article to the FAQ. But sometimes a question is more detailed and more in-depth: not really a FAQ, but still of interest to others. You'll find those questions, with my answers, here in Innards, along with commentary on other web-technology-related topics.
2003-12-15
The Google search engine is, of course, the only one most people take seriously at the moment. That's been the case for a few years now, although it's not as near-perfect as it used to be. I recently added a WWW FAQ entry on how search engines rank web sites that delves into the problem a bit.
Still, let's say that things are working out for you, your page is useful and relevant, and people are finding it in Google searches... but the short summary that appears in Google isn't so hot. Google's summary technology is quite good, but it's not perfect. Sometimes you'll find that the text of your navigation links dominates the summary, even though it's apparent to your users what the primary content of the page is. I recently had this problem with a number of my pages.
Thanks to a clever suggestion from Ronan Waide, I was able to solve the problem by presenting my pages a little bit differently when the Google web spider comes calling to summarize them.
I should mention that these are treacherous waters. Yes, organizing the content of your page a little differently so that Google's summary spider understands it the way your human readers understand it is a good idea. No, giving Google radically different content is not a good idea; sooner or later, they might be forced to start blacklisting sites that abuse their good graces. That has already happened in certain situations. And if Google drops your site, you may as well hang it up.
By the way, eliminating your navigation links entirely for Google is not a good idea. While this might not misrepresent your site -- you're not changing the "meat" of the page -- it does mean that Google won't be able to follow links between your pages as easily or understand which other pages you consider important.
Still, there's nothing wrong with organizing your page in the proper order so that the summary makes more sense.
Here's how to configure Apache to set a variable that server-side include (.shtml) pages can easily test. Just drop this line into your httpd.conf file and signal the server to reload it:
BrowserMatchNoCase Google Google
When a browser visits your site, it usually identifies itself, although there's no guarantee it will tell the truth. When Google's indexing spider requests pages from your site, it identifies itself as "Mediapartners-Google/2.1 (+http://www.googlebot.com/bot.html)", or something similar. The httpd.conf line above matches this or any reasonable variation thereof.
Now you can simply do this in your page:
<!--#if expr="${Google}"-->
<!--Don't send navigation links at top
of page if talking to Google-->
<!--#else-->
<!--Do send them to everyone else-->
<!--#include virtual="/navlinks.html"-->
<!--#endif-->
... The main body of the page goes here ...
<!--#if expr="${Google}"-->
<!--NOW send Google the navigation links-->
<!--#include virtual="/navlinks.html"-->
<!--#endif-->
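To make the effect of those two conditionals concrete, here is a small shell sketch (not SSI, just an illustration) that prints the sections of the page in the order each kind of visitor receives them. The function render_page and the placeholder section names are hypothetical.

```shell
#!/bin/sh
# render_page IS_GOOGLE: print the page sections in the order the
# conditionals above would send them. Pass "yes" when the Google
# variable is set, anything else for an ordinary browser.
render_page() {
    if [ "$1" = "yes" ]; then
        # Google: content first, navigation links moved to the end.
        echo "main body"
        echo "navigation links"
    else
        # Everyone else: navigation links at the top, as usual.
        echo "navigation links"
        echo "main body"
    fi
}

echo "--- ordinary browser ---"
render_page no
echo "--- Google ---"
render_page yes
```

Both visitors get the same two sections; only the order differs, which is the whole point.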
You should test the result, if you can, by fetching a copy of the page while momentarily impersonating Google's search spider -- not something you should ever do, unless you are talking to your own web site! You can do that using the wget utility, which comes standard on most Linux systems and is also easily available on Windows as part of the excellent and free Cygwin package:
wget -U "Google (Not Really)" http://mysite.com/mypage.html
This will save a local copy of mypage.html, which you should then review to make darn sure that your content and your navigation are still there for Google -- just organized in an order that helps their summary algorithm to better understand your page.
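Reviewing the saved copy by eye is fine for one page. If you have many, a rough check like the following may help. It is only a sketch, and it assumes you can name one piece of text that appears only in your main content and one that appears only in your navigation links. The function names and the example strings in the usage comment are placeholders of mine.

```shell
#!/bin/sh
# first_line FILE TEXT: print the number of the first line of FILE that
# contains TEXT (treated as a fixed string, not a regex).
# Prints nothing if TEXT is absent.
first_line() {
    grep -n -F -e "$2" "$1" | head -n 1 | cut -d: -f1
}

# nav_after_content FILE CONTENT_TEXT NAV_TEXT:
# succeed only if both strings are present and the navigation text
# first appears on a later line than the content text.
nav_after_content() {
    content_at=$(first_line "$1" "$2")
    nav_at=$(first_line "$1" "$3")
    [ -n "$content_at" ] && [ -n "$nav_at" ] && [ "$nav_at" -gt "$content_at" ]
}

# Usage (placeholder strings; substitute text from your own page):
#   nav_after_content mypage.html 'First words of my article' 'Site Map' \
#       && echo "OK: content first, navigation still present"
```

Because it compares line numbers, the check reports failure if both strings happen to share a line, so treat a failure as a prompt to look at the file rather than as proof that something is wrong.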
Copyright 1994-2012 Boutell.Com, Inc. All Rights Reserved.
