But
these two search engines are relatively
insignificant. Google will crawl dynamic URLs at
about a third the speed and depth at which it
indexes static pages. It will barely crawl at all if
there are session IDs in the query strings, because
it soon discovers that multiple URLs lead to the
same page and regards the site as being full of
duplicate content.
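
To see why session IDs make a site look like duplicate content, here is a minimal Python sketch. The parameter names in SESSION_PARAMS and the example.com URLs are assumptions for illustration, not taken from any particular server. It strips session-style parameters from the query string and shows that several crawled URLs collapse into one page.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; real sites use whatever
# their server or framework happens to emit.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_params(url):
    """Return the URL with session-style query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


# Three URLs a spider might collect on separate visits (made-up examples).
crawled = [
    "http://www.example.com/item.php?id=42&sid=abc123",
    "http://www.example.com/item.php?id=42&sid=zzz999",
    "http://www.example.com/item.php?id=42",
]

# Once the session IDs are gone, all three are the same page.
unique_pages = {strip_session_params(url) for url in crawled}
```

If you can keep session IDs out of the URLs you show to spiders in the first place, each page has only one address and this problem does not arise.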
Another
challenge dynamic sites throw at search engines is
serving up different core content at the same URL. This
might result when a site has content that may be viewed
at the same URL in multiple languages, depending on the
browser settings, or content, such as on a news site,
which changes every few minutes.
Search
engines want to be accurate. They want visitors to a
particular URL to see the same content the spider saw.
They also want to be comprehensive. They vie with each
other to have the largest database. Thus, they have
billions of pages to index and typically can only visit
each URL once every few weeks or so (although Google is
pretty good at recognizing content that changes
frequently, and spidering it more often). So if a search
engine indexes your English content at a given URL, it
will probably not index your Spanish content at the same
URL during the same indexing period.
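
The difference can be pictured with a small Python sketch. Everything in it is hypothetical (the PAGES table, the /about path, and both function names); it only contrasts a page whose content depends on the browser's language setting with pages whose language is part of the URL.

```python
# Hypothetical content store: one logical page in two languages.
PAGES = {
    ("/about", "en"): "About our company",
    ("/about", "es"): "Acerca de nuestra empresa",
}


def serve_by_browser_setting(path, accept_language="en"):
    """Same URL, different core content depending on the browser's language."""
    # Take the first language in the header, e.g. "es-MX,es;q=0.9" -> "es".
    lang = accept_language.split(",")[0].split("-")[0].strip().lower()
    return PAGES.get((path, lang), PAGES[(path, "en")])


def serve_by_url(path):
    """Language is part of the URL (e.g. /es/about), so each URL has one content."""
    _, lang, rest = path.split("/", 2)
    return PAGES[("/" + rest, lang)]


# One URL, two possible results: a spider sees only one of them per visit.
english = serve_by_browser_setting("/about", "en-US,en;q=0.8")
spanish = serve_by_browser_setting("/about", "es-MX,es;q=0.9")

# Two URLs, one result each: both can be indexed.
english_page = serve_by_url("/en/about")
spanish_page = serve_by_url("/es/about")
```

With the first function, whatever the spider happens to be served is the only version that gets indexed for /about. With the second, every visitor and every spider sees the same content at a given URL.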
The
solution is to give each search engine unique core
content at a unique URL and ensure that all visitors see
the same core content. There are three main ways of
achieving this.
1) Use static URLs to reference
dynamic content. If a search engine sees a
static URL, it is more likely to index the content at
that URL than if it found the same content at a dynamic
URL. There are several ways of turning dynamic URLs into
static URLs, despite the fact that you are serving
dynamic content. Your method will depend upon your
server and other factors. A friend of mine had the
following experience after implementing this solution
for a client:
"For the last year, since
rewriting the dynamic URLs, my client's site has been
riding high in the rankings for thousands of search
terms. Before the URL rewriting, Google had indexed just
about 3,000 pages in the course of 18 months; in the
first week of using URL rewriting, Google was grabbing
3,000 pages per day from the 500,000-item database it
had previously barely touched. By the end of the first 2
months of using URL rewriting, Google had indexed over
200,000 pages from the site."
The
following sites offer instructions for two popular
servers:
* Apache:
http://httpd.apache.org/docs/mod/mod_rewrite.html
* ASP:
http://www.asp101.com/articles/wayne/extendingnames/
A good step-by-step tutorial can be found at
fantomaster.com. The article links are on the right-hand
side. There are four articles in the series.
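
Whatever the server, the underlying idea is a pattern that maps a static-looking address onto the dynamic one the site really uses. The Python sketch below illustrates that mapping only; it is not the syntax of any server's rewrite module, and the /products/... URL scheme, the catalog.php script name, and its parameter names are all assumptions made up for the example.

```python
import re

# Hypothetical public URL scheme: /products/<category>/<id>.html
STATIC_PATTERN = re.compile(
    r"^/products/(?P<category>[a-z0-9-]+)/(?P<item_id>\d+)\.html$"
)


def rewrite_to_dynamic(static_path):
    """Translate a static-looking path into the internal dynamic URL.

    Returns None when the path does not match the pattern.
    """
    match = STATIC_PATTERN.match(static_path)
    if match is None:
        return None
    # Hypothetical internal script and parameter names.
    return "/catalog.php?category={category}&id={item_id}".format(
        **match.groupdict()
    )


internal = rewrite_to_dynamic("/products/widgets/123.html")
# internal == "/catalog.php?category=widgets&id=123"
```

The visitor and the spider only ever see /products/widgets/123.html; the query string stays behind the scenes.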
Here
are some examples of sites that have implemented one of
these approaches:
* Yahoo.com (yes, Yahoo!)
* Epinions.com
* Dooyoo.co.uk
* Pricerunner.com
URL
rewriting is a very common practice. Not only is it
exceptionally powerful in terms of search engine
optimization, but it is also superb for usability and
marketing in general. A shorter, more logical-seeming
URL is far easier for people to pass on in an email,
link to from their homepage, or spell out to a friend on
the telephone. Shorter URLs are good business.
2) Link to dynamic URLs from
static URL pages. The above solutions are
elegant, but may be difficult for some sites to
implement. Fortunately, there is a simple workaround
for smaller sites.
One
method search engines use to crawl dynamic content while
avoiding dynamic spider traps is to follow links to
dynamic URLs from static URLs. If your site isn't too
large, you could build a static site map page consisting
of links to dynamic URLs. The search engines should
crawl those links, but will probably go no further.
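
A static site map of this kind can be generated from whatever list of pages you already have. The following is a minimal Python sketch under stated assumptions: the page titles, the catalog.php URLs, and the build_site_map function are invented for illustration, and a real site would pull the list from its database.

```python
from html import escape


def build_site_map(links):
    """Return a static HTML page linking to each (title, dynamic_url) pair."""
    items = "\n".join(
        '  <li><a href="{}">{}</a></li>'.format(
            escape(url, quote=True), escape(title)
        )
        for title, url in links
    )
    return (
        "<html><body>\n<h1>Site Map</h1>\n<ul>\n"
        + items
        + "\n</ul>\n</body></html>\n"
    )


# Hypothetical dynamic pages to expose to the spiders.
links = [
    ("Blue widget", "/catalog.php?category=widgets&id=123"),
    ("Red gadget", "/catalog.php?category=gadgets&id=456"),
]

page = build_site_map(links)
```

Save the returned string as an ordinary .html file and link to it from your home page, so the spiders reach it through a static URL.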
An even
more effective technique would be to get other sites to
link to your dynamic pages. If these sites have good
Google PageRank, your dynamic pages will not only be
indexed, but the likelihood of their achieving a high
ranking for the keywords on them will increase
significantly.
3) Pay for inclusion?
AltaVista, Ask Jeeves/TEOMA, FAST and Inktomi offer
pay-per-inclusion (PPI) programs. You pay $25/page (or
so) to ensure that the page is spidered frequently (Inktomi
spiders every 48 hours for that price). This will garner
some traffic, but since Google now accounts for over 70%
of all search engine traffic and continues to grow
stronger all the time, don't throw too much money into
this solution unless you have deep pockets. If your site
is huge, the cost could be prohibitive. Paying to have
your pages spidered does not guarantee that they will
rank well, so they must be optimized properly. Frequent
spidering enables you to experiment with optimization
and see your results within a day or two. Search
engines, including those with PPI options, want their
databases to be as large as possible. So if you don't
pay for inclusion, and instead implement one of the
solutions discussed above, your pages will probably be
indexed anyway. On the other hand, if you pay for some
of your pages to be spidered, there's a good chance the
ones you don't pay for won't be.
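
To get a feel for how quickly the cost grows, here is a rough Python calculation. It assumes a flat $25 per page, the approximate figure quoted above, and ignores billing periods and any volume discounts, which vary by program. The page counts are illustrative; the largest matches the 500,000-item database mentioned earlier.

```python
PRICE_PER_PAGE = 25  # dollars; the approximate figure quoted above


def inclusion_cost(pages, price_per_page=PRICE_PER_PAGE):
    """Flat-rate cost of paying for inclusion of every page."""
    return pages * price_per_page


for pages in (50, 3000, 500000):
    print(f"{pages:>7,} pages: ${inclusion_cost(pages):,}")
#      50 pages: $1,250
#   3,000 pages: $75,000
# 500,000 pages: $12,500,000
```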
To summarize:
1. Search engines have problems indexing dynamic content.
2. If possible, use static URLs to reference
dynamic content.
3. Otherwise, try to link to your dynamic URLs
from static pages.
4. If your budget allows, consider using
paid-inclusion programs.