BUAD 307





04--Search engine optimization

    • Criteria and their relative impact
    • Algorithms
    • Reciprocal linking

PowerPoint Narration

Many Internet users find desired information and sites through search engines such as Google. Research shows that a large proportion of the traffic goes to the first three sites listed, and few people go to sites that appear beyond the first “page” or screen. On Google, the default results page shows ten sites, so being in the top ten is essential.

Because of the importance of search engines, getting a good ranking or coming up early on the list for important keywords is vitally important. Many consultants offer, for large fees, to help improve a site’s ranking.

There are several types of sites that are similar to search engines. Directories are sites that index information based on human analysis. Yahoo! started out that way, but now most of its information is accessed through search engine features. The Open Directory Project at http://www.dmoz.org indexes sites by volunteer human analysts. Some sites contain link collections as part of their sites—e.g., business magazines may have links to business information sites.

Several issues in search engines and directories are important. Some search engines, such as Google, base rankings strictly on merit (although sites are allowed to get preferred paid listings on the right side of the screen). Other search engines allow sites to “bid” to get listed first. Some sites may end up paying as much as a dollar for each surfer who clicks through. If a potential customer is valuable enough, it may be worth paying for enhanced listings. Often, however, it is better to be listed as number two or three since only more serious searchers are likely to go beyond the first site. The first listed site may attract a number of people who click through without much serious inspection of the site.

Some search engines are more specific than others. The goal of Google, Yahoo!, and MSN is to index as many sites as possible. Others may specialize in sites of a specific type to reduce the amount of irrelevant information that may come up.

Search engines often have different types of strategies. Google is very much technology oriented while Yahoo! appears to be more market oriented. Another major goal of Google is speed. Some sites may contain more content of one type than another. For example, AltaVista appears to have more images, as opposed to text pages, indexed.

Search engine rankings. The order in which different sites are listed for a given term is determined by a secret algorithm developed by the search engine. An algorithm is a collection of rules put together to identify the most relevant sites. The specific algorithms are highly guarded trade secrets, but most tend to heavily weigh the number of links from other sites to a site and the keywords involved. More credit is given for a link from a highly rated site—thus, having a link from CNN.com would count much more than one from the site of the Imperial Valley Press. On any given page, the weight given to a link will depend on the total number of links on that page. Being one of one hundred links will count less than being the only one. One source reports that the weight appears to be proportional, so that one out of one hundred links would carry one percent of the weight of being the sole link, but that may change and/or vary among search engines.
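The proportional weighting reported above can be sketched in a few lines of code. The ratings, names, and formula here are hypothetical illustrations; the actual algorithms are, as noted, trade secrets.

```python
# Sketch of the proportional link-weight idea: the credit a link passes
# is the linking site's rating spread evenly across all links on the page.
# Ratings and the formula itself are hypothetical, not any engine's actual rule.

def link_weight(linking_site_rating, links_on_page):
    """Credit passed by a single link from a page with the given rating."""
    return linking_site_rating / links_on_page

# One link out of one hundred on a page carries 1% of the weight
# that the sole link on an identically rated page would carry.
shared = link_weight(linking_site_rating=1000, links_on_page=100)
sole = link_weight(linking_site_rating=1000, links_on_page=1)
print(shared / sole)  # 0.01
```

This captures the “one out of one hundred links carries one percent of the weight” observation; a real engine would combine many such contributions and other signals.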

For Google, some of the main ranking factors appear to be:

  1. Number and quality of links to the site, as discussed above.  This is by far the most significant factor.
  2. Relevant keywords.  Note that the ranking algorithm tests for “spam.”  Recklessly repeating keywords may actually count against the rating of the site.
  3. The “click-through” share of the site.  Since late 2006 or early 2007, Google has reportedly fine-tuned rankings by observing the percentage of the time that a particular site is chosen for a given set of search terms.  Sites that are selected more frequently may improve in rank, and those less frequently selected—despite their merits presumed from the other factors—may move down.
  4. Location.  This factor has recently begun to affect results.  A user’s geographic location can usually be identified based on his or her “IP address,” a way of identifying a particular computer.  If one were to search for “Chinese restaurants” on a computer located at USC, many of the results would be general, discussing issues relevant to Chinese restaurants overall without any necessary reference to specific ones.  However, certain prominent Los Angeles area Chinese restaurants may also be listed.  In Portland, OR, the list would thus be different, possibly including some local establishments there.
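The four factors above can be illustrated as a single combined score. The weights below are invented purely for illustration; Google's actual weighting is secret, and the text notes only that link value dominates.

```python
# Hypothetical combination of the four ranking factors discussed above.
# The 0.6 / 0.2 / 0.15 / 0.05 weights are invented for illustration;
# the only thing the text supports is that link value dominates.

def rank_score(link_value, keyword_relevance, click_through_rate,
               location_match, spam_detected=False):
    if spam_detected:
        # Reckless keyword repetition counts against the site.
        return 0.0
    return (0.6 * link_value             # factor 1: by far the most significant
            + 0.2 * keyword_relevance    # factor 2: relevant keywords
            + 0.15 * click_through_rate  # factor 3: click-through share
            + 0.05 * location_match)     # factor 4: location

# A site with strong inbound links but no local relevance
# (all inputs on a 0-to-1 scale):
print(round(rank_score(0.9, 0.5, 0.4, 0.0), 2))  # 0.7
```

The spam branch reflects the note under factor 2: a site caught stuffing keywords can score worse than one that never used them.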

Types of search engines. Some engines, such as Google, are general purpose search engines. Some are specialized. Some are hybrids, containing some directory structure in addition to search engine capabilities. Some sites are aggregator sites—they do not have their own databases but instead combine the results from simultaneous searches on other search engines.  In 2009, Microsoft released Bing, the “decision engine,” which is intended to provide more “intuitive” results.  Rather than merely identifying a number of airlines offering fares between two cities, for example, Bing is intended to actually show those fares.  Yahoo! has signed an agreement to use Bing as its search engine source.

Text optimization. It is important to repeat important words as much as possible, subject to credibility. Search engines today are increasingly sophisticated in identifying “spamming” through frivolous repetition of the same words or use of words that are not relevant to the main content of the site. Words that appear early in the text and on the index page will tend to be weighted more heavily. For some search engines, it may be useful to include common misspellings of a word so that the site will come up when that spelling is used. For aesthetic reasons, many firms may object to having much text on the front page, but text may be put below the graphic elements.

Some web site owners have attempted to include hidden text so that a search engine would find the desired words while the visitor would see something else. Some web designers, for example, would hide text behind a graphic, make the text in a very small font, and/or make the font color the same, or nearly the same, as the background. Other web site designers have made a “legitimate” site, only to include a command that redirects the visitor to another site upon arrival. Search engines today are increasingly able to detect this type of abuse, and sites may be penalized as a result.
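One very simple way an engine might flag frivolous repetition is a keyword-density check. This is a naive sketch with an invented threshold; real spam detection is, as the text says, far more sophisticated.

```python
# Naive sketch of a keyword "spam" check based on density of one word.
# The 15% threshold is invented for illustration; real engines use
# much more sophisticated signals.

def keyword_density(text, keyword):
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

def looks_spammy(text, keyword, threshold=0.15):
    # Flag pages where a single keyword makes up an implausibly
    # large share of the visible text.
    return keyword_density(text, keyword) > threshold

honest = "We review Chinese restaurants in Los Angeles and beyond."
stuffed = "restaurants " * 20 + "best restaurants guide"

print(looks_spammy(honest, "restaurants"))   # False
print(looks_spammy(stuffed, "restaurants"))  # True
```

The same idea explains why reckless repetition can backfire: once the density crosses a plausibility threshold, repetition stops helping and starts counting against the site.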

Early search engines relied heavily on “meta tags,” in which the web site creator specified what he or she believed to be appropriate keywords, content descriptions, and titles. Because these tags were subject to a great deal of abuse, they no longer appear to be significant.

Link optimization. Many web sites engage in “link exchanges”—that is, complementary sites will agree to feature links to each other. It may be useful for a webmaster to ask firms whose content does not compete for a link. Sites should register with the Open Directory Project at http://www.dmoz.org since, if a site is classified favorably, this may help rankings.

The bottom line on Google.  Today, the most significant factor in search engine rankings appears to be the “value” of the links that reach a site.  Links from “low value” sites (those that are not rated highly, and especially those considered to be “spam”) count for very little.  Links from highly rated sites on the relevant keywords can count for literally thousands (sometimes tens or hundreds of thousands) of times as much as links from less important sites.  In the past, the presence of important key terms on a site was the main driver of rankings, subject to some rudimentary safeguards against obvious “spamming” sites that used the words as a way to gain rankings without providing relevant information.  Now, the effect of keywords is secondary except for searches that involve a very distinctive key term.  Search engines cannot usually measure the total amount of traffic that goes to a site, so traditionally traffic was not directly incorporated into the ranking system.  Within the last year, however, Google appears to have incorporated the frequency of “click-through” for a site when it is listed in search (“organic”) results.  That is, if a relatively high proportion of searchers choose a site when it comes up in search listings, its ranking is likely to improve; if relatively few searchers actually go to a highly ranked site when it shows up, that site is likely to lose rank.
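The click-through adjustment described above can be sketched as a simple feedback rule: a site chosen more often than expected for its position moves up, and one chosen less often moves down. The formula and all numbers here are hypothetical.

```python
# Hypothetical sketch of a click-through feedback adjustment.
# Neither the formula nor the learning rate reflects Google's
# actual (secret) mechanism.

def adjust_rank(current_score, observed_ctr, expected_ctr, learning_rate=0.5):
    # Positive adjustment when the site outperforms the click-through
    # rate expected for its position; negative when it underperforms.
    return current_score * (1 + learning_rate * (observed_ctr - expected_ctr))

# A highly ranked site that few searchers actually click loses rank:
print(round(adjust_rank(100.0, observed_ctr=0.05, expected_ctr=0.30), 1))  # 87.5
```

Run with an observed rate above the expected one, and the score rises instead, matching the text's point that frequently chosen sites improve in rank.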

Google now offers a set of “Analytics” tools, including a set of web traffic statistics.  Webmasters can sign up voluntarily to participate in this by placing certain “meta tag” code in their web pages.  (This code is invisible to people viewing the respective web page in its regular display mode).  Therefore, for such sites, Google does, in principle, have access to traffic information from all sources, including other search engines or links from other sites.  It is not clear whether Google actually uses this information, however.
