Search Engines, Directories, Etc.

By James Harvey Stout (deceased). This material is now in the public domain. The complete collection of Mr. Stout's writing is now at http://stout.mybravenet.com/public_html/h/

 

Jump to the following topics:

  1. What are search engines, directories, etc.?  
  2. The various types of search engines, directories, etc.
  3. Registering with search engines, etc.   
  4. Keywords.
  5. Meta tags.  
  6. Other suggestions for search engines, directories, etc.  

What are search engines, directories, etc.?

Search engines, directories, and similar services are collections of information about websites; the information includes the URL and description of each site. People visit these search engines, etc., to find websites regarding a particular topic, e.g., "internet marketing" or "baseball." We want to list our website in these search engines, directories, etc., because most people use these resources to find websites regarding a topic. And we want to be ranked among the top 20 in our category, so that people will see our listing; if we are ranked #31,546, we might as well not be listed at all. This chapter explains the fundamentals for gaining a high ranking; however, it does not attempt to give details for each search engine, etc., because (1) each one has its own criteria for determining the ranking of sites; (2) those criteria are constantly changing; (3) some of the criteria are kept secret; (4) this constantly changing information is better delivered by experts who actively track the changes in search engines and provide the data through consultations, websites, and mailing lists. The following information is generalized; for each search engine, etc., you need to decide which of the ideas will apply.

The various types of search engines, directories, etc.  

  1. Search engines. These are collections of links which gather their information via two means: (1) application forms where we can describe our site, and (2) "robots" or "spiders" or "crawlers," which are software programs which search the web for new sites (and they also search for changes in existing sites). Therefore, our site might be listed even if we do not submit an application. Some of the biggest search engines are HotBot, Lycos, InfoSeek, and WebCrawler.
  2. Directories. These are collections of links, but they gather their information only through application forms; they do not use spiders. Therefore, our site will not be listed unless we submit an application. The biggest directory is Yahoo.
  3. Specialized search engines and directories. These sites focus on a particular topic, e.g., a particular type of business, particular industries, particular geographical areas, etc.
  4. Free-for-all sites and classified-ad sites. These sites accept virtually any listing which is submitted; thus, the quality of listings is usually low. At some sites, the listings are not even categorized.

Registering with search engines, etc.    

  1. Before registering, we can make certain that our site is not listed already. It is possible that a robot has visited our site.
  2. We can prepare our information in advance. Before going to the search engines, etc., to register our site, we can write a description of our website (and the goods or services which are sold there). Each search engine, etc., allows a different number of words for our description, so we can prepare descriptions of various lengths: 20 words, 25 words, 50 words, etc. This description is very important, so it should not be improvised; instead, we need to give it as much care as we would give to our most-important ad copy (with rewriting, proofreading, and spell-checking). The description can be filled with our "keywords" (which are explained later), but we cannot simply use a list of keywords; instead, we need to write them into logical sentences. We will copy-and-paste this description into each application form.
  3. Read the rules for the application. Every search engine has different rules.
  4. Fill in every essential field in the application. Our application could be rejected simply for being incomplete.
  5. Be truthful in the application. After we submit an application to a search engine or directory, our site will be visited by a spider or by a human. If the website does not match our description, the site will not be listed in the search engine.
  6. Submit only one application for the entire site -- or submit each page individually. This is a gray area, depending on many factors:
    • We submit only the main page to a directory. Some directories (e.g., Yahoo) need only the URL of the main page. A human will visit the site, to gather information regarding the other pages, for the indexing of our site.
    • We submit only the main page to a search engine. If we submit the main page, the robots will use our pages' links to find (and index) the other pages of our site. However, some robots look at only the first page, or only the first few pages; the other pages will not be indexed. (One solution is to have a link from our main page directly to all of our important pages, so that the spider has to go down only one level to find those pages.)
    • We register the most-important pages. If we do this, we increase the possibility that those pages will be indexed. Even though our website has a particular overall theme, individual pages address different issues which should be indexed in a different category.
    • We register all of our pages. Some people take this option.
  7. Don't submit too many pages (or too many in a single day). The rules will indicate the maximum number of pages which will be accepted.
  8. Indicate the pages which are not to be indexed. If there are any pages which we do not want to be in the search engines, we can use various means to tell the robots to skip those pages.
    • A robots.txt file. This text file is put on our server to give instructions to robots; the file lists the paths of the pages or directories which are not to be indexed. We have various options.
      • The file must be in the root directory of the site, because robots look for it only there. Within the file, a rule can apply to the entire site, to one directory, or to a single page.
      • The file can give instructions to particular robots. (Search engines are not the only entities which operate robots; there are also robots for link-checking software, and for shopping agents, etc.) Our traffic software has a list of robots which have visited our site.
    • A meta tag. The four possible instructions are: index, noindex, follow, and nofollow. They tell the robot whether to index the page, and whether to follow the links from that page to other pages on our site. We use this syntax: <META NAME="robots" CONTENT="noindex, nofollow">. This meta tag (like other meta tags) goes within the <HEAD> section of the web page.
  9. We can re-submit pages later. As time goes by, our rank will gradually slip, as other people's websites are submitted to the search engines. (In some cases, a website simply "disappears" from a search engine, for unknown reasons.) We can re-submit the site every few months. (To avoid being accused of spamming the pages, we should make some significant changes in them before re-submitting these pages.)
  10. We can submit new pages. After our original application, we will probably continue to expand our website. These new pages can be submitted to the search engines.
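
The robots.txt instructions described in item 8 can be checked before we rely on them. The following is a minimal sketch in Python, using the standard library's urllib.robotparser module. The file contents, the paths, and the robot names ("SomeSearchBot" and "ExampleLinkChecker") are invented for illustration; they are not taken from any real site or robot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The first group applies to all robots;
# the second group shuts out one (made-up) link-checking robot entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/notes.html

User-agent: ExampleLinkChecker
Disallow: /
"""


def is_allowed(robot_name, path):
    """Return True if the rules above permit robot_name to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(robot_name, path)


if __name__ == "__main__":
    checks = [
        ("SomeSearchBot", "/index.html"),
        ("SomeSearchBot", "/private/report.html"),
        ("ExampleLinkChecker", "/index.html"),
    ]
    for robot, path in checks:
        verdict = "may fetch" if is_allowed(robot, path) else "must skip"
        print(f"{robot} {verdict} {path}")
```

This only shows how a well-behaved robot would read the file; a robot is free to ignore robots.txt, so the file is a request rather than a guarantee.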

Keywords.

"Keywords" are the words which are typed into a search engine's form to find websites on a particular topic; for example, if we want to go to websites regarding Corvettes, we would type in the word, "Corvettes." We need to use keywords very carefully on our website, because these are the words by which our potential customers will find us when they visit a search engine. We can use these guidelines in the use of keywords.

  1. We use keywords in the site-description which is requested on a search engine's submission form. In the description, the keywords should be written into logical sentences rather than just a list of the words.
  2. We can use specific terms. If we use a general term (e.g., "clothing"), our listing will be included in millions of other listings in that category; virtually no one will find us in a search. Instead, we need to use focused terms, (e.g., "leather gloves").
  3. We can use keywords frequently throughout the site. "Keyword density" refers to the number of times that our keywords appear on the site; search engines actually count the keywords. (Some webmasters calculate this keyword density as a percentage of the words which are keywords; they might strive for 10%.) However, the following techniques would be considered "spamming." (The possible penalty for this "keyword spamming": our website can be banned from the search engine.)
    • A simple list of keywords. Instead, use the keywords in actual sentences and phrases.
    • Excessive use of keywords. Each search engine has a limit on the number of allowable repetitions of keywords; in some cases, we will be penalized if we use a keyword more than three or four times on a page.
    • Hiding our keyword repetitions by putting the type into the same color as the background, so that the text is not visible. However, the search engines are aware of this trick, so they look for it.
  4. We can put keywords into these locations on our website:
    • The top portion of every page. Keywords which are in the top portion will attract the most attention from search engines. (If we have tables and graphics in the top portion, we leave less room for keywords.) Some search engines will create our listing simply by reading the first few sentences at the top of a page; we need to be certain that the page doesn't start with fluff, e.g., "Thank you for coming to my site."
    • The title. This is not the heading of the page; it is the phrase which appears above our page in the browser's "title" field; in html, it is between the <TITLE> and </TITLE> tags. We can use as many as 64 characters; if we use more than 64 characters, the extra characters will be cut off in our search-engine listing. (The title is important for another reason: these are the words which will show up in a browser's bookmark if our visitors bookmark the site.)
    • Text. Keywords need to be used throughout the text of our web pages, in virtually every paragraph.
    • Photo captions. Keywords can be used here, too.
    • Alt tags. An alt tag (strictly speaking, the ALT attribute of an <IMG> tag) is the text which is displayed when a graphic is absent from its place on the page (perhaps because our visitors have turned off the graphics in their browsers). Some search engines make note of the text in our alt tags, so these tags should contain our keywords.
    • Headings. These headings include the main heading on the page, and all of the subheadings. Search engines consider headings to be very important, so we should have some keywords here.
    • Links to external sites. If we have links to other people's sites, the words in our link description will be picked up by search engines. For example, I wanted to find the URL of a software company. I had forgotten that my "freeware page" has a link to that company. When I went to a search engine, and typed in the name of the software, my freeware page came up! I had not submitted that page to the search engine (nor had the software company); apparently a robot had found the page and had indexed it.
    • Guestbooks and discussion boards. The keywords will be indexed by robots.
  5. We can use variations of words:
    • We can use some single words. For example, "books" or "jewelry."
    • We can use some phrases. For example, "science fiction books" or "emerald jewelry." At search engines, most people use two or more words in their searches. Also, phrases help to narrow our focus; for example, instead of being virtually invisible in a list of 10,000 jewelry dealers, we stand out in a list of 10 emerald jewelry dealers.
    • We can use plural forms of the words, if possible. For example, if our keyword is "lamp," our site will come up in a search of "lamp," but not "lamps." But if we use the plural, "lamps," our site will come up in a search of "lamp" or "lamps."  
    • We can use longer terms. This is similar to the idea of using plural forms of words; for example, "lamps" shows up in a search for "lamp" (but not vice versa) -- and "engineering" shows up in a search for "engineer" (but not vice versa).
    • We can use synonyms. For example, we might think that we sell "sofas," but some people will search for "couches." We can get synonyms from a thesaurus -- in a book, or in most word-processing software.
    • We can use misspellings. For example, if we sell water faucets, we might want to add a misspelling such as "fawcets" so that our site will come up when the word is misspelled by people.
    • We can use upper case and lower case. Some search engines differentiate between capitalized letters and small letters; i.e., the search engines are "case-sensitive." For example, if our keyword is "Videocassettes," it might not show up in a search which uses a lower-case "v": "videocassettes."
    • We can use other variations. For example, if we sell water skis, we can use the following keywords: water skis, waterskis, waterskiing, water skiing, etc.
  6. We can use keywords which are based on the real-life use of the product or service.
    • Benefits and features. For example, our time-saving, money-saving product can use these keywords: time management, personal finances, etc.
    • Our audience. For example, if we sell rototillers, one of our keywords can be "gardeners."
    • Parts or ingredients. For example, if we sell snacks, we might use "chocolate" as a keyword.
  7. We can look at the keywords on our competitors' websites. Particularly if those websites have high rankings in a search, we can see the words, and how they are used throughout the page. We won't plagiarize the text itself, but we can get some ideas for our own keywords.
  8. We can check our traffic logs to see which keywords are being used to find our site. We can add more of those words to our pages.
  9. We should use different keywords for each page. Some keywords will be used on virtually all of our pages, but each page has a different emphasis. For example, if our entire site pertains to vacation travel, one page might use "Bahamas vacation" as a keyword phrase, while another page uses "Puerto Rico vacation" as a keyword phrase.
  10. We can use the most-popular words at the search engines. Some websites publish lists of the commonly used words.
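
The "keyword density" mentioned in item 3 can be estimated with a short script. The following is a rough sketch in Python, not the formula which any search engine actually uses (those formulas are not published). The function name keyword_report and the sample text are invented for illustration. A phrase of two words is counted as two words toward the percentage.

```python
import re

# Words are runs of letters, digits, and apostrophes; everything else
# (spaces, punctuation, hyphens) separates words.
WORD = re.compile(r"[a-z0-9']+")


def keyword_report(text, keywords):
    """Map each keyword (or phrase) to (count, percent of all words)."""
    words = WORD.findall(text.lower())
    total = len(words)
    report = {}
    for keyword in keywords:
        parts = WORD.findall(keyword.lower())
        n = len(parts)
        # Slide a window of n words across the page and count exact matches.
        count = 0
        if n:
            count = sum(
                1 for i in range(total - n + 1) if words[i:i + n] == parts
            )
        density = 100.0 * count * n / total if total else 0.0
        report[keyword] = (count, density)
    return report


if __name__ == "__main__":
    page = "Leather gloves for winter. Our leather gloves are warm."
    results = keyword_report(page, ["leather gloves", "gloves"])
    for keyword, (count, density) in results.items():
        print(f"{keyword!r}: {count} occurrences, {density:.1f}% of words")
```

A report like this lets us compare our pages against whatever target we have chosen (the text mentions 10%) and notice when a keyword is repeated more often than a search engine might tolerate.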

Meta tags.

Meta tags are html tags which describe a website. At a few of the major search engines, the robots look at our meta tags, to determine the content of our website for the purpose of indexing. Meta tags are not essential; only two or three search engines refer to them -- and if we do not have meta tags, those search engines will index our site by referring to the text of the page. (Meta tags are not visible on the page, but they can be viewed in "Page Source.")

  1. There are two types of meta tags which describe our site to search engines.
    • "Description" meta tag. This tag is a description of our site, in 200 characters or less. We should use as many keywords as possible -- but we are writing actual sentences, not merely a list of keywords. The syntax is: <META NAME="description" CONTENT="This is where we put the description of our site"> .
    • "Keywords" meta tag. This is a list of keywords, in 1,000 characters or less (including punctuation and spaces). Search engines will reject our page if we repeat a keyword too many times in this meta tag; the limit varies, but we are safe if we do not repeat a keyword more than three to five times. The words are separated by commas, with no spaces. The syntax is: <META NAME="keywords" CONTENT="psychology,self-improvement,happiness"> .
  2. Meta tags are placed at the top of a page. They are between the <HEAD> and </HEAD> tags, on every page of our site. If we have javascript on our page, the meta tags should be higher on the page than the javascript.
  3. Use different meta tags for each page of the site. Each page is different, so it requires its own meta tags.
  4. Don't use competitors' names as keywords. We might be tempted to use a competitor's name or product, so that a search for that name or product would bring up our site, too. However, this practice might be considered trademark infringement.
  5. Don't use keywords which are unrelated to the site. The practice is prohibited by search engines. And it will attract unqualified visitors, who are looking for the "sex" that we put into our meta tags, when our site is actually about electronic supplies.
  6. Put meta tags into frames, if you have frames. The tags go into the <FRAMESET> page.
  7. Use a "meta tag generator" if you need help in creating meta tags.
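
To show roughly what a meta tag generator does, here is a minimal sketch in Python. It builds the two tags from item 1 and enforces the limits given there (200 characters for the description, 1,000 characters for the keyword list). The function names and the repeat limit of three are assumptions chosen for this example; each search engine sets its own limits.

```python
from collections import Counter
from html import escape

DESCRIPTION_LIMIT = 200  # characters, as stated in the text
KEYWORDS_LIMIT = 1000    # characters, including punctuation and spaces
MAX_REPEATS = 3          # assumed conservative limit; real limits vary


def description_tag(description):
    """Build the "description" meta tag, or raise ValueError if too long."""
    if len(description) > DESCRIPTION_LIMIT:
        raise ValueError(f"description exceeds {DESCRIPTION_LIMIT} characters")
    return f'<META NAME="description" CONTENT="{escape(description, quote=True)}">'


def keywords_tag(keywords):
    """Build the "keywords" meta tag from a list of keywords or phrases."""
    cleaned = [keyword.strip() for keyword in keywords]
    # Count how many times each entry appears in the list (ignoring case).
    repeats = Counter(keyword.lower() for keyword in cleaned)
    overused = sorted(k for k, count in repeats.items() if count > MAX_REPEATS)
    if overused:
        raise ValueError(f"keywords repeated too often: {overused}")
    content = ",".join(cleaned)  # commas, no spaces between entries
    if len(content) > KEYWORDS_LIMIT:
        raise ValueError(f"keyword list exceeds {KEYWORDS_LIMIT} characters")
    return f'<META NAME="keywords" CONTENT="{escape(content, quote=True)}">'


if __name__ == "__main__":
    print(description_tag("This is where we put the description of our site"))
    print(keywords_tag(["psychology", "self-improvement", "happiness"]))
```

The two printed lines would then be pasted between the <HEAD> and </HEAD> tags of the page, with different values for each page of the site.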

Other suggestions for search engines, directories, etc.  

  1. We can report dead links in search engines. If we are ranked #10, but 2 of the preceding listings are dead links, we can report those dead links to the search engine's management. After those links are removed, we will automatically move up to #8. We are in the same relative position, but we are reducing the frustration of people who are using the search engine.
  2. We can report websites which violate the search engine's rules. If we turn in violators who out-rank us, we move up in the rankings. Those companies got their high ranking by cheating; that is not fair to the honest people. (However, there is a possibility that the company submitted that page when the current rules did not yet exist.)
  3. We can check our rankings periodically. The ranking will change as new websites are submitted, and dead ones are eliminated. Sometimes our website will disappear entirely from the search engine; when this happens, we have to re-submit the site. Some free services will reveal our rankings in the search engines.
  4. We can increase the number of links from other people's sites to our site. Search engines consider our "link popularity" when determining our rank; if there are many links to our site, we might receive a higher ranking.
  5. We can make certain that our site is always online. After our site has been accepted by a search engine, the robot will return occasionally to see whether the site is still here. (The site might be offline if our server is down for maintenance or repairs.) If the robot doesn't find our site, the search engine might remove our listing.
  6. We can avoid "bait and switch" tactics. This is the practice of submitting a website to the search engines, and then changing the site immediately after the site has been indexed. (The initial page might be designed merely for search-engine placement, while the replacement page is designed for actual sales.) However, this practice is not effective, because the robots can return at any time -- and they will then index our replacement page. There are variations of the bait-and-switch:
    • Some people have tried to use cgi scripts which can detect robots, so that the original page can be served. Apparently, this method does not work.
    • Another type of bait-and-switch (which is now prohibited) is done with a refresh page; the visitor comes to a page which is filled with keywords, but then the visitor is immediately transferred from that page to the actual main page.
  7. We can refrain from using similar pages with different URLs. Some people create more than one copy of their main page (or another page); perhaps they want to tailor the page for compliance with a particular search engine's criteria. However, this practice is prohibited by search engines. (In particular, the search engines look for filenames which indicate that there is more than one index page; for example, the filenames might be index1.htm, index2.htm, etc.)
  8. We can refrain from using free webhosts (e.g., geocities, tripod). Some of the search engines reject any website which is based at a free webhosting service.
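
Item 7 mentions filenames such as index1.htm and index2.htm as a sign of duplicated pages. Before submitting our site, we could scan our own file list for that pattern. The following Python sketch is only an illustration; the function name numbered_copies is invented, and the checks which the search engines actually perform are not known.

```python
import re
from collections import defaultdict

# Matches names such as index1.htm or Index12.html: a stem, a trailing
# number, and an .htm or .html extension.
NUMBERED = re.compile(r"^(?P<stem>.*?)(?P<num>\d+)(?P<ext>\.html?)$", re.IGNORECASE)


def numbered_copies(filenames):
    """Group files which differ only by a trailing number in the name.

    Returns a dict which maps the un-numbered name (e.g. "index.htm") to
    the sorted list of numbered files, for groups of two or more files.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = NUMBERED.match(name)
        if match:
            key = (match.group("stem").lower(), match.group("ext").lower())
            groups[key].append(name)
    return {
        stem + ext: sorted(names)
        for (stem, ext), names in groups.items()
        if len(names) > 1
    }


if __name__ == "__main__":
    files = ["index.htm", "index1.htm", "index2.htm", "about.htm", "page1.html"]
    for base, copies in numbered_copies(files).items():
        print(f"possible copies of {base}: {', '.join(copies)}")
```

A match is only a hint; numbered filenames can be legitimate (e.g., chapter1.htm and chapter2.htm), so we would still compare the pages by eye.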
