This study explores the extraction and representation schemes used by some of the more popular search engines that provide access to information content on the web. Its goal is to test their harvesting and indexing behavior using a single web site organized to expose the details of these processes. The web site "Project Whistlestop" (http://www.whistlestop.org) contains web pages that were submitted for registration to five popular web-based search engines. Project Whistlestop includes a digital archive of some of the important communications of Harry Truman, President of the United States from 1945 until 1953. These artifacts are part of the Harry S. Truman Presidential Library and Museum located in Independence, Missouri. The project examines a subset of the search engines linked from the Netscape Communications homepage (http://home.netscape.com). The five search engines chosen were those found in the directory bar under NETSEARCH on the Netscape homepage: Lycos, InfoSeek, AltaVista, Excite, and Yahoo!. Background data on web site activity were monitored for approximately 30 days before the Whistlestop pages were registered (9/17/97 through 10/17/97). Beginning October 17, 1997, the Whistlestop URL was formally registered with each of the search engines under consideration, and the date and time of web page submission to each search engine were recorded. Page extraction and the creation of indexed pages were examined by tracking specially inserted character strings placed at varying levels within the web site. It was found that different search engines harvest and index at different rates, using different procedures.
The study concludes that more experimental designs are needed to better understand the extraction and representation procedures used by various web-based search engines, so that web designers can proactively submit information content in a form that improves search performance for end users.
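The tracking of inserted character strings described above can be illustrated with a minimal sketch. It assumes, hypothetically, that one unique marker string is placed on a page at each depth of the site and that the text a search engine has indexed can be obtained as a plain string. The function names, the marker format, and the sample text are illustrative assumptions, not the procedure actually used in the study.

```python
import hashlib
from typing import Dict, Optional


def make_marker(site_id: str, depth: int) -> str:
    """Build a unique, unlikely-to-occur marker string for one site depth.

    The format is a hypothetical choice made for this sketch only.
    """
    digest = hashlib.sha1(f"{site_id}:{depth}".encode("utf-8")).hexdigest()[:10]
    return f"zq{depth}x{digest}"


def deepest_level_indexed(markers: Dict[int, str], indexed_text: str) -> Optional[int]:
    """Return the deepest site level whose marker appears in the indexed text.

    Returns None when no marker is found at all.
    """
    found = [depth for depth, marker in markers.items() if marker in indexed_text]
    return max(found) if found else None


# One marker per level: 0 = home page, 1 = first level of links, and so on.
markers = {depth: make_marker("whistlestop", depth) for depth in range(4)}

# Stand-in for text retrieved from a search engine's index (assumed data).
sample_index_text = f"Project Whistlestop {markers[0]} archive {markers[1]}"

level = deepest_level_indexed(markers, sample_index_text)
```

Comparing the value returned for each search engine over time would indicate how deeply, and how quickly, each one harvested the site.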