Although originally created merely as repositories for compounds synthesized within an organization, chemical databases can now be searched to give novel ideas for lead discovery.

The familiar chemical-structure diagrams are not amenable to computational operations such as database searching, so several types of chemical-structure representation have been developed by theoretical chemists for use in computer systems. The predominant form is the atom–bond connection table.

A connection table along with two-dimensional coordinates for display is generally sufficient to identify the substance. However, to perform energy calculations or to determine whether the compound has the potential to bind to a receptor or enzyme of interest, three-dimensional coordinates are necessary.

Chemical structures differ considerably from other entities that are commonly stored in databases, such as text, and so the various search modes also differ considerably, although some parallels can be drawn.

Exact-match searches, which might be performed to find out whether a proposed new structure already exists in a database, can be thought of as looking up a complete word in a dictionary.

Substructure searches, in which a user picks pieces of a chemical structure and requests that the system return the set of compounds that contain those pieces, are analogous to a wild-carded text search.

Similarity searches, which might be performed if a user wants compounds that a chemist would intuitively judge to resemble the compound of interest but that are not necessarily an exact or substructure match, are analogous to a 'sounds like' text search.

In pharmacophoric searches, assumptions about which groups of atoms on the small molecule are involved in binding are combined with the spatial relationship of these groups to give a three-dimensional query. Although the process is generally slower than the search types mentioned above, the results provide an indication of whether a set of structures can bind to a receptor or enzyme, and so hits might be very valuable in the drug design process.

Molecular docking (placing a series of candidate molecules from a database into the active site of a protein to evaluate how well the compounds might bind to the receptor or enzyme) has become a more popular mode of database searching owing to continual improvement in the quality of the docking and scoring algorithms.

Once a database search has been performed, the list of potential molecules for biological testing can be refined by filtering (removing molecules deemed to have unsuitable properties), clustering (grouping similar compounds) and human inspection.
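To make the atom–bond connection table and the exact-match and substructure search modes more concrete, the following is a minimal sketch in Python. The `Molecule` class, the function names and the four-compound toy database are all assumptions introduced here for illustration, hydrogens are left implicit, and the matcher is a plain backtracking search rather than the method of any particular database system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Hypothetical atom-bond connection table; hydrogens are left implicit."""
    name: str
    atoms: tuple  # element symbol of each atom, indexed by position
    bonds: tuple  # (atom_index, atom_index, bond_order) triples

    def adjacency(self):
        """Return {atom index: {neighbor index: bond order}}."""
        adj = {i: {} for i in range(len(self.atoms))}
        for i, j, order in self.bonds:
            adj[i][j] = order
            adj[j][i] = order
        return adj


def is_substructure(query, target):
    """Backtracking search for an atom mapping that embeds query in target."""
    q_adj, t_adj = query.adjacency(), target.adjacency()

    def extend(mapping):
        q = len(mapping)  # query atoms are placed in index order
        if q == len(query.atoms):
            return True
        for t in range(len(target.atoms)):
            if t in mapping.values() or query.atoms[q] != target.atoms[t]:
                continue
            # Each bond from q to an already-placed query atom must exist
            # in the target with the same bond order.
            if all(t_adj[t].get(mapping[n]) == order
                   for n, order in q_adj[q].items() if n in mapping):
                mapping[q] = t
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})


def is_exact_match(query, target):
    """Same atom and bond counts plus an embedding means the same graph."""
    return (len(query.atoms) == len(target.atoms)
            and len(query.bonds) == len(target.bonds)
            and is_substructure(query, target))


# Toy database (illustrative only).
DATABASE = [
    Molecule("ethanol", ("C", "C", "O"), ((0, 1, 1), (1, 2, 1))),
    Molecule("acetic acid", ("C", "C", "O", "O"),
             ((0, 1, 1), (1, 2, 2), (1, 3, 1))),
    Molecule("dimethyl ether", ("C", "O", "C"), ((0, 1, 1), (1, 2, 1))),
    Molecule("propane", ("C", "C", "C"), ((0, 1, 1), (1, 2, 1))),
]

# Substructure query: a carbon double-bonded to an oxygen.
carbonyl = Molecule("C=O query", ("C", "O"), ((0, 1, 2),))
carbonyl_hits = [m.name for m in DATABASE if is_substructure(carbonyl, m)]

# Exact-match query: ethanol with its atoms numbered in a different order.
proposed = Molecule("proposed", ("O", "C", "C"), ((0, 1, 1), (1, 2, 1)))
exact_hits = [m.name for m in DATABASE if is_exact_match(proposed, m)]
```

The exact-match test succeeds regardless of how the atoms are numbered, while dimethyl ether, which has the same atoms as ethanol but different connectivity, is rejected. Atom-by-atom matching of this kind is costly on large collections, so a practical system would likely screen candidates with a cheaper test first.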
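The similarity-search and filtering steps described above can be sketched in a similar way. The example below assumes that each compound is summarized by a set of hand-assigned structural keys and scores similarity with the Tanimoto coefficient, which is one common choice and is not named in the text. The key names, the similarity threshold of 0.4 and the molecular-weight cutoff of 200 g/mol are arbitrary values chosen purely for illustration.

```python
# Toy database: name -> (structural keys, molecular weight in g/mol).
# The keys are hand-assigned for this sketch, not output from real software.
DATABASE = {
    "salicylic acid": ({"benzene ring", "carboxylic acid", "phenol"}, 138.12),
    "benzoic acid": ({"benzene ring", "carboxylic acid"}, 122.12),
    "paracetamol": ({"benzene ring", "phenol", "amide"}, 151.16),
    "ibuprofen": ({"benzene ring", "carboxylic acid", "isobutyl"}, 206.28),
}


def tanimoto(a, b):
    """Shared keys divided by total distinct keys (0.0 if both sets are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(query_keys, database, threshold):
    """Return (score, name) pairs at or above threshold, best score first."""
    scored = [(tanimoto(query_keys, keys), name)
              for name, (keys, _weight) in database.items()]
    return sorted((hit for hit in scored if hit[0] >= threshold), reverse=True)


def filter_by_weight(hits, database, max_weight):
    """Drop hits whose molecular weight exceeds max_weight."""
    return [(score, name) for score, name in hits
            if database[name][1] <= max_weight]


# Query: aspirin, described with the same hand-assigned keys.
ASPIRIN_KEYS = {"benzene ring", "carboxylic acid", "ester"}

hits = similarity_search(ASPIRIN_KEYS, DATABASE, threshold=0.4)
refined = filter_by_weight(hits, DATABASE, max_weight=200.0)

for score, name in refined:
    print(f"{name}: {score:.2f}")
```

Here the similarity search returns compounds that share keys with the query without requiring an exact or substructure match, and the weight filter then removes one of the hits. Clustering and human inspection, the other refinement steps mentioned above, are not shown.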