ABSTRACT

Information retrieval (IR) is concerned with techniques that can provide effective access to large collections of objects containing primarily text. Objects in the collection may take many forms, for example, scientific journal articles, messages in an electronic mail archive, medical reports, encyclopedia articles, or user manuals. Objects may also exhibit complex structure in which one object is formed by combining several others (e.g., chapters may be viewed as objects that make up a book). In most of what follows, we will assume that the objects of interest are documents.