Automatic Structuring of Sublanguage Information

doi:10.4324/9781315802206-6

ABSTRACT

Computer processing of free-text input can be used to obtain a database of patient information that contains, in structured form, the relations among the medical events and observations recorded in the narrative. The information in the narrative can be structured by a computerized procedure because it is expressed in a small number of sublanguage sentence types. For example, there is a treatment sentence type, a test and result sentence type, and a patient state sentence type. Each type is a syntactic relation among medical and English word classes. Once defined from an analysis of sample patient documents, these types can be used as target information structures into which the narrative portions of medical records can be mapped. This paper describes an implementation of text processing that uses English syntactic analysis and sublanguage word class combinations to determine which of the known sublanguage sentence types an input occurrence conforms to. The implementation then transforms the occurrence into a standard form for the information carried by sentences of that type.
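
The mapping from a narrative sentence to a sentence type can be illustrated with a minimal sketch. It is not the implementation described in this paper: the lexicon, the word class names, the three patterns, and the function `classify` are all hypothetical, and a flat comparison of word class sequences stands in for the English syntactic analysis that the actual system performs.

```python
from typing import Optional

# Hypothetical lexicon: word -> sublanguage word class (illustrative only).
LEXICON = {
    "patient": "PATIENT",
    "received": "V-TREAT",
    "given": "V-TREAT",
    "penicillin": "MEDICATION",
    "x-ray": "TEST",
    "culture": "TEST",
    "showed": "V-SHOW",
    "revealed": "V-SHOW",
    "infiltrate": "FINDING",
    "fever": "SIGN-SYMPTOM",
    "has": "V-HAVE",
    "had": "V-HAVE",
}

# Hypothetical sentence types: each is an ordered list of
# (word class, slot name in the structured output).
SENTENCE_TYPES = {
    "TREATMENT": [
        ("PATIENT", "patient"), ("V-TREAT", "verb"), ("MEDICATION", "medication"),
    ],
    "TEST-RESULT": [
        ("TEST", "test"), ("V-SHOW", "verb"), ("FINDING", "result"),
    ],
    "PATIENT-STATE": [
        ("PATIENT", "patient"), ("V-HAVE", "verb"), ("SIGN-SYMPTOM", "state"),
    ],
}


def classify(sentence: str) -> Optional[dict]:
    """Return a structured record if the sentence fits a known type, else None."""
    # Crude tokenization; a real system would parse the sentence instead.
    words = [w.strip(".,;").lower() for w in sentence.split()]
    # Keep only words that belong to a known word class.
    tagged = [(w, LEXICON[w]) for w in words if w in LEXICON]
    classes = [cls for _, cls in tagged]
    for type_name, pattern in SENTENCE_TYPES.items():
        if classes == [cls for cls, _ in pattern]:
            # Fill the slots of the target structure in pattern order.
            record = {"type": type_name}
            for (word, _), (_, slot) in zip(tagged, pattern):
                record[slot] = word
            return record
    return None


if __name__ == "__main__":
    for text in (
        "The patient received penicillin.",
        "Chest x-ray showed infiltrate.",
        "Patient had fever.",
    ):
        print(classify(text))
```

Under these assumptions, each sentence type acts as a fixed record layout, so sentences that word the same kind of information differently end up in the same slots. A sentence whose word classes fit none of the patterns yields `None` rather than a forced match.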