ABSTRACT

Document image understanding aims to extract and classify individual items of meaningful data from paper-based documents. To date, many methods and approaches have been proposed for structure recognition in various kinds of documents, for technical enhancement of OCR, and for meeting the requirements of practical use. Although the technical research issues of the early stage are regarded as complementary approaches to traditional OCR, which depends on character recognition techniques, the range of applications and the related issues have since been widely investigated and still call for further research. This chapter surveys current topics in document image understanding from a technical point of view.