ABSTRACT

Document Processing Using Machine Learning presents a set of resources for students and researchers who apply machine learning to document image analysis (DIA), covering multiple document processing problems. Starting with an explanation of how Artificial Intelligence (AI) plays an important role in this domain, the book then discusses how different machine learning algorithms can be applied to classification/recognition and clustering problems regardless of the type of input data: images or text.

In brief, the book offers comprehensive coverage of the most essential topics, including:

· The role of AI for document image analysis

· Optical character recognition

· Machine learning algorithms for document analysis

· Extreme learning machines and their applications

· Mathematical foundation for Web text document analysis

· Social media data analysis

· Modalities for document dataset generation

This book serves both undergraduate and graduate students in Computer Science, Information Technology, and Electrical and Computer Engineering. It is also a good fit for early-career research scientists and industry practitioners in the domain.

chapter 1|14 pages

Artificial Intelligence for Document Image Analysis

By Himadri Mukherjee, Payel Rakshit, Ankita Dhar, Sk Md Obaidullah, KC Santosh, Santanu Phadikar, Kaushik Roy

chapter 2|14 pages

An Approach toward Character Recognition of Bangla Handwritten Isolated Characters

By Payel Rakshit, Chayan Halder, Kaushik Roy

chapter 3|15 pages

Artistic Multi-Character Script Identification

By Mridul Ghosh, Himadri Mukherjee, Sk Md Obaidullah, KC Santosh, Nibaran Das, Kaushik Roy

chapter 4|10 pages

A Study on the Extreme Learning Machine and Its Applications

By Himadri Mukherjee, Sahana Das, Subhashmita Ghosh, Sk Md Obaidullah, KC Santosh, Nibaran Das, Kaushik Roy

chapter 5|15 pages

A Graph-Based Text Classification Model for Web Text Documents

By Ankita Dhar, Niladri Sekhar Dash, Kaushik Roy

chapter 6|16 pages

A Study of Distance Metrics in Document Classification

By Ankita Dhar, Niladri Sekhar Dash, Kaushik Roy

chapter 7|15 pages

A Study of Proximity of Domains for Text Categorization

By Ankita Dhar, Niladri Sekhar Dash, Kaushik Roy

chapter 9|23 pages

The Effect of Using Features Computed from Generated Offline Images for Online Bangla Handwritten Character Recognition

By Shibaprasad Sen, Ankan Bhattacharyya, Kaushik Roy

chapter 10|18 pages

Handwritten Character Recognition for Palm-Leaf Manuscripts

By Papangkorn Inkeaw, Jeerayut Chaijaruwanich, Jakramate Bootkrajang