OCR detection of checkboxes from PDF files
Date
2022-06-10
Publisher
UNIVERSITY of M'SILA
Abstract
Automatic information extraction from scanned images is valuable in many fields, such as
medicine and computer science, where the scanned documents may be exam sheets, disease
cards, and similar forms. In this dissertation, we propose an algorithm that detects the
positions of checkboxes and their values (checked or unchecked) by combining deep learning
with other techniques such as OCR. First, we convert the PDF file into images, one per
page, then load each image into our algorithm and detect the checkbox regions using OCR.
After that, we crop these regions into smaller images, which a deep learning model then
classifies into the appropriate classes.
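
The crop-and-classify stage described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes the page has already been rasterized and binarized, that the checkbox regions have already been located, and it swaps the deep learning classifier for a simple ink-ratio threshold so the example stays self-contained. The names `Box`, `crop`, `classify`, and `read_checkboxes`, the `(top, left, height, width)` box layout, and the 0.2 threshold are all hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # binarized page: 1 = dark (ink) pixel, 0 = background
Box = Tuple[int, int, int, int]  # assumed layout: (top, left, height, width)


def crop(page: Image, box: Box) -> Image:
    """Cut one checkbox region out of a page image."""
    top, left, height, width = box
    return [row[left:left + width] for row in page[top:top + height]]


def classify(region: Image, border: int = 1, threshold: float = 0.2) -> str:
    """Stand-in for the trained classifier (assumption, not the author's model).

    Ignores the outer `border` pixels (the box outline) and labels the region
    by the fraction of dark pixels that remain inside.
    """
    inner = [row[border:len(row) - border]
             for row in region[border:len(region) - border]]
    total = sum(len(row) for row in inner)
    if total == 0:
        return "unchecked"
    ink = sum(sum(row) for row in inner)
    return "checked" if ink / total > threshold else "unchecked"


def read_checkboxes(page: Image, boxes: List[Box]) -> List[Tuple[Box, str]]:
    """Crop every detected region and attach its predicted class."""
    return [(box, classify(crop(page, box))) for box in boxes]
```

In a fuller pipeline, `classify` would be replaced by a trained image classifier and the list of boxes would come from the detection step; only the crop-then-classify structure is taken from the abstract.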
Keywords
deep learning, artificial intelligence, convolutional neural network, optical character recognition, checkbox detection, training model