OCR detection of checkboxes from PDF files

Date

2022-06-10

Publisher

UNIVERSITY of M'SILA

Abstract

Automatic information extraction from scanned images is valuable in many fields, such as medicine and computer science, where the scanned documents may be exam sheets, disease cards, and similar forms. In this dissertation, we propose an algorithm that detects the positions of checkboxes and their values (checked or unchecked) by combining deep learning with other techniques such as OCR. First, we convert the PDF file into images, one per page. We then load each image into our algorithm and detect the checkbox regions using OCR. Finally, we crop these regions into smaller images and use deep learning techniques to classify each cropped image into the appropriate class.
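
The crop-then-classify stage described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the dissertation's implementation. It assumes that the PDF-to-image conversion and the OCR-based region detection have already produced a binarized page and a list of checkbox bounding boxes, and it swaps in a simple ink-ratio threshold as a stand-in for the trained deep learning classifier. All names here (`crop`, `classify`, `read_checkboxes`, `make_demo_page`) and the `border` and `threshold` parameters are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # binarized page: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]  # (x, y, width, height) of a detected checkbox


def crop(page: Image, box: Box) -> Image:
    """Cut the region described by `box` out of the page."""
    x, y, w, h = box
    return [row[x:x + w] for row in page[y:y + h]]


def classify(patch: Image, border: int = 1, threshold: float = 0.2) -> str:
    """Stand-in for the deep learning classifier (an assumption, not the
    author's model): ignore the box outline and call the box checked when
    enough interior pixels contain ink."""
    inner = [row[border:len(row) - border]
             for row in patch[border:len(patch) - border]]
    total = sum(len(row) for row in inner)
    if total == 0:
        return "unchecked"
    ink = sum(sum(row) for row in inner)
    return "checked" if ink / total >= threshold else "unchecked"


def read_checkboxes(page: Image, boxes: List[Box]) -> List[Tuple[Box, str]]:
    """Crop every detected region and classify it."""
    return [(box, classify(crop(page, box))) for box in boxes]


def make_demo_page() -> Image:
    """Build a tiny synthetic page with an empty box and a crossed box."""
    page = [[0] * 14 for _ in range(7)]
    for x0, checked in ((1, False), (8, True)):
        for i in range(5):
            page[1][x0 + i] = 1      # top edge
            page[5][x0 + i] = 1      # bottom edge
            page[1 + i][x0] = 1      # left edge
            page[1 + i][x0 + 4] = 1  # right edge
        if checked:
            for i in range(1, 4):    # draw an X inside the box
                page[1 + i][x0 + i] = 1
                page[1 + i][x0 + 4 - i] = 1
    return page


if __name__ == "__main__":
    # Hypothetical detector output: two 5x5 boxes on the demo page.
    demo_boxes = [(1, 1, 5, 5), (8, 1, 5, 5)]
    for box, label in read_checkboxes(make_demo_page(), demo_boxes):
        print(box, label)
```

In a real pipeline, `classify` would be replaced by a trained image classifier, and the boxes would come from the detection step rather than being hard-coded.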

Keywords

deep learning, artificial intelligence, convolutional neural network, optical character recognition, checkbox detection, training model
