Object Detection Task in Visual Question Answering
Date
2023-06-10
Publisher
University of M'sila
Abstract
This research proposes a novel approach to Visual Question Answering (VQA) that
uses object detection features as the model's image features in place of
traditional CNN features. The aim is to exploit specific information about the objects
present in the image to improve performance on the VQA task. In the experiments, the
model reached an accuracy of 76% on Yes/No questions, 43% on counting questions, and
47% on other questions. These results suggest that object detection features improve
how the model understands and processes visual information when answering questions
about images.
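
The per-category figures above can be read as accuracy computed separately for each question type. The following is a minimal sketch of that calculation in Python, assuming simple exact-match scoring; the abstract does not state which scoring rule was used, and the function name, record layout, and toy data are hypothetical and not taken from the experiments.

```python
from collections import defaultdict


def accuracy_by_type(records):
    """Return {question_type: accuracy in percent} using exact-match scoring.

    `records` is an iterable of (question_type, predicted, ground_truth)
    tuples. This layout is an assumption made for illustration only.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for qtype, predicted, truth in records:
        total[qtype] += 1
        # Compare case-insensitively, ignoring surrounding whitespace.
        if predicted.strip().lower() == truth.strip().lower():
            correct[qtype] += 1
    return {qtype: 100.0 * correct[qtype] / total[qtype] for qtype in total}


# Toy data for illustration; these are not results from the experiments.
toy = [
    ("yes/no", "yes", "yes"),
    ("yes/no", "no", "yes"),
    ("number", "2", "2"),
    ("other", "red", "blue"),
]
scores = accuracy_by_type(toy)
print(scores)
```

Reporting accuracy per question type, rather than as a single number, shows where a model is weak: in the results above, counting questions score well below Yes/No questions.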
Keywords
Visual Question Answering (VQA), object detection, image features, CNN, VQA v2 dataset.