Towards Improved Performance of Text Detection Algorithms: Exploring the Impact of Automatic Image Classification and Blind Deconvolution
Abstract
Purpose – This study investigates the impact of the proposed BD-CRAFT, a variant of the CRAFT algorithm that adds two preprocessing steps, Laplacian-based classification of images as blurry or non-blurry and deblurring by blind deconvolution, on the performance of the top three state-of-the-art scene text detection algorithms: SenseTime, TextFuseNet, and TencentAILab.
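As an illustration of the blurry versus non-blurry classification step, the sketch below uses the variance of the Laplacian response, a common sharpness heuristic. The abstract only states that a Laplacian is used, so the variance criterion, the 4-neighbour kernel, the function names, and the threshold of 100.0 are all assumptions made for this example, not the authors' settings. The blind deconvolution step is not shown.

```python
from statistics import pvariance


def laplacian_response(image):
    """Apply a 4-neighbour discrete Laplacian to every interior pixel.

    `image` is a 2-D grayscale image given as a list of equal-length rows.
    """
    height, width = len(image), len(image[0])
    response = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response.append(
                image[y - 1][x] + image[y + 1][x]
                + image[y][x - 1] + image[y][x + 1]
                - 4 * image[y][x]
            )
    return response


def is_blurry(image, threshold=100.0):
    """Label an image blurry when its Laplacian variance is below `threshold`.

    The threshold is a hypothetical value; in practice it would be tuned
    on the dataset at hand.
    """
    return pvariance(laplacian_response(image)) < threshold


# A checkerboard has strong edges; a flat image has none.
sharp = [[255 * ((x + y) % 2) for x in range(6)] for y in range(6)]
flat = [[128] * 6 for _ in range(6)]
print(is_blurry(sharp), is_blurry(flat))
```

Under this heuristic, only images labeled blurry would be passed on to a deblurring step before text detection.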
Methodology – The researchers used the ICDAR 2013 Focused Scene Text Competition Challenge 2 dataset and the Intersection over Union (IoU) metric to evaluate the proposed BD-CRAFT. The IoU h-mean of each of the top three algorithms was compared against that of its modified version.
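For readers unfamiliar with these metrics, the following minimal sketch shows how IoU and the h-mean (the harmonic mean of precision and recall) are computed. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function names are illustrative, and this is not the official ICDAR evaluation script.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def h_mean(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(h_mean(0.5, 1.0))
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; precision and recall are then computed from those matches and combined into the h-mean.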
Results – Each algorithm variant significantly improves the overall h-mean and some of the precision and recall values. TextFuseNet + BD-CRAFT yields an h-mean of 93.55%, and its precision improves by over 4% to reach 95.71%. TencentAILab + BD-CRAFT achieves an h-mean of 94.77%, with improvements in precision and recall. SenseTime + BD-CRAFT achieves an h-mean of 95.22%, an improvement of over 4% that makes it the top-ranked algorithm.
Conclusion – The evidence shows that combining BD-CRAFT with other algorithms improves their performance; BD-CRAFT therefore has a significant impact on the text detection performance of these algorithms.
Recommendation – Further studies could investigate whether other state-of-the-art scene text detection algorithms would also benefit from BD-CRAFT, and could explore other preprocessing techniques that can be incorporated into text detection algorithms in general.
Research Implication – Although the performance of current state-of-the-art algorithms is already commendable, using image classification and blind deconvolution as preprocessing techniques helps the top-performing text detection algorithms perform better on natural scene images; hence, the proposed method can be used to improve scene text detection.
This work is licensed under a Creative Commons Attribution 4.0 International License.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.