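I have been working on an image processing / OCR script that lets me extract the letters (using tesseract) from the boxes in the image below.
After a lot of processing, I was able to get the image to look like this.
To remove the noise, I inverted the image, then applied flood fill and Gaussian blurs. This is what I ended up with next.
After some thresholding and erosion to remove noise (erosion being the step that distorts the text), I was able to get the image to look like this before running it through tesseract.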
This is a pretty good rendering and gives fairly accurate results through tesseract, though it sometimes fails because it reads the hash (#) as an H or W. This leads me to my question!
Is there a way, using opencv, skimage, or PIL (opencv preferably), to sharpen this image and increase my chances of tesseract reading it properly? Or is there a way to get from the third image to the final image WITHOUT using erosion, which ultimately distorts the text in the image?
Any help would be greatly appreciated!
OpenCV does have functions like filter2D, which convolves an arbitrary kernel with a given image. In particular, you can use kernels designed for image sharpening. The main question is whether this will improve the results of your OCR library or not. The image is already pretty sharp, and the noise in it is not a result of blur. I have never worked with tesseract myself, but I am fairly sure that it already does all the noise reduction it can, and 'helping' it in this process may actually have the opposite effect. For example, any sharpening process tends to amplify noise (as opposed to noise reduction processes, which usually blur the image). Most computer vision libraries give better results when provided with raw (unprocessed) images.
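Edit (after question update): There are multiple ways to do so. The first one that I would test is this: your first binary image is pretty clean and sharp. Instead of using morphological operations that reduce the quality of the letters, switch to filtering contours. Use the findContours function to find all contours in the image and store their hierarchy (i.e. which contour lies inside which). Of all the contours found, you actually need only the first- and second-level ones, i.e. the outer and inner contours of each letter (the zero-level contour is the outermost one). The other contours can be discarded. Among the first-level contours, you can discard those whose bounding box is too small to be a real letter. After these two discarding steps, I would expect most of the remaining contours to be parts of letters. Draw them on a white image and run OCR. (If you want white letters on a black background, you will need to reverse the order of the vertices in the contours.)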