
Reading 7-segment displays from images in Python to obtain data (OCR)


Using Python + pyocr + Tesseract

Python has a library for almost everything, which reminds me once again how great it is. As it turned out, though, this combination is very hard to use. I'm not sure how well it would perform even if the camera were firmly fixed.

Where the practical, realistic line is

With today's technology this is just barely possible (even on a smartphone). Things here still run at a Showa-era level, so if we are going to make major changes, I think this approach is a realistic one (a cryptic sentence, I know).

Notes (open issues)

link
I’ve been using this site as a reference for a while now…
Note: in the binarization step, the variable should be img_gray, not gray, because the name used above is img_gray.
I followed the rest of the flow as written, but nothing was recognized.
Binarization is troublesome. The threshold value changes depending on the image, and if it is set automatically, the result is terrible.
Tesseract also seems to be very strict: even a slight deviation in angle makes the text unreadable. It can only tolerate about one degree.
For this reason:
(1) Binarization should be done manually, so that it is done properly.
(2) For the image I prepared, a value of about 2 works for the morphology transform, with a kernel size of 5: np.ones((5,5),np.uint8).
(3) Correct the angle (by the value I measured).
(4) tesseract_layout=8 is better (6 looks like the right choice, but 8 works better).
With these changes, I finally got it to work.

Conclusion (a bit useless)

I'm having a lot of trouble deciding whether to continue with this approach or take a different route. The angle shift depends on how the camera is fixed, so at the very least the binarization needs to be done automatically and cleanly.
It would be better to use our own images.
I’m also interested in SSOCR, but since I want to learn about machine learning, I think I’ll attack it from a different direction.

The first line of output is the result of setting the language to eng and
the second line of output is the result of setting the language to letsgodigital.
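A comparison like this can be reproduced with something along the lines of the sketch below. The helper names read_with_languages and main are hypothetical, and it assumes that Tesseract is installed and that a letsgodigital traineddata file has been placed in its tessdata directory. Layout 8 is Tesseract's page segmentation mode that treats the image as a single word.

```python
def read_with_languages(tool, image, builder, langs=("eng", "letsgodigital")):
    """Run the same image through the OCR tool once per language."""
    return [tool.image_to_string(image, lang=lang, builder=builder) for lang in langs]


def main(path):
    # Imported here so the helper above can be tried without Tesseract installed.
    import pyocr
    import pyocr.builders
    from PIL import Image

    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("No OCR tool found; is Tesseract installed?")

    # tesseract_layout=8: treat the whole image as a single word.
    builder = pyocr.builders.TextBuilder(tesseract_layout=8)
    for text in read_with_languages(tools[0], Image.open(path), builder):
        print(text)
```

Calling main with the path to a preprocessed image prints one line per language, eng first and letsgodigital second.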
