OCR made easy using tesserocr

There are numerous OCR libraries for Python. tesserocr is the only one I found that has a decent, humanly approachable API.

What is it exactly?

tesserocr is a simple, Pillow-friendly wrapper around the tesseract-ocr API.

Pillow is a friendly fork of PIL (the Python Imaging Library).

Extracting text from a nutrition facts image

We’ll extract text from this image:

First, install all the requirements:

$ sudo apt install tesseract-ocr \
    libtesseract-dev \
    libleptonica-dev
$ pip install Pillow cython tesserocr

Now run the following gist:
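A minimal sketch of such a script, not necessarily the original gist. It assumes tesserocr's image_to_text helper, which accepts a Pillow image, and the default English language data. The tidy function is a hypothetical addition of mine that drops blank lines from the raw output.

```python
import sys


def tidy(text):
    """Strip surrounding whitespace from each line and drop empty lines.

    Hypothetical helper (not part of tesserocr): Tesseract output
    often contains blank lines between text blocks.
    """
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def ocr_image(path):
    """Return the text Tesseract recognizes in the image at `path`."""
    # Imported here so that tidy() stays usable on machines
    # without Tesseract, Pillow, or tesserocr installed.
    from PIL import Image
    import tesserocr

    with Image.open(path) as image:
        # image_to_text takes a Pillow image and returns a str.
        # Assumption: the default language ("eng") data is installed.
        return tidy(tesserocr.image_to_text(image))


if __name__ == "__main__":
    if len(sys.argv) == 2:
        print(ocr_image(sys.argv[1]))
    else:
        print("usage: python ocr.py /path/to/image", file=sys.stderr)
```

Saved as ocr.py, the script takes the image path as its only argument. If you already have a file path and do not need Pillow at all, tesserocr also provides file_to_text, which reads the image from disk itself.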

And voilà!

$ python ocr.py /path/to/chocolate.jpg
Nutrition Facts
Serving Size 1 cup (249g)
Servings Per Container 8
―
Amount Per Sewing
Calories 210 Calories from Fat 80
% Daily Value"
Total Fat 8g 13%
Saturated Fat 5g 26%
Trans Fat 0g
Cholesterol 30mg 10%
Sodium 200mg 9%
Total Carbohydrate 27g 9%
Dietary Fiber 1g 5%
Sugars 25g
Protein 9g
Vitamin A 6% - Vitamin C 0%
Calcium 30% - Iron 6%
Vitamin D 30%
*Percent Daily Values are based on a 2,000 calorie diet.
