Tesserocr工具：简化Python中的光学字符识别-源码站

项目概述

这是一个用于Tesseract-OCR API的Python封装库，旨在简化光学字符识别（OCR）功能的使用，方便开发者在Python项目中集成文本识别能力。

项目地址

https://github.com/sirfz/tesserocr

项目页面预览

sirfz/tesserocr preview

关键指标

Stars：2145
主要语言：Python
License：MIT License
最近更新：2026-01-13T15:39:28Z
默认分支：master

本站高速下载（国内可用）

点击下载（本站镜像）
– SHA256：e5c66c25e88169ec9711763b16a4856a2fd5044d011c591d829f962977af5d0b

安装部署要点（README 精选）

（未获取到 README 安装段落，建议查看项目文档。）

常用命令（从 README 提取）

You can use the `simonflueckiger <https://anaconda.org/simonflueckiger/tesserocr>`_ channel to install from Conda:

::

    > conda install -c simonflueckiger tesserocr

Or alternatively the `conda-forge <https://anaconda.org/conda-forge/tesserocr>`_ channel:

::

    > conda install -c conda-forge tesserocr

pip

If you need Windows tessocr package and your Python version is not supported by above mentioned project,
you can try to follow `step by step instructions for Windows 64bit` in `Windows.build.md`_.

.. _Windows.build.md: Windows.build.md

tessdata
========

You may need to point to the tessdata path if it cannot be detected automatically. This can be done by setting the ``TESSDATA_PREFIX`` environment variable or by passing the path to ``PyTessBaseAPI`` (e.g.: ``PyTessBaseAPI(path='/usr/share/tessdata')``). The path should contain ``.traineddata`` files which can be found at https://github.com/tesseract-ocr/tessdata.

Make sure you have the correct version of traineddata for your ``tesseract --version``.

You can list the current supported languages on your system using the ``get_languages`` function:

.. code:: python

    from tesserocr import get_languages

    print(get_languages('/usr/share/tessdata'))  # or any other path that applies to your system

Usage
=====

Initialize and re-use the tesseract API instance to score multiple
images:

.. code:: python

    from tesserocr import PyTessBaseAPI

    images = ['sample.jpg', 'sample2.jpg', 'sample3.jpg']

    with PyTessBaseAPI() as api:
        for img in images:
            api.SetImageFile(img)
            print(api.GetUTF8Text())
            print(api.AllWordConfidences())
    # api is automatically finalized when used in a with-statement (context manager).
    # otherwise api.End() should be explicitly called when it's no longer needed.

``PyTessBaseAPI`` exposes several tesseract API methods. Make sure you
read their docstrings for more info.

Basic example using available helper functions:

.. code:: python

    import tesserocr
    from PIL import Image

    print(tesserocr.tesseract_version())  # print tesseract-ocr version
    print(tesserocr.get_languages())  # prints tessdata path and list of available languages

    image = Image.open('sample.jpg')
    print(tesserocr.image_to_text(image))  # print ocr text from image
    # or
    print(tesserocr.file_to_text('sample.jpg'))

``image_to_text`` and ``file_to_text`` can be used with ``threading`` to
concurrently process multiple images which is highly efficient.

Advanced API Examples
---------------------

GetComponentImages example:

.. code:: python

    from PIL import Image
    from tesserocr import PyTessBaseAPI, PSM

    with PyTessBaseAPI(psm=PSM.AUTO_OSD) as api:
        image = Image.open("/usr/src/tesseract/testing/eurotext.tif")
        api.SetImage(image)
        api.Recognize()

        it = api.AnalyseLayout()
        orientation, direction, order, deskew_angle = it.Orientation()
        print("Orientation: {:d}".format(orientation))
        print("WritingDirection: {:d}".format(direction))
        print("TextlineOrder: {:d}".format(order))
        print("Deskew angle: {:.4f}".format(deskew_angle))

or more simply with ``OSD_ONLY`` page segmentation mode:

.. code:: python

    from tesserocr import PyTessBaseAPI, PSM

    with PyTessBaseAPI(psm=PSM.OSD_ONLY) as api:
        api.SetImageFile("/usr/src/tesseract/testing/eurotext.tif")

        os = api.DetectOS()
        print("Orientation: {orientation}\nOrientation confidence: {oconfidence}\n"
              "Script: {script}\nScript confidence: {sconfidence}".format(**os))

more human-readable info with tesseract 4+ (demonstrates LSTM engine usage):

.. code:: python

    from tesserocr import PyTessBaseAPI, PSM, OEM

    with PyTessBaseAPI(psm=PSM.OSD_ONLY, oem=OEM.LSTM_ONLY) as api:
        api.SetImageFile("/usr/src/tesseract/testing/eurotext.tif")

        os = api.DetectOrientationScript()
        print("Orientation: {orient_deg}\nOrientation confidence: {orient_conf}\n"
              "Script: {script_name}\nScript confidence: {script_conf}".format(**os))

Iterator over the classifier choices for a single symbol: