tesseract

OCR (Optical Character Recognition) engine. More information: https://github.com/tesseract-ocr/tesseract.

Recognize text in an image and save it to output.txt (the .txt extension is added automatically):
tesseract {{image.png}} {{output}}
Specify a custom language (default is English) with an ISO 639-2 code (e.g. deu = Deutsch = German):
tesseract -l deu {{image.png}} {{output}}
List the ISO 639-2 codes of available languages:
tesseract --list-langs
Specify a custom page segmentation mode (default is 3; Tesseract 3.x spelled this option `-psm`):
tesseract {{image.png}} {{output}} --psm {{0_to_13}}
List page segmentation modes and their descriptions:
tesseract --help-psm
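
The commands above handle one image at a time. As a minimal sketch of combining them, the shell function below runs the recognition command over every PNG file in a directory. The function name `ocr_dir`, its arguments, and the restriction to `.png` files are assumptions made for illustration and are not part of Tesseract itself:

```shell
# Hypothetical helper (not part of tesseract): OCR every PNG in a directory.
# Usage: ocr_dir DIRECTORY [LANGUAGE]   (LANGUAGE defaults to eng)
ocr_dir() {
    dir=$1
    lang=${2:-eng}
    for img in "$dir"/*.png; do
        # Skip the unexpanded pattern when the directory has no PNG files.
        [ -e "$img" ] || continue
        # The output base is the image path without ".png";
        # tesseract appends the .txt extension itself.
        tesseract "$img" "${img%.png}" -l "$lang"
    done
}
```

For example, `ocr_dir scans deu` would write `scans/page1.txt` next to `scans/page1.png`, assuming the German language data is installed.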

License and Disclaimer

The content on this page is copyright © 2014—present the tldr-pages team and contributors.
This page is used with permission under Creative Commons Attribution 4.0 International License.

While we do attempt to make sure content is accurate, there isn't a warranty of any kind.