## Installing and Using Tesseract OCR

### 1. Tool overview

Tesseract-OCR is an open-source OCR (Optical Character Recognition) engine, originally developed at HP Labs and later maintained by Google. Unlike Microsoft Office Document Imaging (MODI), it can be trained on additional data, so its image-to-text accuracy can keep improving. A team with deeper requirements can also use it as the base for an OCR engine tailored to its own needs.

Language packs: [https://github.com/tesseract-ocr/tessdata](https://github.com/tesseract-ocr/tessdata)

Direct installer download: [http://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-4.00.00dev.exe](http://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-setup-4.00.00dev.exe)

Tesseract download page: [https://digi.bib.uni-mannheim.de/tesseract/](https://digi.bib.uni-mannheim.de/tesseract/)

### 2. Configuring environment variables

#### 2.1 Open the environment variable settings

Right-click This PC > Properties > Advanced system settings > Environment Variables > System variables > Path.

#### 2.2 Add the install directory to Path

Find `Path` under System variables and add the Tesseract-OCR install directory to it.

#### 2.3 Add the tessdata system variable

Create a new system variable named `TESSDATA_PREFIX`. Its value is the path of the `tessdata` folder, which sits inside the Tesseract-OCR install directory.

[Screenshot: the New System Variable dialog with `TESSDATA_PREFIX`]

### 3. Using Tesseract-OCR

#### 3.1 Check the installation

Open cmd and run the command below to print the version. If it runs normally, the installation succeeded:

`tesseract --version`

[Screenshot: output of `tesseract --version`]

#### 3.2 Recognize an image

Use the following command to recognize the text in an image:

`tesseract <image path> <output file>`

Tesseract appends `.txt` to the output name by itself, so an output name of `2` produces `2.txt`.
[Screenshot: running the recognition command]

Open the output file `2.txt`:

[Screenshots: the output file and its contents]

The result is correct!
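The environment check from section 2 and the recognition command from section 3.2 can also be driven from a script. The block below is a minimal Python sketch, not part of the original walkthrough. The function names `check_setup`, `build_command`, and `run_ocr` are hypothetical, and the optional `-l` language flag is an addition that assumes the matching language pack is already present in the `tessdata` folder.

```python
import os
import shutil
import subprocess
from pathlib import Path


def check_setup(env=None):
    """Return a list of problems found in the Tesseract environment setup."""
    env = os.environ if env is None else env
    problems = []
    # Section 2.2: the install directory must be on Path so `tesseract` resolves.
    if shutil.which("tesseract", path=env.get("PATH")) is None:
        problems.append("tesseract executable not found on PATH")
    # Section 2.3: TESSDATA_PREFIX should point at an existing tessdata folder.
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        problems.append("TESSDATA_PREFIX is not set")
    elif not Path(prefix).is_dir():
        problems.append(f"TESSDATA_PREFIX is not a directory: {prefix}")
    return problems


def build_command(image_path, output_base, lang=None):
    """Build the argument list for `tesseract <image path> <output file>`."""
    cmd = ["tesseract", str(image_path), str(output_base)]
    if lang:
        # Assumption: the caller wants a specific language pack, e.g. "chi_sim".
        cmd += ["-l", lang]
    return cmd


def run_ocr(image_path, output_base, lang=None):
    """Run Tesseract and return the text it wrote to <output_base>.txt."""
    subprocess.run(build_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".txt" to the output name itself.
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

For example, `run_ocr("image.png", "2")` would write `2.txt` in the current directory and return its contents, where `image.png` is a placeholder file name. Calling `check_setup()` first gives a readable list of what is missing instead of a failed subprocess call.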
Open a terminal and run `tesseract -v` to see the version information. Run `tesseract --list-langs` to list the languages that your Tesseract-OCR installation supports.

[Screenshot: output of `tesseract --list-langs`]

## Downloading Tesseract OCR language packs

Tesseract OCR language packs can be downloaded from https://github.com/tesseract-ocr/tessdata

Well-written articles on the topic:

[How to recognize Chinese with Tesseract (欧世乐, CSDN blog)](https://blog.csdn.net/qq_43576028/article/details/102907722)

[Configuring, installing, and using Tesseract on Windows (欧世乐, CSDN blog)](https://blog.csdn.net/qq_43576028/article/details/102907170)

References:

1. [https://blog.csdn.net/qq_37193537/article/details/81335165](https://blog.csdn.net/qq_37193537/article/details/81335165)
2. [https://blog.csdn.net/weixin_43656359/article/details/103401848](https://blog.csdn.net/weixin_43656359/article/details/103401848)

![](https://img-blog.csdnimg.cn/20210430134609740.png#pic_center)

Last modified: August 1, 2021

© Reposting with proper attribution is permitted.