Comparison of optical character recognition software


This '''comparison of optical character recognition software''' includes:
{| class="wikitable sortable"
! Name !! Founded year !! Latest stable version !! Release year !! License !! Online !! Windows !! Mac OS X !! Linux !! BSD !! Programming language !! SDK? !! Languages !! Fonts !! Output Formats !! Notes
|-
| Google Drive OCR or Google Cloud Vision || 2015 || || || || Yes || Browser || Browser || Browser || Unknown || Unknown || Yes || 200+ || All fonts || text || Google blog post
|-
| Tesseract || 1985 || 4.1.1 || 2019 || || || || || || || C++, C || || 100+ || Any printed font || Text, ALTO, hOCR, PDF, others with different user interfaces or the API || Created by Hewlett-Packard; under further development by Google
|-
| ABBYY FineReader || 1989 || 15 || 2019 || || || || || || || C/C++ || || 192 || All fonts || DOC, DOCX, XLS, XLSX, PPTX, RTF, PDF, HTML, CSV, TXT, ODT, DjVu, EPUB, FB2 || ABBYY also supplies SDKs for embedded and mobile devices. Professional, Corporate and Site License Editions for Windows, Express Edition for Mac.
|-
| E-aksharayan || 2010 || || || || || || || || || || || 14 || || RTF, TXT, BRL ||
|-
| Asprise OCR SDK || 1998 || 15 || 2015 || || || || || || || Java, C#, VB.NET, C/C++/Delphi || || 20+ || || Plain text, searchable PDF, XML || Java, C#, VB.NET, C/C++/Delphi SDKs for OCR and barcode recognition on Windows, Linux, Mac OS X and Unix.
|-
| AnyDoc Software || 1989 || || || || || || || || || VBScript || || || || || Works with structured, semi-structured, and unstructured documents.
|-
| CuneiForm || 1996 || 1.1 || 2011-04-19 || || || || || || || C/C++ || || 28 || Any printed font || HTML, hOCR, native, RTF, TeX, TXT || Enterprise-class system; can save text formatting and recognizes complicated tables of any structure
|-
| Dynamsoft OCR SDK || 2003 || 8.2 || 2012 || || || || || || || C/C++ || || 40+ || || PDF, TXT ||
|-
| OmniPage || 1970s || 19.2 || 2015 || || || || || || || C/C++, C# || || 125 || Machine and handprinted fonts || DOC/DOCX, XLS/XLSX, PPTX, RTF, PDF, PDF/A, Searchable PDF, HTML, Text, XML, ePUB, MP3 || Product of Nuance Communications
|-
| Microsoft Office OneNote 2007 || || 2011 || 2007 || || || || || || || || || || || ||
|-
| GOCR || 2000 || 0.52 || 2018-10-15 || || || || || || || C || || 20+ || || ||
|-
| Ocrad || || 0.26 || 2017-03-31 || || || || || || || C++ || || Latin alphabet || || || Command line
|-
| SmartScore || 1991 || 10.5.8 || 2015-07 || || || || || || || || || || || || For musical scores
|-
| Microsoft Office Document Imaging || || Office 2007 || 2007 || || || || || || || || || || || || Uses OmniPage
|-
| Puma.NET || || || 2009-10-29 || || || || || || || C# || || 28 || Any printed font || || .NET OCR SDK based on Cognitive Technologies' CuneiForm recognition engine. Wraps Puma COM server and provides a simplified API for .NET applications
|-
| ReadSoft || || || || || || || || || || || || 14 || || || Scan, capture and classify business documents such as invoices, forms and purchase orders, integrated with business processes.
|-
| Scantron || || || || || || || || || || || || || || || For working with localized interfaces, corresponding language support is required.
|-
| OCRFeeder || 2009-03 || 0.8.1 || 2014-12-22 || || || || || || || Python || || || || || Features a full user interface and has a command-line tool for automatic operations. Has its own segmentation algorithm but uses system-wide OCR engines like Tesseract or Ocrad
|-
| OCRopus || 2007 || 1.3.3 || 2017-12-16 || || || || || || || Python || || All languages using Latin script || Normal Latin script and Fraktur || TXT, hOCR, PDF || Pluggable framework under active development, used for Google Books
|-
| OCRvision || 2019 || || || || || || || || || || || 90+ || || Searchable PDF ||
|}

Evaluation

An analysis of the accuracy and reliability of four OCR packages (Google Docs OCR, Tesseract, ABBYY FineReader, and Transym), using a dataset of 1227 images from 15 different categories, concluded that Google Docs OCR and ABBYY FineReader performed better than the others.
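
The analysis above is not described in enough detail here to say which accuracy metric it used. As an illustration only, one widely used measure for comparing OCR output against a ground-truth transcription is the character error rate (CER): the Levenshtein edit distance between the two strings divided by the length of the ground truth. The sketch below assumes plain Python strings; the function names and the sample strings are hypothetical and are not taken from the study or from any of the packages listed.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning reference into hypothesis."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # drop a reference character
                current[j - 1] + 1,      # add a hypothesis character
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / length of the ground-truth text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth and OCR output, for illustration only.
    truth = "optical character recognition"
    ocr_output = "optical charactcr recogniti0n"
    print(f"CER: {character_error_rate(truth, ocr_output):.3f}")
```

A lower CER means the output is closer to the ground truth. Averaging the value per image category would give one number per package and category, which is one possible way to compare packages across a categorized image set.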