Best-in-class, AI-driven optical character recognition and machine translation deliver
image conversions to MS Office formats, image tables converted into Excel, PDF conversions into Word, searchable PDFs, and translated images and PDFs.
Technical Features Overview
210 Written Languages
- Arabic, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Norwegian, Polish, Portuguese, Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Thai, Turkish, Ukrainian, Vietnamese, and more…
Automated Document Analysis
- Artificial intelligence and machine learning are applied to improve accuracy and document layout reconstruction
- Document layout reconstruction, incl. internal structure and formatting
- Detection and recreation of balanced columns of text
- Detection of tables and layout reconstruction ensures that tables (even ones without visible column borders) are processed correctly
- Accurate font detection and mapping
Powerful Server and API for Integration
- Enterprise class scalability and processing features.
- Scale to tens of thousands of pages per hour.
- Automatically identify the document’s language.
- Submit batch files for processing via API.
- Integrate your own applications and workflows seamlessly.
- Dynamically scale server resources up and down based on demand.
Advanced Image Pre-Processing
Image pre-processing increases recognition accuracy by optimizing the image for OCR. Even low-quality images can yield good OCR results once automated image correction steps are applied. Pre-processing features include:
- Auto-cropping and auto-splitting of dual pages
- Filtering of color stamps and marks, noise removal, and local contrast improvement
- Image mirroring, inverting, scaling, cropping and clipping
- Automated detection of page orientation (90, 180, and 270 degrees)
- Automated splitting of double-pages
- Camera OCR
- Deskew (up to +/- 20 degrees) and rotate images
- Automated distortion correction, image despeckling/clean-up, ISO noise reduction
- Despeckling images in individual blocks
- Texture filtering and adaptive binarization
- Adjusting text and background color
- Text line straightening
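To illustrate what adaptive binarization means in the list above, the sketch below thresholds each pixel against the mean of its local neighborhood rather than against one global value. This is a generic textbook technique in plain Python, not Language Studio's actual implementation; the function name, window size, and offset are assumptions chosen for the example.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def adaptive_binarize(image: Image, window: int = 3, offset: int = 5) -> Image:
    """Binarize with a local mean threshold (illustrative sketch only)."""
    height = len(image)
    width = len(image[0]) if height else 0
    half = window // 2
    result: Image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the image borders.
            y0, y1 = max(0, y - half), min(height, y + half + 1)
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            neighborhood = [image[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(neighborhood) / len(neighborhood)
            # A pixel becomes foreground (0) when it is clearly darker
            # than its surroundings; otherwise it becomes background (255).
            row.append(0 if image[y][x] < local_mean - offset else 255)
        result.append(row)
    return result
```

Because the threshold follows the local background, a dark character on a dim, poorly lit region is still separated from its surroundings, whereas a single global threshold would turn the whole region black.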
Unrivalled Photo Processing (Camera OCR)
Digital cameras, smartphones and tablets take pictures with suitable resolution and image quality, but the pictures typically contain many device-specific and user-introduced distortions that make the printed text difficult to read.
Artificial intelligence identifies images captured by a digital camera and implements special image processing algorithms to eliminate distortion on digital photos, such as blur, curved text lines and other errors caused by insufficient light.
- Correct image resolution
- Straighten curved lines
- Automatically correct 3D perspective distortion
Speed and Accuracy
A balance between speed and accuracy is achieved by optimizing the configuration to match your requirements.
- Switch between thorough and fast recognition modes.
- Consistently outperforms other OCR products for accuracy and document layout reconstruction in independent evaluations
- Uses the latest artificial intelligence and machine learning
- Integrated dictionaries are provided for many languages, with support for your own custom dictionaries and character patterns.
- When converting many pages, such as complete document archives or books, developers can leverage Language Studio’s flexible and scalable architecture
- By using multi-core CPUs and processing images in parallel on multiple threads, the OCR steps can be performed significantly faster
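The idea of processing page images in parallel can be sketched with Python's standard `concurrent.futures` module. The `recognize_page` function here is a hypothetical stand-in for a real OCR call and is not part of Language Studio; the worker count is likewise an assumption for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def recognize_page(page_path: str) -> str:
    """Hypothetical placeholder for a real OCR call on one page image."""
    return f"text of {page_path}"


def recognize_pages(
    page_paths: List[str],
    recognize: Callable[[str], str] = recognize_page,
    max_workers: int = 4,
) -> List[str]:
    """Run the recognizer over all pages in parallel, preserving page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, whichever page finishes first.
        return list(pool.map(recognize, page_paths))
```

In Python, threads speed things up only when the per-page work waits on I/O or runs in native code that releases the interpreter lock; for pure-Python CPU-bound work, `ProcessPoolExecutor` would be the usual substitute.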
Understanding Core Technical Features
System Requirements
Align the system specification to your workload
For smaller, low-volume deployments, our out-of-the-box single-server configuration should be sufficient for all features. For higher volumes and scalable deployments, the Omniscien team will guide you on the hardware requirements and specifications that match your anticipated workload.
Requirements Summary:
Feature | Description |
---|---|
Memory | |
Hard Disk Space | |
Other | |
Fonts | |
Translation and Natural Language Processing (NLP)
Get more value from OCR data with NLP and translation
Make your OCR apps smarter. Use Natural Language Processing tools to get more from your data. Easily enable applications to extract context, syntax, parts of speech, key terms, sentiment, meaning, summarize voice content, and even translate your OCR data and documents into other languages.
- Translate images into another language by automatically converting the image to a Microsoft Office format using OCR and then translating it, keeping the layout, structure and fonts.
- Extract text, email addresses, URLs, etc.
- Extract key phrases and terminology
- Analyze sentiment, syntax, parts of speech, etc.
- Determine the language of a document
- Automatically detect and extract tables from images into Excel
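As a small illustration of extracting email addresses and URLs from recognized text, the sketch below uses simple regular expressions. The patterns are deliberately loose approximations and the function name is an assumption for this example; they do not represent how Language Studio performs extraction.

```python
import re
from typing import Dict, List

# Loose, illustrative patterns; real-world addresses and URLs are more varied.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract_contacts(text: str) -> Dict[str, List[str]]:
    """Return email addresses and URLs found in OCR output text."""
    # Strip sentence punctuation that often trails a URL in running text.
    urls = [url.rstrip(".,;:)") for url in URL_RE.findall(text)]
    emails = EMAIL_RE.findall(text)
    return {"emails": emails, "urls": urls}
```

OCR output can contain misrecognized characters (for example `l` read as `1`), so results from pattern matching like this are best treated as candidates to be validated downstream.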
Scalability
Scale to Thousands of Pages and Users
During the OCR process, a range of different algorithms are applied depending on image quality, document languages, layout complexity and the number of pages in the document. Some of these algorithms require more memory than others, so it is recommended to allocate system memory in accordance with the outlined memory requirements to optimize processing speed. The out-of-the-box single-server configuration is suitable for smaller organizations. The Omniscien team will guide you on deployments that have higher demands.
- With built-in load balancing, Language Studio can scale servers up on demand to meet even the highest of loads.
- Language Studio’s architecture is designed for high availability and scaling.
RESTful API
Integrate OCR into your Applications
Use the RESTful APIs to power your applications with Language Studio’s artificial intelligence based tools.
- Process multiple files concurrently by submitting them via the batch mode API
- A large array of settings can be configured for processing and for output document format control
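The shape of a batch submission could look roughly like the sketch below, which only builds an HTTP request object and does not send it. Everything specific here is hypothetical: the base URL, the `/ocr/batch` path, the JSON field names, and the bearer-token header are placeholders invented for illustration, since this page does not document the actual endpoints. Consult the Language Studio API documentation for the real interface.

```python
import json
import urllib.request
from typing import Dict, List

# Hypothetical placeholder; the real base URL comes from your deployment.
BASE_URL = "https://language-studio.example.com/api"


def build_batch_request(
    files: List[str], settings: Dict[str, str], api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request describing a batch OCR job."""
    # Field names "files" and "settings" are assumptions for this sketch.
    payload = {"files": files, "settings": settings}
    return urllib.request.Request(
        url=f"{BASE_URL}/ocr/batch",  # hypothetical path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
```

Separating request construction from sending makes the payload easy to inspect and test before it is pointed at a real server.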
Input Image and Document Formats
A wide variety of image formats are supported
Note: Images must be no larger than 32,512 × 32,512 pixels.
File Extension | Description |
---|---|
BMP | BMP |
DCX | DCX |
GIF | GIF |
JB2 | JBIG2 |
JPG, JPEG, JFIF | Joint Photographic Experts Group (gray, color) |
JP2, JPC, J2K | JPEG 2000 |
PCX | PiCture eXchange |
PDF | Portable Document Format |
PNG | Portable Network Graphics |
TIF, TIFF | Tagged Image File Format |
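A client application might check files against the listed extensions and the pixel limit before submitting them. The following is a minimal sketch under that assumption; the function and constant names are made up for this example, and the caller is assumed to already know the image's width and height.

```python
from pathlib import Path

MAX_SIDE_PIXELS = 32_512  # per-side limit stated in the note on image size
SUPPORTED_EXTENSIONS = {
    "bmp", "dcx", "gif", "jb2", "jpg", "jpeg", "jfif",
    "jp2", "jpc", "j2k", "pcx", "pdf", "png", "tif", "tiff",
}


def is_acceptable_input(filename: str, width: int, height: int) -> bool:
    """Check the file extension and pixel dimensions before upload."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        return False
    # Both sides must be positive and within the stated limit.
    return 0 < width <= MAX_SIDE_PIXELS and 0 < height <= MAX_SIDE_PIXELS
```

For example, `is_acceptable_input("scan.TIFF", 2480, 3508)` passes, while a file named `scan.webp` or an image 32,513 pixels wide is rejected.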
Output Document Formats
Output to a wide range of formats
Language Studio can save the recognized text in the following formats:
File Extension | Description |
---|---|
XML* | ALTO (Analyzed Layout and Text Object). *ALTO 3.0 is an XML Schema that details technical metadata for describing the layout and content of physical text resources, such as pages of a book or a newspaper. |
CSV | Comma Separated Values |
DOCX / DOC | Microsoft Word |
EPUB | Electronic Publication |
FB2 | FictionBook 2.0 |
HTML / HTML5 | Hyper Text Markup Language |
ODT | OpenDocument Text Document |
PDF | Portable Document Format |
RTF | Rich Text Format |
TXT | Plain Text |
XLSX / XLS | Microsoft Excel |
XML | Extensible Markup Language |