OCR (Optical Character Recognition) – a technical overview

Optical Character Recognition (OCR) is software that reads the text in an image and converts it into a text file, which can then be edited, searched, or translated. An OCR system comprises an optical scanner for capturing the text and specialized software for converting the image to text. This software eases the reading of complicated letters, books, and journals. OCR can read text in a wide variety of fonts. Unfortunately, OCR does not offer strong support for handwritten text. This article covers the technology behind OCR, the essential elements of OCR, the principles OCR is based on, and how a scanned image is converted to an editable file using OCR.

Technology Behind Optical Character Recognition

ABBYY FineReader is one of the most advanced OCR products currently available. OCR generally works on three basic principles: integrity, purposefulness, and adaptability. OCR is easy to use and consists of three steps: scanning a document, recognizing it, and storing it in the right format (RTF, XLS, PDF, or TXT).

How does OCR recognize text? It first divides the document page into elements such as blocks of text, images, and tables. Each line is then partitioned into words, and each word into characters. Once the characters have been isolated, the software compares them with a set of pattern images and forms several hypotheses about what each character is. Based on these hypotheses, the program explores different ways of subdividing lines into words and words into characters. After processing a very large number of such probabilistic hypotheses, the software makes its decision and presents the recognized text.

Modern OCR software such as ABBYY FineReader provides dictionary support for 45+ languages. This enables additional analysis of text elements at the word level, which makes recognition more accurate and helps extract text from complex documents. Integrity, purposefulness, and adaptability are the three principles that give OCR reliability and an intelligence approaching human recognition.
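To make the idea of weighing hypotheses against a dictionary more concrete, here is a minimal sketch in Python. It is only an illustration and not how ABBYY FineReader or any other product works internally: the function name best_word, the per-character probabilities, and the fixed dictionary bonus are all assumptions made for this example.

```python
from itertools import product

def best_word(position_candidates, dictionary, bonus=10.0):
    """Pick the most plausible word from per-character hypotheses.

    position_candidates: one dict per character position, mapping a
    candidate character to its (hypothetical) recognition probability.
    dictionary: a set of known lowercase words.
    bonus: assumed multiplier rewarding hypotheses found in the dictionary.
    """
    best, best_score = None, -1.0
    # Enumerate every combination of character candidates.
    for combo in product(*(c.items() for c in position_candidates)):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, prob in combo:
            score *= prob
        if word.lower() in dictionary:
            score *= bonus
        if score > best_score:
            best, best_score = word, score
    return best

# Hypothetical recognition output for a five-character word.
candidates = [
    {"c": 0.6, "e": 0.4},
    {"1": 0.55, "l": 0.45},
    {"o": 0.6, "0": 0.4},
    {"c": 0.7, "e": 0.3},
    {"k": 1.0},
]
print(best_word(candidates, {"clock"}))
```

Enumerating every combination is only practical for short words; it is used here because it keeps the sketch easy to follow.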

Essential Elements of OCR

The essential elements are scanning and recognition. Each involves several procedures.

Recognition: images captured with a digital camera, rather than a scanner, can also be recognized by OCR and converted to text. A digital camera needs good lighting for its images to be recognized by OCR. Modern OCR software such as ABBYY FineReader has dependable recognition technology aimed at processing camera images. It is built to compensate for image distortion at the edges, which improves recognition.

Scanning: OCR software can scan to two types of output: scan to PDF and scan to Word. Scanning to PDF preserves layout accuracy; the scanned document retains its original appearance on screen, like a virtual photocopy. One can click on a single word or listen to the whole document. Scanning to Word is done for flexibility, since it lets you edit the text and change its layout.

Converting Scanned Images to Editable File

Converting a scanned image into an editable file with OCR involves several steps.

Step 1:

Detect the orientation of the text. A scanned image is never perfectly aligned, so you need to rotate it slightly until the text lines are exactly horizontal.
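One common way to estimate the rotation is a projection profile: try a range of small angles and keep the one that packs the black pixels into the fewest, densest rows. The sketch below is a minimal pure-Python illustration under several assumptions: the function names, the angle range and step, and the use of a shear to approximate a small rotation are all choices made for the example.

```python
import math

def row_profile_score(black_pixels, angle_deg):
    """Score how sharply text rows separate after correcting by angle_deg.

    black_pixels: list of (x, y) coordinates of black pixels.
    A higher score means the ink is concentrated in fewer rows.
    """
    t = math.tan(math.radians(angle_deg))
    counts = {}
    for x, y in black_pixels:
        # For small angles a rotation is approximated by a vertical shear:
        # each pixel's row is shifted in proportion to its x position.
        row = round(y + x * t)
        counts[row] = counts.get(row, 0) + 1
    return sum(c * c for c in counts.values())

def estimate_skew(black_pixels, max_angle=5.0, step=0.5):
    """Return the correction angle (degrees) with the best row profile."""
    best_angle, best_score = 0.0, -1
    n = int(max_angle / step)
    for i in range(-n, n + 1):
        angle = i * step
        score = row_profile_score(black_pixels, angle)
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

# Synthetic page: three text lines drawn with a slope of 2 degrees.
slope = math.tan(math.radians(2.0))
pixels = [(x, round(y0 + x * slope)) for y0 in (20, 50, 80) for x in range(200)]
correction = estimate_skew(pixels)
```

The returned value is the correction to apply, so under this sign convention it is the negative of the slope that was drawn into the synthetic page.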

Step 2:

Determine whether the text is laid out in a single column or in two columns.
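A simple way to check the column layout is to count the ink in every pixel column and look for a wide, completely white vertical gutter inside the text area. The following is a rough sketch; the name count_columns and the min_gutter threshold are assumptions for illustration, and a real page would need a tolerance for noise.

```python
def count_columns(image, min_gutter=3):
    """Count text columns in a binary image.

    image: list of rows, each a list of 0 (white) or 1 (black) pixels.
    min_gutter: assumed minimum width, in pixels, of a white vertical
    strip that separates two columns.
    """
    width = len(image[0])
    # Amount of ink in each pixel column.
    col_ink = [sum(row[x] for row in image) for x in range(width)]
    inked = [x for x, ink in enumerate(col_ink) if ink > 0]
    if not inked:
        return 0  # blank page
    columns, gap = 1, 0
    # Only look between the first and last inked columns, so that the
    # page margins are not mistaken for gutters.
    for x in range(inked[0], inked[-1] + 1):
        if col_ink[x] == 0:
            gap += 1
        else:
            if gap >= min_gutter:
                columns += 1
            gap = 0
    return columns

single = [[1] * 20 for _ in range(5)]
double = [[1] * 8 + [0] * 4 + [1] * 8 for _ in range(5)]
```

Gaps between words rarely line up across many text lines, which is why only a strip that is white over the full height of the page counts as a gutter here.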

Step 3:

In every column, locate the ‘baseline’ position of each consecutive text line. Two-column text needs to be converted into a single column, and the lines are then joined into one “long ribbon of printed characters”. The format used is black and white.
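The line-finding and ribbon-building idea can be sketched as follows. This is a simplified illustration that assumes an already deskewed, single-column binary image containing at least one text line; it treats the bottom row of each inked band as an approximate baseline, which ignores descenders such as "g" or "y". The function names and the gap parameter are hypothetical.

```python
def find_line_bands(image):
    """Return (top, bottom) row indices for each band of inked rows."""
    bands, start = [], None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            bands.append((start, y - 1))   # the text line ended above
            start = None
    if start is not None:
        bands.append((start, len(image) - 1))
    return bands

def to_ribbon(image, bands, gap=2):
    """Join all text lines side by side into one long ribbon."""
    width = len(image[0])
    height = max(bottom - top + 1 for top, bottom in bands)
    ribbon = [[] for _ in range(height)]
    for top, bottom in bands:
        # Pad shorter lines with white rows on top so bottoms line up.
        pad = height - (bottom - top + 1)
        strip = [[0] * width] * pad + image[top:bottom + 1]
        for ribbon_row, strip_row in zip(ribbon, strip):
            ribbon_row.extend(strip_row)
            ribbon_row.extend([0] * gap)   # white gap between lines
    return ribbon

page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
bands = find_line_bands(page)
ribbon = to_ribbon(page, bands)
```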

Step 4:

Split this ribbon into individual characters by finding vertical stripes of white pixels. Each “token” is a small rectangular image of black and white pixels. When two tokens are separated by more than the average amount of white space, add a “space” token between them.
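A minimal sketch of this splitting step is shown below. It assumes that characters never touch each other, which is often not true in real scans, and the names split_tokens and insert_spaces as well as the "SPACE" marker are made up for the example.

```python
def split_tokens(ribbon):
    """Return (start, end) column ranges of the inked runs in a ribbon."""
    width = len(ribbon[0])
    col_has_ink = [any(row[x] for row in ribbon) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(col_has_ink):
        if ink and start is None:
            start = x                      # a token begins
        elif not ink and start is not None:
            spans.append((start, x - 1))   # a white stripe ends the token
            start = None
    if start is not None:
        spans.append((start, width - 1))
    return spans

def insert_spaces(spans):
    """Add a "SPACE" marker wherever a gap is wider than the average gap."""
    gaps = [nxt[0] - cur[1] - 1 for cur, nxt in zip(spans, spans[1:])]
    if not gaps:
        return list(spans)
    average = sum(gaps) / len(gaps)
    out = [spans[0]]
    for gap, span in zip(gaps, spans[1:]):
        if gap > average:
            out.append("SPACE")
        out.append(span)
    return out

ribbon = [[1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1]]
tokens = insert_spaces(split_tokens(ribbon))
```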

Step 5:

Go through the tokens one by one, comparing each with the pixel templates of known characters (letters, numbers, etc.). Compute the distance between the token and each of the selected templates; the character with the shortest distance is the best match. However, this step does not return a single character for each token, but rather a set of probabilities.
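As an illustration of matching by distance, the sketch below counts the differing pixels between a token and each template (the Hamming distance) and turns the distances into probabilities with a softmax. Both choices are assumptions made for the example, as are the tiny 3x3 templates and the sharpness parameter; the token is also assumed to have been scaled to the template size already.

```python
import math

def hamming(a, b):
    """Count the pixels that differ between two equally sized images."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def match_probabilities(token, templates, sharpness=1.0):
    """Map each template character to a probability for this token.

    Smaller distances yield larger probabilities; the values sum to 1.
    """
    distances = {ch: hamming(token, tpl) for ch, tpl in templates.items()}
    weights = {ch: math.exp(-sharpness * d) for ch, d in distances.items()}
    total = sum(weights.values())
    return {ch: w / total for ch, w in weights.items()}

# Hypothetical 3x3 templates, far smaller than anything used in practice.
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
token = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]  # an "I" with one noisy pixel
probs = match_probabilities(token, templates)
best = max(probs, key=probs.get)
```

Keeping the full set of probabilities, rather than only the best match, is what allows a later step to overrule a close call.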

Step 6:

Combine these probabilities with a language model, which is always specific to a language, e.g. English. For example, suppose the previous letters are “who” and the next token is equally likely to be the letter “I” or the digit “1”. The language model will favor the letter “I” over “1”. OCR without this step typically produces many nonsensical words, whereas OCR with a language model can produce a good transcript even from a blurred image (one captured with an out-of-focus camera, or one printed on both sides of the paper, where text from the back shows through).
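The effect of a language model on a tie between “I” and “1” can be sketched in a few lines of Python. The context table below is a made-up stand-in for a real language model, and its numbers are invented purely for illustration; the function name rescore and the floor value are likewise assumptions.

```python
def rescore(candidates, context_model, context, floor=1e-6):
    """Combine recognition probabilities with language-model priors.

    candidates: character -> probability from template matching.
    context_model: preceding text -> {character: prior probability}.
    floor: assumed small prior for characters the model has not seen.
    """
    prior = context_model.get(context, {})
    scores = {ch: p * prior.get(ch, floor) for ch, p in candidates.items()}
    total = sum(scores.values())
    return {ch: s / total for ch, s in scores.items()}

# Template matching cannot tell the letter from the digit.
candidates = {"I": 0.5, "1": 0.5}

# Invented priors: after "who", a letter is far more likely than a digit.
toy_model = {"who": {"I": 0.02, "1": 0.0005}}

rescored = rescore(candidates, toy_model, "who")
winner = max(rescored, key=rescored.get)
```

When the context is unknown to the model, every candidate receives the same floor value and the recognition probabilities are left unchanged.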

To convert your scanned images into an editable format, you can use our online OCR Conversion tool.

 
