OCR


OCR plugin steps convert images to text using Optical Character Recognition (OCR) tools.


OCR: Google Vision 

Description

The OCR: Google Vision plugin step detects and extracts text from an image and returns the result in JSON format.

Prerequisites:

  1. Create a Google Cloud Vision API key:

https://cloud.google.com/docs/authentication/api-keys?hl=en&visit_id=637051029162974596-3924725435&rd=1#creating_an_api_key

  2. Add restrictions to the API key:

https://cloud.google.com/docs/authentication/api-keys#api_key_restrictions

  3. For the API key to work, fill in the billing details under Billing -> Payment Settings and Billing -> Payment Method.




Configurations 

No.

Field Name

Description

1

Step Name

Name of the step. The name must be unique within a workflow.

2

API Key:


3

Accept Value as variable/static

Leave the checkbox unchecked to take the API Key value from a field in a previous step of the stream, selected from a drop-down list.

Enable the checkbox to enter the API Key in a text box instead.

4

API Key

Specify the API Key used to authenticate with Google Cloud Platform. This field is mandatory. The API Key is encrypted and is not stored in the .psw file.


The API Key is entered through a widget that supports both Text (a static value or environment variable) and Combo (a drop-down list of fields from previous steps). If the checkbox above is enabled, the API Key field appears as a text box; if it is disabled, the field appears as a drop-down list for selecting a field from a previous step.

5

Button: Test Connection

Tests the connection using the API Key provided and verifies whether the connection is available.


Note: If the API Key is supplied from a field in a previous step, the Test Connection button does not work.


Input Tab:

No.

Field Name

Description


Input Fields:


1

Path/URL

Specify the path of the image file to convert to text, or click the Browse button to select the file.

2

Button: Browse

Opens a dialog to browse for the image file to convert to text.

3

Type

Specify the annotation feature to apply to the image. Specify one of the following annotation features:

  1. ‘TEXT_DETECTION’ detects and extracts text from any image, such as a photograph containing a street sign or traffic sign. The JSON includes the entire extracted string, as well as the individual words and their bounding boxes.
  2. ‘DOCUMENT_TEXT_DETECTION’ also extracts text from an image, but the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.
  3. ‘OBJECT_LOCALIZATION’ detects multiple objects in an image and reports what each object is and where it was found in the image.
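For orientation, the Type value corresponds to the `type` of a feature in a Cloud Vision `images:annotate` request. The sketch below only builds the JSON body for one image; it does not describe how the plugin step is implemented internally, and the function name `build_annotate_request` is hypothetical. The API key would be passed separately as the `key` query parameter of the endpoint.

```python
import base64
import json

# Feature types listed for the step's Type field.
SUPPORTED_TYPES = {"TEXT_DETECTION", "DOCUMENT_TEXT_DETECTION", "OBJECT_LOCALIZATION"}

# Public REST endpoint of the Cloud Vision API (append ?key=<API key> when calling it).
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes: bytes, feature_type: str) -> str:
    """Return the JSON body for a single-image annotate request (hypothetical helper)."""
    if feature_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported Type: {feature_type}")
    body = {
        "requests": [
            {
                # Image content is sent as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type}],
            }
        ]
    }
    return json.dumps(body)
```

Building the body separately from sending it keeps the example runnable without network access or a valid key.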


Output Tab:

No.

Field Name

Description


Output Fields:


1

Result

Specify an output field to hold the converted JSON text on successful plugin execution. The default value is OutputText.
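A downstream step usually needs the plain text rather than the whole JSON. The sketch below assumes the Result field holds JSON shaped like a Cloud Vision text detection response (`fullTextAnnotation.text`, or `textAnnotations[0].description` for the entire string); the exact shape the plugin emits is not documented here, so treat that as an assumption. The helper name `extract_text` is hypothetical.

```python
import json


def extract_text(result_json: str) -> str:
    """Return the full extracted string from a Vision-style response, or "" if none."""
    data = json.loads(result_json)
    # The REST API wraps per-image results in a "responses" list; accept either shape.
    responses = data.get("responses")
    response = responses[0] if responses else data
    full = response.get("fullTextAnnotation")
    if full and "text" in full:
        return full["text"]
    annotations = response.get("textAnnotations") or []
    # The first textAnnotations entry carries the entire detected string.
    return annotations[0]["description"] if annotations else ""
```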


Common Buttons:

No.

Field Name

Description


Buttons:


1

OK

Validates the field values. If any required field value is missing, a validation error message is displayed.

If all required field values are provided, the values are saved.

2

Cancel

Closes the window without saving any values.




OCR: Tesseract 

Description

The OCR: Tesseract plugin step detects and extracts text from an image and converts it to readable text. Supported image types: BMP, PNG, JPG, JPEG.

Compatibility: Tesseract version 4.0.0.


Prerequisites:

  1. Download tessdata (tesseract-ocr) version 4.0.0:

https://github.com/tesseract-ocr/tessdata

  2. After the download, extract it and place it in a folder on the processing machine. Specify this folder in the ‘Data Folder Path’ field of the step.
  3. Install Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, and 2019 (32-bit and 64-bit):

https://aka.ms/vs/16/release/vc_redist.x64.exe or

https://aka.ms/vs/16/release/vc_redist.x86.exe
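Before running the step, it can help to confirm that the extracted data folder actually contains the language files the step will ask for. The sketch below assumes language data files are named `<code>.traineddata` (for example `eng.traineddata`), as in the tessdata repository; the helper name `missing_languages` is hypothetical.

```python
from pathlib import Path


def missing_languages(data_folder: str, language_code: str) -> list:
    """Return the language codes that have no <code>.traineddata file in data_folder."""
    folder = Path(data_folder)
    # Multiple languages are joined with '+', e.g. "eng+hin".
    codes = [code for code in language_code.split("+") if code]
    return [code for code in codes if not (folder / f"{code}.traineddata").is_file()]
```

An empty list means every requested language has a data file in the folder.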


Configurations 

No.

Field Name

Description

1

Step Name

Name of the step. The name must be unique within a workflow.


Input Tab:

No.

Field Name

Description


Input Fields:


1

Data Folder Path

Specify the Tesseract data folder path (the folder set up in the prerequisites), or click the Browse button to select the folder.

The data type is String. This field is mandatory.

2

Button: Browse

Opens a dialog to browse for the Tesseract data folder.

3

File Path

Specify the path of the input image file from which to extract readable text. Alternatively, click the Browse button to select the file.

Note: Supported image types are BMP, PNG, JPG, and JPEG.

The data type is String. This field is mandatory.

4

Button: Browse

Opens a dialog to browse for the image file.

5

Language Code

Specify the language code (e.g. eng for English, hin for Hindi, urd for Urdu). To extract text in multiple languages, join the codes with a ‘+’ sign (e.g. eng+hin).


For language codes, refer to:

https://muthu.co/all-tesseract-ocr-options/ 

The default value is eng. The data type is String.

6

Page Segment Mode

Select the Page Segmentation Mode that suits the layout of the input image. Allowed values are 0-13. The data type is String.

Refer to the table below for the list of Page Segmentation Modes and their descriptions.


Sr. No.   Page Segment Mode   Description
1         0                   Orientation and script detection (OSD) only.
2         1                   Automatic page segmentation with OSD.
3         2                   Automatic page segmentation, but no OSD, or OCR.
4         3                   Fully automatic page segmentation, but no OSD. (Default)
5         4                   Assume a single column of text of variable sizes.
6         5                   Assume a single uniform block of vertically aligned text.
7         6                   Assume a single uniform block of text.
8         7                   Treat the image as a single text line.
9         8                   Treat the image as a single word.
10        9                   Treat the image as a single word in a circle.
11        10                  Treat the image as a single character.
12        11                  Sparse text. Find as much text as possible in no particular order.
13        12                  Sparse text with OSD.
14        13                  Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
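The input fields above map closely onto the options of the Tesseract 4 command-line tool (`--tessdata-dir`, `-l`, `--psm`). The sketch below builds an equivalent command line and applies the same constraints (supported image types, modes 0-13). It is an illustration only: the plugin step may well call Tesseract through a library instead, and the helper name `build_tesseract_command` is hypothetical.

```python
from pathlib import Path

# Image types the step documents as supported.
SUPPORTED_EXTENSIONS = {".bmp", ".png", ".jpg", ".jpeg"}

# Page Segment Mode is a String field with allowed values 0-13.
ALLOWED_PSM = {str(n) for n in range(14)}


def build_tesseract_command(data_folder, file_path, language_code="eng", psm="3"):
    """Return the argument list for an equivalent tesseract CLI call (hypothetical helper)."""
    if Path(file_path).suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported image type: {file_path}")
    if psm not in ALLOWED_PSM:
        raise ValueError(f"Page Segment Mode must be 0-13, got {psm!r}")
    return [
        "tesseract", file_path, "stdout",  # "stdout" writes the text to standard output
        "--tessdata-dir", data_folder,     # Data Folder Path
        "-l", language_code,               # Language Code, e.g. "eng" or "eng+hin"
        "--psm", psm,                      # Page Segment Mode
    ]
```

The returned list could be passed to `subprocess.run` on a machine where the `tesseract` executable is installed.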


Output Tab:

No.

Field Name

Description


Output Field:


1

Output Text

Specify an output field to hold converted text on successful plugin execution. The default value is OutputText.


Common Buttons:

No.

Field Name

Description


Buttons:


1

OK

Validates the field values. If any required field value is missing, a validation error message is displayed.

If all required field values are provided, the values are saved.

2

Cancel

Closes the window without saving any values.






