OCR API Reference

The powerful Optical Character Recognition (OCR) APIs let you convert scanned images of pages into recognized text.

Swagger OpenAPI Specification | .NET Framework Client | .NET Core Client | Java Client | Node.js Client | Python Client | Drupal Client

API Endpoint
https://api.cloudmersive.com
Schemes: https
Version: v1

Authentication

Apikey (API Key Authentication)

type: apiKey
name: Apikey
in: header
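
All calls are authenticated by sending your API key in the Apikey request header. Below is a minimal sketch in Python using the requests library; YOUR_API_KEY is a placeholder for your own key.

import requests

# Placeholder: substitute your Cloudmersive API key.
API_KEY = "YOUR_API_KEY"

# Reuse a session so the Apikey header is sent with every request.
session = requests.Session()
session.headers.update({"Apikey": API_KEY})

# session.post(...) can now be used against any endpoint below.

The per-endpoint sketches that follow pass the header explicitly instead of using a session; either approach works.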

ImageOcr

Convert a scanned image into text

POST /ocr/image/toText


Converts an uploaded image in common formats such as JPEG and PNG into text via Optical Character Recognition. This API is intended to be run on scanned documents. If you want to OCR photos (e.g. taken with a smartphone camera), be sure to use the photo/toText API instead, as it is designed to unskew the image first.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

recognitionMode: string
in header

Optional; possible values are 'Basic', which provides basic recognition, is not resilient to page rotation, skew, or low-quality images, and uses 1-2 API calls; 'Normal', which provides highly fault-tolerant OCR recognition and uses 26-30 API calls; and 'Advanced', which provides the highest-quality, most fault-tolerant recognition and uses 28-30 API calls. The default recognition mode is 'Advanced'.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

Code Example:
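A minimal sketch of calling this endpoint with Python's requests library; the API key and file path are placeholders, and the optional headers mirror the parameters documented above.

import requests

headers = {
    "Apikey": "YOUR_API_KEY",        # placeholder API key
    "recognitionMode": "Advanced",   # optional: 'Basic', 'Normal', or 'Advanced'
    "language": "ENG",               # optional: defaults to English
    "preprocessing": "Auto",         # optional: 'None' or 'Auto'
}

with open("scanned-page.png", "rb") as f:   # placeholder input file
    r = requests.post(
        "https://api.cloudmersive.com/ocr/image/toText",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

result = r.json()
print(result["MeanConfidenceLevel"])   # confidence rating; ratings above 80% are strong
print(result["TextResult"])            # the recognized text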
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "MeanConfidenceLevel": "number (float)",
  "TextResult": "string"
}

Convert a scanned image into words with location

POST /ocr/image/to/words-with-location


Converts an uploaded image in common formats such as JPEG and PNG into words/text with location information and other metadata via Optical Character Recognition. This API is intended to be run on scanned documents. If you want to OCR photos (e.g. taken with a smartphone camera), be sure to use the photo/toText API instead, as it is designed to unskew the image first.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

Code Example:
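A minimal sketch using Python's requests library that uploads an image and iterates over the returned word elements; the API key and file path are placeholders.

import requests

headers = {"Apikey": "YOUR_API_KEY"}   # placeholder API key

with open("scanned-page.png", "rb") as f:   # placeholder input file
    r = requests.post(
        "https://api.cloudmersive.com/ocr/image/to/words-with-location",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

result = r.json()
if result["Successful"]:
    for word in result["Words"]:
        # Each word carries its text, pixel bounding box, and confidence level.
        print(word["WordText"], word["XLeft"], word["YTop"],
              word["Width"], word["Height"], word["ConfidenceLevel"])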
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "Words": [
    {
      "WordText": "string",
      "LineNumber": "integer (int32)",
      "WordNumber": "integer (int32)",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "ConfidenceLevel": "number (double)",
      "BlockNumber": "integer (int32)",
      "ParagraphNumber": "integer (int32)",
      "PageNumber": "integer (int32)"
    }
  ]
}

Convert a scanned image into lines with location

POST /ocr/image/to/lines-with-location


Converts an uploaded image in common formats such as JPEG and PNG into lines/text with location information and other metadata via Optical Character Recognition. This API is intended to be run on scanned documents. If you want to OCR photos (e.g. taken with a smartphone camera), be sure to use the photo/toText API instead, as it is designed to unskew the image first.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

Code Example:
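A minimal sketch using Python's requests library that groups the result by line; the API key and file path are placeholders.

import requests

headers = {"Apikey": "YOUR_API_KEY"}   # placeholder API key

with open("scanned-page.png", "rb") as f:   # placeholder input file
    r = requests.post(
        "https://api.cloudmersive.com/ocr/image/to/lines-with-location",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

for line in r.json().get("Lines", []):
    # LineText is the full text of the line; Words holds each word with its bounding box.
    print(line["LineText"], "->", len(line["Words"]), "words")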
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "Lines": [
    {
      "LineText": "string",
      "Words": [
        {
          "WordText": "string",
          "LineNumber": "integer (int32)",
          "WordNumber": "integer (int32)",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "ConfidenceLevel": "number (double)",
          "BlockNumber": "integer (int32)",
          "ParagraphNumber": "integer (int32)",
          "PageNumber": "integer (int32)"
        }
      ]
    }
  ]
}

Convert a photo of a document into text

POST /ocr/photo/toText


Converts an uploaded photo of a document in common formats such as JPEG and PNG into text via Optical Character Recognition. This API is intended to be run on photos of documents (e.g. taken with a smartphone) and supports cases where other content, such as a desk, is in the frame and the camera is crooked. If you want to OCR a scanned image, use the image/toText API call instead, as it is designed for scanned images.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

recognitionMode: string
in header

Optional; possible values are 'Basic', which provides basic recognition, is not resilient to page rotation, skew, or low-quality images, and uses 1-2 API calls; 'Normal', which provides highly fault-tolerant OCR recognition and uses 26-30 API calls; and 'Advanced', which provides the highest-quality, most fault-tolerant recognition and uses 28-30 API calls. The default recognition mode is 'Advanced'.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

Code Example:
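A minimal sketch using Python's requests library; the API key and photo path are placeholders, and recognitionMode is shown as an optional header.

import requests

headers = {
    "Apikey": "YOUR_API_KEY",        # placeholder API key
    "recognitionMode": "Advanced",   # optional: 'Basic', 'Normal', or 'Advanced'
}

with open("phone-photo.jpg", "rb") as f:   # placeholder photo of a document
    r = requests.post(
        "https://api.cloudmersive.com/ocr/photo/toText",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()
print(r.json()["TextResult"])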
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "MeanConfidenceLevel": "number (float)",
  "TextResult": "string"
}

Convert a photo of a document or receipt into words with location

POST /ocr/photo/to/words-with-location


Converts a photo of a document or receipt in common formats such as JPEG and PNG into words/text with location information and other metadata via Optical Character Recognition. This API is intended to be run on photographs of documents. If you want to OCR scanned documents (e.g. taken with a scanner), be sure to use the image/toText API instead, as it is designed for that use case.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

recognitionMode: string
in header

Optional; possible values are 'Normal', which provides highly fault-tolerant OCR recognition and uses 26-30 API calls, and 'Advanced', which provides the highest-quality, most fault-tolerant recognition and uses 28-30 API calls. The default recognition mode is 'Advanced'.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

diagnostics: string
in header

Optional, diagnostics mode, default is 'false'. Possible values are 'true' (will set DiagnosticImage to a diagnostic PNG image in the result), and 'false' (no diagnostics are enabled; this is recommended for best performance).

Code Example:
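A minimal sketch using Python's requests library with diagnostics enabled; the API key and photo path are placeholders, and the DiagnosticImage field is assumed to be base64-encoded PNG data, as the string (byte) type in the schema below suggests.

import base64
import requests

headers = {
    "Apikey": "YOUR_API_KEY",   # placeholder API key
    "diagnostics": "true",      # optional: populates DiagnosticImage in the result
}

with open("phone-photo.jpg", "rb") as f:   # placeholder photo of a document
    r = requests.post(
        "https://api.cloudmersive.com/ocr/photo/to/words-with-location",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

result = r.json()
for element in result["TextElements"]:
    print(element["Text"], element["XLeft"], element["YTop"], element["ConfidenceLevel"])

# DiagnosticImage holds a diagnostic PNG (base64) when diagnostics are enabled.
if result.get("DiagnosticImage"):
    with open("diagnostic.png", "wb") as out:
        out.write(base64.b64decode(result["DiagnosticImage"]))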
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "TextElements": [
    {
      "Text": "string",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "BoundingPoints": [
        {
          "X": "integer (int32)",
          "Y": "integer (int32)"
        }
      ],
      "ConfidenceLevel": "number (double)"
    }
  ],
  "DiagnosticImage": "string (byte)"
}

Recognize a photo of a receipt, extract key business information

POST /ocr/photo/recognize/receipt


Analyzes a photograph of a receipt as input, and outputs key business information such as the name of the business, the address of the business, the phone number of the business, the total of the receipt, the date of the receipt, and more.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

recognitionMode: string
in header

Optional; enable advanced recognition mode by specifying 'Advanced', or enable handwriting recognition by specifying 'EnableHandwriting'. Default is disabled.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'None'. Possible values are None (no preprocessing of the image), and 'Advanced' (automatic image enhancement of the image before OCR is applied; this is recommended and needed to handle rotated receipts).

Code Example:
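A minimal sketch using Python's requests library; the API key and photo path are placeholders, and the 'Advanced' preprocessing header follows the recommendation above for rotated receipts.

import requests

headers = {
    "Apikey": "YOUR_API_KEY",     # placeholder API key
    "preprocessing": "Advanced",  # recommended above for handling rotated receipts
}

with open("receipt.jpg", "rb") as f:   # placeholder receipt photo
    r = requests.post(
        "https://api.cloudmersive.com/ocr/photo/recognize/receipt",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

receipt = r.json()
print(receipt["BusinessName"], receipt["ReceiptTotal"])
for item in receipt["ReceiptItems"]:
    print(item["ItemDescription"], item["ItemPrice"])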
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "Timestamp": "string (date-time)",
  "BusinessName": "string",
  "BusinessWebsite": "string",
  "AddressString": "string",
  "PhoneNumber": "string",
  "ReceiptItems": [
    {
      "ItemDescription": "string",
      "ItemPrice": "number (double)"
    }
  ],
  "ReceiptSubTotal": "number (double)",
  "ReceiptTotal": "number (double)"
}

Recognize a photo of a business card, extract key business information

POST /ocr/photo/recognize/business-card


Analyzes a photograph of a business card as input, and outputs key business information such as the name of the person, name of the business, the address of the business, the phone number, the email address and more.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
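A minimal sketch using Python's requests library; the API key and photo path are placeholders.

import requests

headers = {"Apikey": "YOUR_API_KEY"}   # placeholder API key

with open("business-card.jpg", "rb") as f:   # placeholder business card photo
    r = requests.post(
        "https://api.cloudmersive.com/ocr/photo/recognize/business-card",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

card = r.json()
print(card["PersonName"], card["PersonTitle"])
print(card["BusinessName"], card["PhoneNumber"], card["EmailAddress"])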
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "PersonName": "string",
  "PersonTitle": "string",
  "BusinessName": "string",
  "AddressString": "string",
  "PhoneNumber": "string",
  "EmailAddress": "string",
  "Timestamp": "string (date-time)"
}

Recognize a photo of a form, extract key fields and business information

POST /ocr/photo/recognize/form


Analyzes a photograph of a form as input, and outputs key business fields and information. Customize the data to be extracted by defining fields for the form.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

formTemplateDefinition: application/json
in header

Form field definitions

recognitionMode: string
in header

Optional; enable advanced recognition mode by specifying 'Advanced', or enable handwriting recognition by specifying 'EnableHandwriting'. Default is disabled.

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image - including automatic unrotation of the image - before OCR is applied; this is recommended). Set this to 'None' if you do not want to use automatic image unrotation and enhancement.

diagnostics: string
in header

Optional, diagnostics mode, default is 'false'. Possible values are 'true' (will set DiagnosticImage to a diagnostic PNG image in the result), and 'false' (no diagnostics are enabled; this is recommended for best performance).

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

Code Example:
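A minimal sketch using Python's requests library. The API key and file path are placeholders, and the field definition is illustrative only: the per-field properties (FieldID, LeftAnchor, AnchorMode, DataType) come from the FormFieldDefinition schema below, but the exact wrapper object expected in the formTemplateDefinition header is an assumption here, so check a client library for the precise template shape.

import json
import requests

# Illustrative template with a single field anchored to the label "Invoice No".
# The "FieldDefinitions" wrapper name is an assumption; the per-field properties
# come from the FormFieldDefinition schema documented below.
form_template = {
    "FieldDefinitions": [
        {
            "FieldID": "invoice-number",
            "LeftAnchor": "Invoice No",
            "AnchorMode": "Partial",
            "DataType": "ALPHANUMERIC",
        }
    ]
}

headers = {
    "Apikey": "YOUR_API_KEY",   # placeholder API key
    "formTemplateDefinition": json.dumps(form_template),
}

with open("form-photo.jpg", "rb") as f:   # placeholder photo of the filled-in form
    r = requests.post(
        "https://api.cloudmersive.com/ocr/photo/recognize/form",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

for field in r.json()["FieldValueExtractionResult"]:
    values = [v["Text"] for v in field["FieldValues"]]
    print(field["TargetField"]["FieldID"], values)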
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "FieldValueExtractionResult": [
    {
      "TargetField": {
        "FieldID": "string",
        "LeftAnchor": "string",
        "TopAnchor": "string",
        "BottomAnchor": "string",
        "AlternateAnchor": "string",
        "AnchorMode": "string",
        "DataType": "string",
        "TargetDigitCount": "integer (int32)",
        "MinimumCharacterCount": "integer (int32)",
        "AllowNumericDigits": "boolean",
        "VerticalAlignmentType": "string",
        "HorizontalAlignmentType": "string",
        "TargetFieldWidth_Relative": "number (double)",
        "TargetFieldHeight_Relative": "number (double)",
        "TargetFieldHorizontalAdjustment": "number (double)",
        "TargetFieldVerticalAdjustment": "number (double)",
        "Ignore": [
          "string"
        ],
        "Options": "string"
      },
      "FieldValues": [
        {
          "Text": "string",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "BoundingPoints": [
            {
              "X": "integer (int32)",
              "Y": "integer (int32)"
            }
          ],
          "ConfidenceLevel": "number (double)"
        }
      ]
    }
  ],
  "TableValueExtractionResults": [
    {
      "TableDefinition": {
        "TableID": "string",
        "ColumnDefinitions": [
          {
            "ColumnID": "string",
            "TopAnchor": "string",
            "AnchorMode": "string",
            "DataType": "string",
            "MinimumCharacterCount": "integer (int32)",
            "AllowNumericDigits": "boolean"
          }
        ],
        "TargetTableHeight_Relative": "number (double)",
        "TargetRowHeight_Relative": "number (double)"
      },
      "TableRowsResult": [
        {
          "TableRowCellsResult": [
            {
              "ColumnID": "string",
              "CellValues": [
                {
                  "Text": "string",
                  "XLeft": "integer (int32)",
                  "YTop": "integer (int32)",
                  "Width": "integer (int32)",
                  "Height": "integer (int32)",
                  "BoundingPoints": [
                    {
                      "X": "integer (int32)",
                      "Y": "integer (int32)"
                    }
                  ],
                  "ConfidenceLevel": "number (double)"
                }
              ]
            }
          ]
        }
      ]
    }
  ],
  "Diagnostics": [
    "string"
  ],
  "BestMatchFormSettingName": "string"
}

Recognize a photo of a form, extract key fields using stored templates

POST /ocr/photo/recognize/form/advanced


Analyzes a photograph of a form as input, and outputs key business fields and information. Customize the data to be extracted by defining fields for the form. Uses template definitions stored in Cloudmersive Configuration; to configure stored templates in a configuration bucket, log into the Cloudmersive Management Portal and navigate to Settings > API Configuration > Create Bucket.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

bucketID: string
in header

Bucket ID of the Configuration Bucket storing the form templates

bucketSecretKey: string
in header

Bucket Secret Key of the Configuration Bucket storing the form templates

recognitionMode: string
in header

Optional; enable advanced recognition mode by specifying 'Advanced', or enable handwriting recognition by specifying 'EnableHandwriting'. Default is disabled.

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image - including automatic unrotation of the image - before OCR is applied; this is recommended). Set this to 'None' if you do not want to use automatic image unrotation and enhancement.

diagnostics: string
in header

Optional, diagnostics mode, default is 'false'. Possible values are 'true' (will set DiagnosticImage to a diagnostic PNG image in the result), and 'false' (no diagnostics are enabled; this is recommended for best performance).

Code Example:
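A minimal sketch using Python's requests library; the API key, bucket credentials, and file path are placeholders. The stored templates referenced by the bucket are assumed to have been created in the Management Portal as described above.

import requests

headers = {
    "Apikey": "YOUR_API_KEY",              # placeholder API key
    "bucketID": "YOUR_BUCKET_ID",          # placeholder Configuration Bucket ID
    "bucketSecretKey": "YOUR_BUCKET_KEY",  # placeholder Bucket Secret Key
}

with open("form-photo.jpg", "rb") as f:   # placeholder photo of the filled-in form
    r = requests.post(
        "https://api.cloudmersive.com/ocr/photo/recognize/form/advanced",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

result = r.json()
# BestMatchFormSettingName reports which stored template matched best.
print(result["BestMatchFormSettingName"])
for field in result["FieldValueExtractionResult"]:
    print(field["TargetField"]["FieldID"], [v["Text"] for v in field["FieldValues"]])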
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "FieldValueExtractionResult": [
    {
      "TargetField": {
        "FieldID": "string",
        "LeftAnchor": "string",
        "TopAnchor": "string",
        "BottomAnchor": "string",
        "AlternateAnchor": "string",
        "AnchorMode": "string",
        "DataType": "string",
        "TargetDigitCount": "integer (int32)",
        "MinimumCharacterCount": "integer (int32)",
        "AllowNumericDigits": "boolean",
        "VerticalAlignmentType": "string",
        "HorizontalAlignmentType": "string",
        "TargetFieldWidth_Relative": "number (double)",
        "TargetFieldHeight_Relative": "number (double)",
        "TargetFieldHorizontalAdjustment": "number (double)",
        "TargetFieldVerticalAdjustment": "number (double)",
        "Ignore": [
          "string"
        ],
        "Options": "string"
      },
      "FieldValues": [
        {
          "Text": "string",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "BoundingPoints": [
            {
              "X": "integer (int32)",
              "Y": "integer (int32)"
            }
          ],
          "ConfidenceLevel": "number (double)"
        }
      ]
    }
  ],
  "TableValueExtractionResults": [
    {
      "TableDefinition": {
        "TableID": "string",
        "ColumnDefinitions": [
          {
            "ColumnID": "string",
            "TopAnchor": "string",
            "AnchorMode": "string",
            "DataType": "string",
            "MinimumCharacterCount": "integer (int32)",
            "AllowNumericDigits": "boolean"
          }
        ],
        "TargetTableHeight_Relative": "number (double)",
        "TargetRowHeight_Relative": "number (double)"
      },
      "TableRowsResult": [
        {
          "TableRowCellsResult": [
            {
              "ColumnID": "string",
              "CellValues": [
                {
                  "Text": "string",
                  "XLeft": "integer (int32)",
                  "YTop": "integer (int32)",
                  "Width": "integer (int32)",
                  "Height": "integer (int32)",
                  "BoundingPoints": [
                    {
                      "X": "integer (int32)",
                      "Y": "integer (int32)"
                    }
                  ],
                  "ConfidenceLevel": "number (double)"
                }
              ]
            }
          ]
        }
      ]
    }
  ],
  "Diagnostics": [
    "string"
  ],
  "BestMatchFormSettingName": "string"
}

PdfOcr

Convert an uploaded PDF file into text

POST /ocr/pdf/toText


Converts an uploaded PDF file into text via Optical Character Recognition.

imageFile: file
in formData

PDF file to perform OCR on.

recognitionMode: string
in header

Optional; possible values are 'Basic', which provides basic recognition, is not resilient to page rotation, skew, or low-quality images, and uses 1-2 API calls per page; 'Normal', which provides highly fault-tolerant OCR recognition and uses 26-30 API calls per page; and 'Advanced', which provides the highest-quality, most fault-tolerant recognition and uses 28-30 API calls per page. The default recognition mode is 'Basic'.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

Code Example:
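A minimal sketch using Python's requests library; the API key and PDF path are placeholders. Note that the PDF is still uploaded under the imageFile form field, as documented above.

import requests

headers = {
    "Apikey": "YOUR_API_KEY",        # placeholder API key
    "recognitionMode": "Advanced",   # optional: the default for this endpoint is 'Basic'
}

with open("scanned.pdf", "rb") as f:   # placeholder scanned PDF
    r = requests.post(
        "https://api.cloudmersive.com/ocr/pdf/toText",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

for page in r.json()["OcrPages"]:
    print("Page", page["PageNumber"], "confidence", page["MeanConfidenceLevel"])
    print(page["TextResult"])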


Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "OcrPages": [
    {
      "PageNumber": "integer (int32)",
      "MeanConfidenceLevel": "number (float)",
      "TextResult": "string"
    }
  ]
}

Convert a PDF into words with location

POST /ocr/pdf/to/words-with-location


Converts a PDF into words/text with location information and other metadata via Optical Character Recognition. This API is intended to be run on scanned documents. If you want to OCR photos (e.g. taken with a smartphone camera), be sure to use the photo/toText API instead, as it is designed to unskew the image first.



imageFile: file
in formData

PDF file to perform OCR on.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

Code Example:
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "OcrPages": [
    {
      "PageNumber": "integer (int32)",
      "Successful": "boolean",
      "Words": [
        {
          "WordText": "string",
          "LineNumber": "integer (int32)",
          "WordNumber": "integer (int32)",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "ConfidenceLevel": "number (double)",
          "BlockNumber": "integer (int32)",
          "ParagraphNumber": "integer (int32)",
          "PageNumber": "integer (int32)"
        }
      ]
    }
  ]
}

Convert a PDF into text lines with location

POST /ocr/pdf/to/lines-with-location


Converts a PDF into lines/text with location information and other metadata via Optical Character Recognition. This API is intended to be run on scanned documents. If you want to OCR photos (e.g. taken with a smartphone camera), be sure to use the photo/toText API instead, as it is designed to unskew the image first.



imageFile: file
in formData

PDF file to perform OCR on.

language: string
in header

Optional, language of the input document, default is English (ENG). Possible values are ENG (English), ARA (Arabic), ZHO (Chinese - Simplified), ZHO-HANT (Chinese - Traditional), ASM (Assamese), AFR (Afrikaans), AMH (Amharic), AZE (Azerbaijani), AZE-CYRL (Azerbaijani - Cyrillic), BEL (Belarusian), BEN (Bengali), BOD (Tibetan), BOS (Bosnian), BUL (Bulgarian), CAT (Catalan; Valencian), CEB (Cebuano), CES (Czech), CHR (Cherokee), CYM (Welsh), DAN (Danish), DEU (German), DZO (Dzongkha), ELL (Greek), ENM (Archaic/Middle English), EPO (Esperanto), EST (Estonian), EUS (Basque), FAS (Persian), FIN (Finnish), FRA (French), FRK (Frankish), FRM (Middle-French), GLE (Irish), GLG (Galician), GRC (Ancient Greek), HAT (Hatian), HEB (Hebrew), HIN (Hindi), HRV (Croatian), HUN (Hungarian), IKU (Inuktitut), IND (Indonesian), ISL (Icelandic), ITA (Italian), ITA-OLD (Old - Italian), JAV (Javanese), JPN (Japanese), KAN (Kannada), KAT (Georgian), KAT-OLD (Old-Georgian), KAZ (Kazakh), KHM (Central Khmer), KIR (Kirghiz), KOR (Korean), KUR (Kurdish), LAO (Lao), LAT (Latin), LAV (Latvian), LIT (Lithuanian), MAL (Malayalam), MAR (Marathi), MKD (Macedonian), MLT (Maltese), MSA (Malay), MYA (Burmese), NEP (Nepali), NLD (Dutch), NOR (Norwegian), ORI (Oriya), PAN (Panjabi), POL (Polish), POR (Portuguese), PUS (Pushto), RON (Romanian), RUS (Russian), SAN (Sanskrit), SIN (Sinhala), SLK (Slovak), SLV (Slovenian), SPA (Spanish), SPA-OLD (Old Spanish), SQI (Albanian), SRP (Serbian), SRP-LAT (Latin Serbian), SWA (Swahili), SWE (Swedish), SYR (Syriac), TAM (Tamil), TEL (Telugu), TGK (Tajik), TGL (Tagalog), THA (Thai), TIR (Tigrinya), TUR (Turkish), UIG (Uighur), UKR (Ukrainian), URD (Urdu), UZB (Uzbek), UZB-CYR (Cyrillic Uzbek), VIE (Vietnamese), YID (Yiddish)

preprocessing: string
in header

Optional, preprocessing mode, default is 'Auto'. Possible values are None (no preprocessing of the image), and Auto (automatic image enhancement of the image before OCR is applied; this is recommended).

Code Example:
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "OcrPages": [
    {
      "PageNumber": "integer (int32)",
      "Successful": "boolean",
      "Lines": [
        {
          "LineText": "string",
          "Words": [
            {
              "WordText": "string",
              "LineNumber": "integer (int32)",
              "WordNumber": "integer (int32)",
              "XLeft": "integer (int32)",
              "YTop": "integer (int32)",
              "Width": "integer (int32)",
              "Height": "integer (int32)",
              "ConfidenceLevel": "number (double)",
              "BlockNumber": "integer (int32)",
              "ParagraphNumber": "integer (int32)",
              "PageNumber": "integer (int32)"
            }
          ]
        }
      ]
    }
  ]
}

Preprocessing

Convert an image of text into a binarized (light and dark) view

POST /ocr/preprocessing/image/binarize


Perform an adaptive binarization algorithm on the input image to prepare it for further OCR operations.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
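A minimal sketch using Python's requests library; the API key and file paths are placeholders, and the response body is assumed to carry the binarized image bytes (the spec lists the payload as string (byte)).

import requests

headers = {"Apikey": "YOUR_API_KEY"}   # placeholder API key

with open("scanned-page.png", "rb") as f:   # placeholder input image
    r = requests.post(
        "https://api.cloudmersive.com/ocr/preprocessing/image/binarize",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

# Assumes the binarized image is returned as the raw response body; save it to disk.
with open("binarized.png", "wb") as out:
    out.write(r.content)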
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
"string (byte)"

Convert an image of text into a binary (light and dark) view with ML

POST /ocr/preprocessing/image/binarize/advanced


Perform an advanced adaptive, Deep Learning-based binarization algorithm on the input image to prepare it for further OCR operations. Provides enhanced accuracy compared to adaptive binarization. The image will be upsampled to 300 DPI if it has a DPI below 300.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
"string (byte)"

Get the angle of the page / document / receipt

POST /ocr/preprocessing/image/get-page-angle


Analyzes a photo or image of a document and identifies the rotation angle of the page.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
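A minimal sketch using Python's requests library; the API key and file path are placeholders.

import requests

headers = {"Apikey": "YOUR_API_KEY"}   # placeholder API key

with open("scanned-page.png", "rb") as f:   # placeholder input image
    r = requests.post(
        "https://api.cloudmersive.com/ocr/preprocessing/image/get-page-angle",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

result = r.json()
if result["Successful"]:
    # Angle is the detected rotation of the page (the spec does not state the unit).
    print("Detected page angle:", result["Angle"])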


Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
  "Successful": "boolean",
  "Angle": "number (double)"
}

Detect and unrotate a document image

POST /ocr/preprocessing/image/unrotate


Detect and unrotate an image of a document (e.g. that was scanned at an angle). Great for document scanning applications; once unskewed, this image is perfect for converting to PDF using the Convert API or optical character recognition using the OCR API.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
"string (byte)"

Detect and unrotate a document image (advanced)

POST /ocr/preprocessing/image/unrotate/advanced


Detect and unrotate an image of a document (e.g. that was scanned at an angle) using deep learning. Great for document scanning applications; once unskewed, this image is perfect for converting to PDF using the Convert API or optical character recognition using the OCR API.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
"string (byte)"

Detect and unskew a photo of a document

POST /ocr/preprocessing/image/unskew


Detect and unskew a photo of a document (e.g. taken on a cell phone) into a perfectly square image. Great for document scanning applications; once unskewed, this image is perfect for converting to PDF using the Convert API or optical character recognition using the OCR API.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
"string (byte)"

Receipts

Convert a photo of a receipt into a CSV file containing structured information from the receipt

POST /ocr/receipts/photo/to/csv


Leverage Deep Learning to automatically turn a photo of a receipt into a CSV file containing the structured information from the receipt.



imageFile: file
in formData

Image file to perform OCR on. Common file formats such as PNG, JPEG are supported.

Code Example:
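A minimal sketch using Python's requests library; the API key and file paths are placeholders. The spec lists the response type simply as object, so the sketch assumes the CSV payload arrives in the response body; check a client library if your response is wrapped differently.

import requests

headers = {"Apikey": "YOUR_API_KEY"}   # placeholder API key

with open("receipt.jpg", "rb") as f:   # placeholder receipt photo
    r = requests.post(
        "https://api.cloudmersive.com/ocr/receipts/photo/to/csv",
        headers=headers,
        files={"imageFile": f},
    )
r.raise_for_status()

# Save the structured receipt data for further processing.
with open("receipt.csv", "wb") as out:
    out.write(r.content)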
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
"object"

Schema Definitions

ImageToTextResponse: object

Response from an OCR to text operation. Includes the confidence rating and converted text result.

MeanConfidenceLevel: number (float)

Confidence level rating of the OCR operation; ratings above 80% are strong.

TextResult: string

Converted text string from the image input.

Example
{
  "MeanConfidenceLevel": "number (float)",
  "TextResult": "string"
}

ImageToWordsWithLocationResult: object

Result of an image to words-with-location OCR operation

Successful: boolean

True if successful, false otherwise

Words: OcrWordElement

Word elements in the image

Example
{
  "Successful": "boolean",
  "Words": [
    {
      "WordText": "string",
      "LineNumber": "integer (int32)",
      "WordNumber": "integer (int32)",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "ConfidenceLevel": "number (double)",
      "BlockNumber": "integer (int32)",
      "ParagraphNumber": "integer (int32)",
      "PageNumber": "integer (int32)"
    }
  ]
}

OcrWordElement: object

A single word in an OCR document

WordText: string

Text of the word

LineNumber: integer (int32)

Line number of the word

WordNumber: integer (int32)

Index of the word in the line

XLeft: integer (int32)

X location of the left edge of the word in pixels

YTop: integer (int32)

Y location of the top edge of the word in pixels

Width: integer (int32)

Width of the word in pixels

Height: integer (int32)

Height of the word in pixels

ConfidenceLevel: number (double)

Confidence level of the machine learning result; possible values are 0.0 (lowest accuracy) - 1.0 (highest accuracy)

BlockNumber: integer (int32)

Index of the containing block

ParagraphNumber: integer (int32)

Index of the containing paragraph

PageNumber: integer (int32)

Index of the containing page

Example
{
  "WordText": "string",
  "LineNumber": "integer (int32)",
  "WordNumber": "integer (int32)",
  "XLeft": "integer (int32)",
  "YTop": "integer (int32)",
  "Width": "integer (int32)",
  "Height": "integer (int32)",
  "ConfidenceLevel": "number (double)",
  "BlockNumber": "integer (int32)",
  "ParagraphNumber": "integer (int32)",
  "PageNumber": "integer (int32)"
}

ImageToLinesWithLocationResult: object

Result of an image to lines-with-location OCR operation

Successful: boolean

True if successful, false otherwise

Lines: OcrLineElement

Line elements in the image

Example
{
  "Successful": "boolean",
  "Lines": [
    {
      "LineText": "string",
      "Words": [
        {
          "WordText": "string",
          "LineNumber": "integer (int32)",
          "WordNumber": "integer (int32)",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "ConfidenceLevel": "number (double)",
          "BlockNumber": "integer (int32)",
          "ParagraphNumber": "integer (int32)",
          "PageNumber": "integer (int32)"
        }
      ]
    }
  ]
}

OcrLineElement: object

A contiguous line of text in an OCR document

LineText: string

Text of the line

Words: OcrWordElement

Word objects in the line

Example
{
  "LineText": "string",
  "Words": [
    {
      "WordText": "string",
      "LineNumber": "integer (int32)",
      "WordNumber": "integer (int32)",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "ConfidenceLevel": "number (double)",
      "BlockNumber": "integer (int32)",
      "ParagraphNumber": "integer (int32)",
      "PageNumber": "integer (int32)"
    }
  ]
}

PhotoToWordsWithLocationResult: object

Result of a photo to words-with-location OCR operation

Successful: boolean

True if successful, false otherwise

TextElements: OcrPhotoTextElement

Word elements in the image

DiagnosticImage: string (byte)

Typically null. To analyze OCR performance, enable diagnostic mode by adding the HTTP header "DiagnosticMode" with the value "true". When this is true, a diagnostic image showing the details of the OCR result will be set in PNG format into DiagnosticImage.

Example
{
  "Successful": "boolean",
  "TextElements": [
    {
      "Text": "string",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "BoundingPoints": [
        {
          "X": "integer (int32)",
          "Y": "integer (int32)"
        }
      ],
      "ConfidenceLevel": "number (double)"
    }
  ],
  "DiagnosticImage": "string (byte)"
}

OcrPhotoTextElement: object

A single text element in an OCR document

Text: string

Text of the word

XLeft: integer (int32)

X location of the left edge of the word in pixels

YTop: integer (int32)

Y location of the top edge of the word in pixels

Width: integer (int32)

Width of the word in pixels

Height: integer (int32)

Height of the word in pixels

BoundingPoints: Point

Points that form the bounding polygon around the text

ConfidenceLevel: number (double)

Confidence level of the machine learning result; possible values are 0.0 (lowest accuracy) - 1.0 (highest accuracy)

Example
{
  "Text": "string",
  "XLeft": "integer (int32)",
  "YTop": "integer (int32)",
  "Width": "integer (int32)",
  "Height": "integer (int32)",
  "BoundingPoints": [
    {
      "X": "integer (int32)",
      "Y": "integer (int32)"
    }
  ],
  "ConfidenceLevel": "number (double)"
}

Point: object

Point location in 2D in an image, where 0, 0 represents the top/left corner of the image

X: integer (int32)

X location in 2D in the image, where 0 represents the left edge of the image

Y: integer (int32)

Y location in 2D in the image, where 0 represents the top edge of the image

Example
{
  "X": "integer (int32)",
  "Y": "integer (int32)"
}

ReceiptRecognitionResult: object

Result of recognizing a receipt, to extract the key information from the receipt

Successful: boolean

True if the operation was successful, false otherwise

Timestamp: string (date-time)

The date and time printed on the receipt (if included on the receipt)

BusinessName: string

The name of the business printed on the receipt (if included on the receipt)

BusinessWebsite: string

The website URL of the business printed on the receipt (if included on the receipt)

AddressString: string

The address of the business printed on the receipt (if included on the receipt)

PhoneNumber: string

The phone number printed on the receipt (if included on the receipt)

ReceiptItems: ReceiptLineItem

The individual line items comprising the order; does not include total (see ReceiptTotal)

ReceiptSubTotal: number (double)

Optional; if available, the monetary value of the receipt subtotal - typically not including specialized line items such as Tax. If this value is not available, it will be 0.

ReceiptTotal: number (double)

The total monetary value of the receipt (if included on the receipt)

Example
{
  "Successful": "boolean",
  "Timestamp": "string (date-time)",
  "BusinessName": "string",
  "BusinessWebsite": "string",
  "AddressString": "string",
  "PhoneNumber": "string",
  "ReceiptItems": [
    {
      "ItemDescription": "string",
      "ItemPrice": "number (double)"
    }
  ],
  "ReceiptSubTotal": "number (double)",
  "ReceiptTotal": "number (double)"
}

ReceiptLineItem: object

Receipt line item, consisting of a product or item and a price (if available)

ItemDescription: string

Description of the item

ItemPrice: number (double)

Price of the item if available

Example
{
  "ItemDescription": "string",
  "ItemPrice": "number (double)"
}

BusinessCardRecognitionResult: object

Result of recognizing a business card, to extract the key information from the business card

Successful: boolean

True if the operation was successful, false otherwise

PersonName: string

The name of the person printed on the business card (if included on the business card)

PersonTitle: string

The title of the person printed on the business card (if included on the business card)

BusinessName: string

The name of the business printed on the business card (if included on the business card)

AddressString: string

The address printed on the business card (if included on the business card)

PhoneNumber: string

The phone number printed on the business card (if included on the business card)

EmailAddress: string

The email address printed on the business card (if included on the business card)

Timestamp: string (date-time)

The date and time printed on the business card (if included on the business card)

Example
{
  "Successful": "boolean",
  "PersonName": "string",
  "PersonTitle": "string",
  "BusinessName": "string",
  "AddressString": "string",
  "PhoneNumber": "string",
  "EmailAddress": "string",
  "Timestamp": "string (date-time)"
}

FormRecognitionResult: object

The result of extracting form field values

Successful: boolean

True if the operation was successful, false otherwise

FieldValueExtractionResult: FieldResult

Result of form field OCR data extraction

TableValueExtractionResults: TableResult

Result of form table OCR data extraction

Diagnostics: string[]

Diagnostic images; default is null. Enable diagnostics=true to populate this field with one image per field.

BestMatchFormSettingName: string

Optional; when using photo/recognize/form/advanced, populated with the Setting Name of the best-matching, highest-relevance form

Example
{
  "Successful": "boolean",
  "FieldValueExtractionResult": [
    {
      "TargetField": {
        "FieldID": "string",
        "LeftAnchor": "string",
        "TopAnchor": "string",
        "BottomAnchor": "string",
        "AlternateAnchor": "string",
        "AnchorMode": "string",
        "DataType": "string",
        "TargetDigitCount": "integer (int32)",
        "MinimumCharacterCount": "integer (int32)",
        "AllowNumericDigits": "boolean",
        "VerticalAlignmentType": "string",
        "HorizontalAlignmentType": "string",
        "TargetFieldWidth_Relative": "number (double)",
        "TargetFieldHeight_Relative": "number (double)",
        "TargetFieldHorizontalAdjustment": "number (double)",
        "TargetFieldVerticalAdjustment": "number (double)",
        "Ignore": [
          "string"
        ],
        "Options": "string"
      },
      "FieldValues": [
        {
          "Text": "string",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "BoundingPoints": [
            {
              "X": "integer (int32)",
              "Y": "integer (int32)"
            }
          ],
          "ConfidenceLevel": "number (double)"
        }
      ]
    }
  ],
  "TableValueExtractionResults": [
    {
      "TableDefinition": {
        "TableID": "string",
        "ColumnDefinitions": [
          {
            "ColumnID": "string",
            "TopAnchor": "string",
            "AnchorMode": "string",
            "DataType": "string",
            "MinimumCharacterCount": "integer (int32)",
            "AllowNumericDigits": "boolean"
          }
        ],
        "TargetTableHeight_Relative": "number (double)",
        "TargetRowHeight_Relative": "number (double)"
      },
      "TableRowsResult": [
        {
          "TableRowCellsResult": [
            {
              "ColumnID": "string",
              "CellValues": [
                {
                  "Text": "string",
                  "XLeft": "integer (int32)",
                  "YTop": "integer (int32)",
                  "Width": "integer (int32)",
                  "Height": "integer (int32)",
                  "BoundingPoints": [
                    {
                      "X": "integer (int32)",
                      "Y": "integer (int32)"
                    }
                  ],
                  "ConfidenceLevel": "number (double)"
                }
              ]
            }
          ]
        }
      ]
    }
  ],
  "Diagnostics": [
    "string"
  ],
  "BestMatchFormSettingName": "string"
}

FieldResult: object

A pairing of a target field and the actual value read from the form

TargetField: FormFieldDefinition

Target field to extract from the form

FieldValues: OcrPhotoTextElement

Result field value(s) extracted

OcrPhotoTextElement
Example
{
  "TargetField": {
    "FieldID": "string",
    "LeftAnchor": "string",
    "TopAnchor": "string",
    "BottomAnchor": "string",
    "AlternateAnchor": "string",
    "AnchorMode": "string",
    "DataType": "string",
    "TargetDigitCount": "integer (int32)",
    "MinimumCharacterCount": "integer (int32)",
    "AllowNumericDigits": "boolean",
    "VerticalAlignmentType": "string",
    "HorizontalAlignmentType": "string",
    "TargetFieldWidth_Relative": "number (double)",
    "TargetFieldHeight_Relative": "number (double)",
    "TargetFieldHorizontalAdjustment": "number (double)",
    "TargetFieldVerticalAdjustment": "number (double)",
    "Ignore": [
      "string"
    ],
    "Options": "string"
  },
  "FieldValues": [
    {
      "Text": "string",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "BoundingPoints": [
        {
          "X": "integer (int32)",
          "Y": "integer (int32)"
        }
      ],
      "ConfidenceLevel": "number (double)"
    }
  ]
}

TableResult: object

The result of reading a table via OCR from a form

TableDefinition: FormTableDefinition

The input table definition for reference

TableRowsResult: TableRowResult

Rows of data in the table

TableRowResult
Example
{
  "TableDefinition": {
    "TableID": "string",
    "ColumnDefinitions": [
      {
        "ColumnID": "string",
        "TopAnchor": "string",
        "AnchorMode": "string",
        "DataType": "string",
        "MinimumCharacterCount": "integer (int32)",
        "AllowNumericDigits": "boolean"
      }
    ],
    "TargetTableHeight_Relative": "number (double)",
    "TargetRowHeight_Relative": "number (double)"
  },
  "TableRowsResult": [
    {
      "TableRowCellsResult": [
        {
          "ColumnID": "string",
          "CellValues": [
            {
              "Text": "string",
              "XLeft": "integer (int32)",
              "YTop": "integer (int32)",
              "Width": "integer (int32)",
              "Height": "integer (int32)",
              "BoundingPoints": [
                {
                  "X": "integer (int32)",
                  "Y": "integer (int32)"
                }
              ],
              "ConfidenceLevel": "number (double)"
            }
          ]
        }
      ]
    }
  ]
}

FormFieldDefinition: object

Definition of a form field for OCR data extraction from images

FieldID: string

The identifier of the field; use this to identify which field is being referenced. Set to SkipField if you do not wish to return the value of this field in the result.

LeftAnchor: string

Optional - the left-hand anchor of the field

TopAnchor: string

Optional - the top anchor of the field

BottomAnchor: string

Optional - the bottom anchor of the field

AlternateAnchor: string

Optional - alternate match text for the specified anchor

AnchorMode: string

Optional - the matching mode for the anchor. Possible values are Complete (requires the entire anchor to match), Partial (allows only part of the anchor to match), and Horizontal (the anchor must be laid out horizontally). Default is Partial.

DataType: string

The data type of the field; possible values are INTEGER (Integer value), STRING (Arbitrary string value, spaces are permitted), DATE (Date in a structured format), DECIMAL (Decimal number), ALPHANUMERIC (Continuous alphanumeric string with no spaces), STRINGNOWHITESPACE (A string that contains no whitespace characters), SERIALNUMBER (A serial-number style string that contains letters and numbers, and certain symbols; must contain at least one number), ALPHAONLY (Alphabet characters only, no numbers or symbols or whitespace)

TargetDigitCount: integer (int32)

Optional - the target number of digits in the field; useful for fixed-length fields

MinimumCharacterCount: integer (int32)

Optional - the minimum number of characters expected in the field value

AllowNumericDigits: boolean

Optional - set to false to block values that contain numeric digits, set to true to allow numeric digits

VerticalAlignmentType: string

Vertical alignment of target value area relative to the field anchor; Possible values are VCenter, Top, Bottom

HorizontalAlignmentType: string

Horizontal alignment of target value area relative to the field anchor; Possible values are Left, Right

TargetFieldWidth_Relative: number (double)

Optional - scale factor for target field width - relative to width of field title; a value of 1.0 indicates the target value area has the same width as the field value as occurring in the image; a value of 2.0 would indicate that the target value area has 2 times the width of the field value as occurring in the image.

TargetFieldHeight_Relative: number (double)

Optional - scale factor for target field height - relative to height of field title

TargetFieldHorizontalAdjustment: number (double)

Optional - horizontal adjustment in relative width of the field

TargetFieldVerticalAdjustment: number (double)

Optional - vertical adjustment in relative height of the field

Ignore: string[]

Optional - Ignore any result items that contain a partial or complete match with these text strings

string
Options: string

Optional - additional options that can be set for this field definition, separated by commas. Possible values are AllowMultiMatch (allow the same anchor to be matched to multiple fields)

Example
{
  "FieldID": "string",
  "LeftAnchor": "string",
  "TopAnchor": "string",
  "BottomAnchor": "string",
  "AlternateAnchor": "string",
  "AnchorMode": "string",
  "DataType": "string",
  "TargetDigitCount": "integer (int32)",
  "MinimumCharacterCount": "integer (int32)",
  "AllowNumericDigits": "boolean",
  "VerticalAlignmentType": "string",
  "HorizontalAlignmentType": "string",
  "TargetFieldWidth_Relative": "number (double)",
  "TargetFieldHeight_Relative": "number (double)",
  "TargetFieldHorizontalAdjustment": "number (double)",
  "TargetFieldVerticalAdjustment": "number (double)",
  "Ignore": [
    "string"
  ],
  "Options": "string"
}
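
For illustration, a FormFieldDefinition can be built as a plain JSON object. The sketch below constructs a hypothetical "invoice number" field anchored to the text "Invoice No"; the anchor text, field ID, and numeric values are assumptions chosen for the example, not values required by the API.

# Hypothetical field definition for an alphanumeric invoice number
invoice_number_field = {
    "FieldID": "InvoiceNumber",        # how the value is referenced in the result
    "LeftAnchor": "Invoice No",        # label text expected to the left of the value
    "AnchorMode": "Partial",           # default; part of the anchor may match
    "DataType": "ALPHANUMERIC",        # continuous alphanumeric string, no spaces
    "MinimumCharacterCount": 4,
    "AllowNumericDigits": True,
    "HorizontalAlignmentType": "Left",
    "VerticalAlignmentType": "VCenter",
    "TargetFieldWidth_Relative": 2.0,  # value area is twice the anchor width
}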

FormTableDefinition: object

Definition of a form table for OCR data extraction from images

TableID: string

Optional; the ID of the table

ColumnDefinitions: FormTableColumnDefinition

Definition of the columns in the table

FormTableColumnDefinition
TargetTableHeight_Relative: number (double)

Optional - scale factor for target table height - relative to the maximum height of the column headers

TargetRowHeight_Relative: number (double)

Optional - scale factor for target row height - relative to height of column header

Example
{
  "TableID": "string",
  "ColumnDefinitions": [
    {
      "ColumnID": "string",
      "TopAnchor": "string",
      "AnchorMode": "string",
      "DataType": "string",
      "MinimumCharacterCount": "integer (int32)",
      "AllowNumericDigits": "boolean"
    }
  ],
  "TargetTableHeight_Relative": "number (double)",
  "TargetRowHeight_Relative": "number (double)"
}
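
Similarly, a FormTableDefinition is a JSON object listing its column definitions. The sketch below defines a hypothetical line-items table; the column anchors and relative heights are example assumptions and should be tuned to the actual form.

# Hypothetical table definition for an invoice line-items table
line_items_table = {
    "TableID": "LineItems",
    "ColumnDefinitions": [
        {"ColumnID": "Description", "TopAnchor": "Description",
         "AnchorMode": "Partial", "DataType": "STRING"},
        {"ColumnID": "Quantity", "TopAnchor": "Qty",
         "AnchorMode": "Partial", "DataType": "INTEGER"},
        {"ColumnID": "Amount", "TopAnchor": "Amount",
         "AnchorMode": "Partial", "DataType": "DECIMAL"},
    ],
    "TargetTableHeight_Relative": 10.0,  # table area is ~10x the column header height
    "TargetRowHeight_Relative": 1.0,     # each row is roughly one header height tall
}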

TableRowResult: object

One row of data in the resulting table

TableRowCellsResult: TableCellResult

Table cells in this row result

TableCellResult
Example
{
  "TableRowCellsResult": [
    {
      "ColumnID": "string",
      "CellValues": [
        {
          "Text": "string",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "BoundingPoints": [
            {
              "X": "integer (int32)",
              "Y": "integer (int32)"
            }
          ],
          "ConfidenceLevel": "number (double)"
        }
      ]
    }
  ]
}

FormTableColumnDefinition: object

Definition of a column within a table for OCR data extraction from images

ColumnID: string

The identifier of the column; use this to identify which column is being referenced

TopAnchor: string

Optional - the top anchor of the column heading

AnchorMode: string

Optional - the matching mode for the anchor. Possible values are Complete (requires the entire anchor to match) and Partial (allows only part of the anchor to match). Default is Partial.

DataType: string

The data type of the field; possible values are INTEGER (Integer value), STRING (Arbitrary string value, spaces are permitted), DATE (Date in a structured format), DECIMAL (Decimal number), ALPHANUMERIC (Continuous alphanumeric string with no spaces), STRINGNOWHITESPACE (A string that contains no whitespace characters), SERIALNUMBER (A serial-number style string that contains letters and numbers, and certain symbols; must contain at least one number), ALPHAONLY (Alphabet characters only, no numbers or symbols or whitespace)

MinimumCharacterCount: integer (int32)

Optional - the minimum number of characters expected in the cell value

AllowNumericDigits: boolean

Optional - set to false to block values that contain numeric digits, set to true to allow numeric digits

Example
{
  "ColumnID": "string",
  "TopAnchor": "string",
  "AnchorMode": "string",
  "DataType": "string",
  "MinimumCharacterCount": "integer (int32)",
  "AllowNumericDigits": "boolean"
}

TableCellResult: object

The recognition result of one cell in one row in a table of a form

ColumnID: string

The ID of the column

CellValues: OcrPhotoTextElement

Result cell value(s) extracted

OcrPhotoTextElement
Example
{
  "ColumnID": "string",
  "CellValues": [
    {
      "Text": "string",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "BoundingPoints": [
        {
          "X": "integer (int32)",
          "Y": "integer (int32)"
        }
      ],
      "ConfidenceLevel": "number (double)"
    }
  ]
}

FormDefinitionTemplate: object

Definition of a form template; use a form template definition to recognize the fields in a form with Cloudmersive OCR

FieldDefinitions: FormFieldDefinition

Field definitions in the template; a field is comprised of a key/value pair

FormFieldDefinition
TableDefinitions: FormTableDefinition

Table definitions in the template; a table consists of columns and rows in a two-dimensional layout; a common example of a table is an invoice

FormTableDefinition
Example
{
  "FieldDefinitions": [
    {
      "FieldID": "string",
      "LeftAnchor": "string",
      "TopAnchor": "string",
      "BottomAnchor": "string",
      "AlternateAnchor": "string",
      "AnchorMode": "string",
      "DataType": "string",
      "TargetDigitCount": "integer (int32)",
      "MinimumCharacterCount": "integer (int32)",
      "AllowNumericDigits": "boolean",
      "VerticalAlignmentType": "string",
      "HorizontalAlignmentType": "string",
      "TargetFieldWidth_Relative": "number (double)",
      "TargetFieldHeight_Relative": "number (double)",
      "TargetFieldHorizontalAdjustment": "number (double)",
      "TargetFieldVerticalAdjustment": "number (double)",
      "Ignore": [
        "string"
      ],
      "Options": "string"
    }
  ],
  "TableDefinitions": [
    {
      "TableID": "string",
      "ColumnDefinitions": [
        {
          "ColumnID": "string",
          "TopAnchor": "string",
          "AnchorMode": "string",
          "DataType": "string",
          "MinimumCharacterCount": "integer (int32)",
          "AllowNumericDigits": "boolean"
        }
      ],
      "TargetTableHeight_Relative": "number (double)",
      "TargetRowHeight_Relative": "number (double)"
    }
  ]
}
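
Putting the pieces together, the sketch below builds a FormDefinitionTemplate from the hypothetical field and table definitions above and submits an image for recognition over plain HTTP with the requests library. The endpoint path and the header used to carry the template are assumptions for illustration; consult the form recognition endpoint documentation earlier in this reference for the authoritative parameter names.

import json
import requests

API_KEY = "YOUR-API-KEY"

# Template composed from the sketches under FormFieldDefinition and FormTableDefinition
template = {
    "FieldDefinitions": [invoice_number_field],
    "TableDefinitions": [line_items_table],
}

with open("scanned_form.jpg", "rb") as f:
    response = requests.post(
        # Assumed endpoint path; verify against the endpoint section above
        "https://api.cloudmersive.com/ocr/photo/recognize/form",
        headers={
            "Apikey": API_KEY,
            # Assumed header name for the JSON-encoded template
            "formTemplateDefinition": json.dumps(template),
        },
        files={"imageFile": f},
    )
response.raise_for_status()
print_form_result(response.json())  # helper sketched under FormRecognitionResult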

PdfToTextResponse: object

Response from an OCR to text operation. Includes the confidence rating and converted text result.

Successful: boolean

True if successful, false otherwise

OcrPages: OcrPageResult

Page OCR results

OcrPageResult
Example
{
  "Successful": "boolean",
  "OcrPages": [
    {
      "PageNumber": "integer (int32)",
      "MeanConfidenceLevel": "number (float)",
      "TextResult": "string"
    }
  ]
}

OcrPageResult: object

PageNumber: integer (int32)

Page number of the page that was OCR-ed, starting with 1 for the first page in the PDF file

MeanConfidenceLevel: number (float)

Confidence level rating of the OCR operation; ratings above 80% are strong.

TextResult: string

Converted text string from the image input.

Example
{
  "PageNumber": "integer (int32)",
  "MeanConfidenceLevel": "number (float)",
  "TextResult": "string"
}
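
As a small illustration, the sketch below joins the per-page text of a parsed PdfToTextResponse in page order and reports each page's confidence level; pdf_text is a hypothetical helper name, and response is assumed to hold the deserialized JSON shown above.

# Join OCR-ed page text in page order and print the confidence per page
def pdf_text(response):
    if not response.get("Successful"):
        raise RuntimeError("OCR operation failed")
    pages = sorted(response.get("OcrPages") or [], key=lambda p: p["PageNumber"])
    for page in pages:
        print(f"Page {page['PageNumber']} mean confidence: {page['MeanConfidenceLevel']}")
    return "\n\n".join(page["TextResult"] for page in pages)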

PdfToWordsWithLocationResult: object

Response from an OCR to words with location operation. Includes the confidence rating and converted text result, along with the locations of the words in the pages.

Successful: boolean

True if successful, false otherwise

OcrPages: OcrPageResultWithWordsWithLocation

OCR page results

OcrPageResultWithWordsWithLocation
Example
{
  "Successful": "boolean",
  "OcrPages": [
    {
      "PageNumber": "integer (int32)",
      "Successful": "boolean",
      "Words": [
        {
          "WordText": "string",
          "LineNumber": "integer (int32)",
          "WordNumber": "integer (int32)",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "ConfidenceLevel": "number (double)",
          "BlockNumber": "integer (int32)",
          "ParagraphNumber": "integer (int32)",
          "PageNumber": "integer (int32)"
        }
      ]
    }
  ]
}

OcrPageResultWithWordsWithLocation: object

OCR results of a page, including words of text and their location

PageNumber: integer (int32)

Page number of the page that was OCR-ed, starting with 1 for the first page in the PDF file

Successful: boolean

True if successful, false otherwise

Words: OcrWordElement

Word elements in the image

OcrWordElement
Example
{
  "PageNumber": "integer (int32)",
  "Successful": "boolean",
  "Words": [
    {
      "WordText": "string",
      "LineNumber": "integer (int32)",
      "WordNumber": "integer (int32)",
      "XLeft": "integer (int32)",
      "YTop": "integer (int32)",
      "Width": "integer (int32)",
      "Height": "integer (int32)",
      "ConfidenceLevel": "number (double)",
      "BlockNumber": "integer (int32)",
      "ParagraphNumber": "integer (int32)",
      "PageNumber": "integer (int32)"
    }
  ]
}
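
Because each OcrWordElement carries its block, paragraph, line, and word numbers, the words can be regrouped into lines in reading order. The sketch below does so for one parsed OcrPageResultWithWordsWithLocation object; words_to_lines is a hypothetical helper name used only for illustration.

from collections import defaultdict

# Regroup word elements into lines and compute each line's top-left corner
def words_to_lines(page):
    lines = defaultdict(list)
    for word in page.get("Words") or []:
        key = (word["BlockNumber"], word["ParagraphNumber"], word["LineNumber"])
        lines[key].append(word)
    for key in sorted(lines):
        words = sorted(lines[key], key=lambda w: w["WordNumber"])
        text = " ".join(w["WordText"] for w in words)
        left = min(w["XLeft"] for w in words)
        top = min(w["YTop"] for w in words)
        yield text, (left, top)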

PdfToLinesWithLocationResult: object

Response from an OCR to lines with location operation. Includes the confidence rating and converted text result, along with the locations of the lines in the pages.

Successful: boolean

True if successful, false otherwise

OcrPages: OcrPageResultWithLinesWithLocation

OCR results for each page

OcrPageResultWithLinesWithLocation
Example
{
  "Successful": "boolean",
  "OcrPages": [
    {
      "PageNumber": "integer (int32)",
      "Successful": "boolean",
      "Lines": [
        {
          "LineText": "string",
          "Words": [
            {
              "WordText": "string",
              "LineNumber": "integer (int32)",
              "WordNumber": "integer (int32)",
              "XLeft": "integer (int32)",
              "YTop": "integer (int32)",
              "Width": "integer (int32)",
              "Height": "integer (int32)",
              "ConfidenceLevel": "number (double)",
              "BlockNumber": "integer (int32)",
              "ParagraphNumber": "integer (int32)",
              "PageNumber": "integer (int32)"
            }
          ]
        }
      ]
    }
  ]
}

OcrPageResultWithLinesWithLocation: object

OCR results of a page, including lines of text and their location

PageNumber: integer (int32)

Page number of the page that was OCR-ed, starting with 1 for the first page in the PDF file

Successful: boolean

True if successful, false otherwise

Lines: OcrLineElement

Line elements in the image

OcrLineElement
Example
{
  "PageNumber": "integer (int32)",
  "Successful": "boolean",
  "Lines": [
    {
      "LineText": "string",
      "Words": [
        {
          "WordText": "string",
          "LineNumber": "integer (int32)",
          "WordNumber": "integer (int32)",
          "XLeft": "integer (int32)",
          "YTop": "integer (int32)",
          "Width": "integer (int32)",
          "Height": "integer (int32)",
          "ConfidenceLevel": "number (double)",
          "BlockNumber": "integer (int32)",
          "ParagraphNumber": "integer (int32)",
          "PageNumber": "integer (int32)"
        }
      ]
    }
  ]
}
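
For line-level results, the bounding box of each line can be derived from the boxes of its words. The sketch below prints each line of a parsed OcrPageResultWithLinesWithLocation object together with its enclosing box; print_lines_with_boxes is a hypothetical helper name for illustration.

# Print each recognized line together with the bounding box of its words
def print_lines_with_boxes(page):
    for line in page.get("Lines") or []:
        words = line.get("Words") or []
        if not words:
            continue
        left = min(w["XLeft"] for w in words)
        top = min(w["YTop"] for w in words)
        right = max(w["XLeft"] + w["Width"] for w in words)
        bottom = max(w["YTop"] + w["Height"] for w in words)
        print(f"({left},{top})-({right},{bottom}): {line['LineText']}")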

GetPageAngleResult: object

Result of performing a get-page-angle operation

Successful: boolean

True if the operation was successful, false otherwise

Angle: number (double)

Angle of the page in radians; 0 represents perfectly horizontal

Example
{
  "Successful": "boolean",
  "Angle": "number (double)"
}
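
Since the angle is reported in radians, a typical follow-up is to convert it to degrees and counter-rotate the source image to deskew it. The sketch below uses Pillow for the rotation; the library choice, the rotation sign convention (assumed counter-clockwise-positive), and the white fill color are assumptions for illustration and should be verified against your own images.

import math
from PIL import Image  # Pillow, used here only for illustration

# Deskew an image using the Angle value from a GetPageAngleResult dict
def deskew(image_path, result, output_path):
    if not result.get("Successful"):
        raise RuntimeError("get-page-angle operation failed")
    degrees = math.degrees(result["Angle"])
    img = Image.open(image_path).convert("RGB")
    # Rotate in the opposite direction to straighten the page (assumed sign convention)
    img.rotate(-degrees, expand=True, fillcolor="white").save(output_path)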