Speech and Voice Recognition API Reference

API Endpoint

https://api.cloudmersive.com

Schemes: https

Version: v1

Authentication

Apikey

API Key Authentication

type: apiKey

name: Apikey

in: header

Recognize

Recognize audio input as text using machine learning

POST /speech/recognize/file

Uses advanced machine learning to convert input audio, which can be mp3 or wav, into text.

speechFile: file

in formData

Speech file to perform the operation on. Common file formats such as WAV and MP3 are supported.

Code Example:
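A minimal sketch in Python of how this call might be made using only the standard library. It assumes the file is sent as a single multipart/form-data part named speechFile and that the key goes in the Apikey header, as described above. The function names, the application/octet-stream part type, and the Accept header are illustrative choices, not part of this reference.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://api.cloudmersive.com"  # API endpoint from this reference


def build_recognize_request(api_key, filename, audio_bytes):
    """Build (but do not send) the multipart POST for /speech/recognize/file."""
    boundary = uuid.uuid4().hex
    # One form part named "speechFile", matching the formData parameter above.
    # The part's Content-Type is an assumption; the reference does not state one.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="speechFile"; '
        f'filename="{os.path.basename(filename)}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + "/speech/recognize/file",
        data=head + audio_bytes + tail,
        method="POST",
    )
    request.add_header("Apikey", api_key)
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    # Ask for JSON; the endpoint also lists XML response types.
    request.add_header("Accept", "application/json")
    return request


def recognize_file(api_key, path):
    """Send an audio file and return the TextResult field of the response."""
    with open(path, "rb") as handle:
        audio_bytes = handle.read()
    request = build_recognize_request(api_key, path, audio_bytes)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["TextResult"]
```

Calling recognize_file("YOUR-API-KEY", "speech.wav") would upload the file and return the recognized text. Building the request separately from sending it makes the request easy to inspect without network access.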

200 OK

SpeechRecognitionResult

Response Content-Types: application/json, text/json, application/xml, text/xml

Response Example (200 OK)

{
  "TextResult": "string"
}


Apikey

Speak

Perform text-to-speech on a string

POST /speech/speak/text/voice/basic/audio

Takes a string and a file format (mp3 or wav) as input and returns the synthesized speech as audio in that format.

TextToSpeechRequest

String input request

Code Example:
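A minimal sketch in Python, assuming the request body is sent as JSON with the Format and Text fields of TextToSpeechRequest. The function names are hypothetical. This reference documents the response only as "object", so the sketch returns the raw response bytes rather than guessing at a structure.

```python
import json
import urllib.request

BASE_URL = "https://api.cloudmersive.com"  # API endpoint from this reference
ALLOWED_FORMATS = ("mp3", "wav")


def build_speak_request(api_key, text, audio_format="mp3"):
    """Build (but do not send) the POST for /speech/speak/text/voice/basic/audio."""
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"audio_format must be one of {ALLOWED_FORMATS}")
    # Field names follow the TextToSpeechRequest schema.
    body = json.dumps({"Format": audio_format, "Text": text}).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + "/speech/speak/text/voice/basic/audio",
        data=body,
        method="POST",
    )
    request.add_header("Apikey", api_key)
    request.add_header("Content-Type", "application/json")
    return request


def speak(api_key, text, audio_format="mp3"):
    """Send the request and return the raw response body as bytes."""
    request = build_speak_request(api_key, text, audio_format)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

The format check happens before any network call, so an unsupported value fails fast on the client side.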

Request Content-Types: application/json, text/json, application/xml, text/xml, application/x-www-form-urlencoded

Request Example

{
  "Format": "string",
  "Text": "string"
}

200 OK

type: object

Response Content-Types: application/json, text/json, application/xml, text/xml

Response Example (200 OK)

"object"


Apikey

Speak

Perform text-to-speech on a string

POST /speech/speak/text/basicVoice/{format}

Takes a string in the request body and a file format (mp3 or wav) in the path, and returns the synthesized speech as audio in that format.

The text you would like to convert to speech. Be sure to surround it with quotes, e.g. "The quick brown fox jumps over the lazy dog."

format: string

in path

File format to generate response in; possible values are "mp3" or "wav"

Code Example:
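A minimal sketch in Python, assuming the body is a JSON-encoded (quoted) string and the format is passed as the last path segment. The function names and the output path are hypothetical. Because the response content type is application/octet-stream, the sketch writes the response bytes to a file unchanged.

```python
import json
import urllib.request

BASE_URL = "https://api.cloudmersive.com"  # API endpoint from this reference
ALLOWED_FORMATS = ("mp3", "wav")


def build_speak_path_request(api_key, text, audio_format):
    """Build (but do not send) the POST for /speech/speak/text/basicVoice/{format}."""
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"audio_format must be one of {ALLOWED_FORMATS}")
    # json.dumps on a str yields the text surrounded by double quotes.
    body = json.dumps(text).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/speech/speak/text/basicVoice/{audio_format}",
        data=body,
        method="POST",
    )
    request.add_header("Apikey", api_key)
    request.add_header("Content-Type", "application/json")
    return request


def speak_to_file(api_key, text, audio_format, out_path):
    """Send the request and save the returned audio bytes to out_path."""
    request = build_speak_path_request(api_key, text, audio_format)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

For example, speak_to_file("YOUR-API-KEY", "The quick brown fox jumps over the lazy dog.", "wav", "fox.wav") would save the generated audio as fox.wav.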

Request Content-Types: application/json, text/json, application/xml, text/xml, application/x-www-form-urlencoded

Request Example

"string"

200 OK

type: object

Response Content-Types: application/octet-stream

Response Example (200 OK)

"object"


Apikey

Schema Definitions

SpeechRecognitionResult: object

Result of recognizing speech

TextResult: string: Recognition result in text format

Example

{
  "TextResult": "string"
}

TextToSpeechRequest: object

Input to a Text To Speech request

Format: string: File format for output audio file: wav or mp3, default is mp3
Text: string: Text to be converted to speech

Example

{
  "Format": "string",
  "Text": "string"
}
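
The two schemas above could be mirrored on the client side roughly as follows. This is a sketch in Python: the class, method, and attribute names are hypothetical, and only the JSON field names (TextResult, Format, Text) and the mp3 default come from the definitions above.

```python
import json
from dataclasses import dataclass


@dataclass
class SpeechRecognitionResult:
    """Client-side mirror of the SpeechRecognitionResult schema."""
    text_result: str

    @classmethod
    def from_json(cls, payload):
        # Read the single documented field, TextResult.
        return cls(text_result=json.loads(payload)["TextResult"])


@dataclass
class TextToSpeechRequest:
    """Client-side mirror of the TextToSpeechRequest schema."""
    text: str
    audio_format: str = "mp3"  # documented default output format

    def to_json(self):
        # Serialize using the schema's own field names.
        return json.dumps({"Format": self.audio_format, "Text": self.text})
```

For instance, TextToSpeechRequest("Hello").to_json() produces a body with Format set to mp3 and Text set to Hello.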