Speech and Voice Recognition API Reference
Speech APIs enable you to recognize speech and convert it to text using advanced machine learning, and also to convert text to speech.
Swagger OpenAPI Specification | .NET Framework Client | .NET Core Client | Java Client | Node.JS Client | Python Client | Drupal Client
API Endpoint
https://api.cloudmersive.com
Schemes: https
Version: v1
Authentication
Apikey
API Key Authentication
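As a minimal sketch of how the API key might be attached, the Python snippet below builds (but does not send) a request against the endpoint above. It assumes the key travels in an HTTP header named Apikey, matching the scheme name listed here. The path /example/operation and the key value are placeholders for illustration, not part of this reference.

```python
import urllib.request

BASE_URL = "https://api.cloudmersive.com"  # API endpoint from this reference


def build_authenticated_request(path: str, api_key: str, data: bytes = b"") -> urllib.request.Request:
    """Build a POST request carrying the API key (assumed header name: Apikey)."""
    request = urllib.request.Request(BASE_URL + path, data=data, method="POST")
    request.add_header("Apikey", api_key)
    return request


# Hypothetical path and key, for illustration only; nothing is sent here.
request = build_authenticated_request("/example/operation", "YOUR-API-KEY")
print(request.full_url)
```

Building the request separately from sending it keeps the key handling in one place and makes it easy to inspect before any network call.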
Recognize
Recognize audio input as text using machine learning
Uses advanced machine learning to convert input audio, which can be mp3 or wav, into text.
Speech file to perform the operation on. Common file formats such as WAV and MP3 are supported.
OK
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
{
"TextResult": "string"
}
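The response shape above can be consumed as in the following sketch, which parses a JSON body and extracts TextResult. The helper name is hypothetical, and the sample body is made up for illustration rather than captured from the service.

```python
import json


def parse_recognition_result(body: bytes) -> str:
    """Return the TextResult field of a JSON-encoded recognition response."""
    payload = json.loads(body)
    if "TextResult" not in payload:
        raise ValueError("response has no TextResult field")
    return payload["TextResult"]


# Made-up sample body shaped like the response example above.
sample_body = b'{"TextResult": "hello world"}'
print(parse_recognition_result(sample_body))
```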
Speak
Perform text-to-speech on a string
Takes a string and a file format (mp3 or wav) as input and outputs an audio waveform in the requested format.
String input request
Request Content-Types: application/json, text/json, application/xml, text/xml, application/x-www-form-urlencoded
Request Example
{
"Format": "string",
"Text": "string"
}
OK
Response Content-Types: application/json, text/json, application/xml, text/xml
Response Example (200 OK)
"object"
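A request body like the one above could be assembled as follows. This is a sketch under stated assumptions: the function name is hypothetical, the format check simply mirrors the two values this reference names (mp3 and wav), and the mp3 default follows the TextToSpeechRequest schema description.

```python
import json

ALLOWED_FORMATS = ("mp3", "wav")  # the two formats this reference lists


def build_tts_body(text: str, audio_format: str = "mp3") -> bytes:
    """Serialize a TextToSpeechRequest-shaped JSON body."""
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format!r}")
    return json.dumps({"Format": audio_format, "Text": text}).encode("utf-8")


print(build_tts_body("The quick brown fox jumps over the lazy dog.", "wav").decode("utf-8"))
```

Rejecting an unknown format on the client side avoids a round trip for a request that the documented values suggest would not be accepted.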
Perform text-to-speech on a string
Takes a string and a file format (mp3 or wav) as input and outputs an audio waveform in the requested format.
The text you would like to convert to speech. Be sure to surround with quotes, e.g. "The quick brown fox jumps over the lazy dog."
File format to generate response in; possible values are "mp3" or "wav"
Request Content-Types: application/json, text/json, application/xml, text/xml, application/x-www-form-urlencoded
Request Example
"string"
OK
Response Content-Types: application/octet-stream
Response Example (200 OK)
"object"
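The instruction to surround the text with quotes suggests that the body is a JSON string literal when sent as application/json. Under that assumption, json.dumps takes care of the quoting and of escaping any quotes inside the text. The helper name below is hypothetical.

```python
import json


def encode_text_body(text: str) -> bytes:
    """Encode plain text as a quoted JSON string literal."""
    return json.dumps(text).encode("utf-8")


body = encode_text_body("The quick brown fox jumps over the lazy dog.")
print(body.decode("utf-8"))
```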
Schema Definitions
SpeechRecognitionResult: object
Result of recognizing speech
- TextResult: string
  Recognition result in text format
Example
{
"TextResult": "string"
}
TextToSpeechRequest: object
Input to a Text To Speech request
- Format: string
  File format for the output audio file: wav or mp3 (default: mp3)
- Text: string
  Text to be converted to speech
Example
{
"Format": "string",
"Text": "string"
}