In this article, let us see how to create an Azure Cognitive Services Text to Speech application in C# using the Bing Speech API.
Before getting started, we need an Azure subscription. Microsoft also offers a 30-day trial.
As usual, let us go step by step.
1. Log in to the azure.microsoft.com portal with a valid Azure account.
https://azure.microsoft.com/en-us/try/cognitive-services/
2. The home page will look like the one below.
3. Click on Login and log in with valid credentials.
4. We can see the list of APIs provided by Microsoft.
5. Add the Bing Speech API (the trial is valid for 30 days).
6. We will get the endpoint and two keys for the Speech API. Make a note of the endpoint and one of the keys. Either of the two keys will work.
7. With this, come back to Visual Studio.
8. Open Visual Studio and create a new project of type Console App.
9. Add the NuGet package for Azure Speech Recognition, “Microsoft.ProjectOxford.SpeechRecognition-x64”. The service was called Project Oxford before it became Cognitive Services, which is why the NuGet package name says ProjectOxford instead of Cognitive Services.
10. Rebuild the application.
11. Execute the application and we will see the exception shown below.
12. If we get an app break exception, refer to the previous article.
13. Now, edit Program.cs and paste the code below.
using CognitiveServicesTTS;
using System;
using System.Media;
using System.Threading;
using System.Threading.Tasks;

namespace CognitiveServices.Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            // Get the input from the user
            Console.WriteLine("Enter the Text to Speak : ");
            string input = Console.ReadLine();

            // Call the Speak method and wait for it to finish, so that any
            // exception surfaces here instead of being silently dropped
            Speak(input).GetAwaiter().GetResult();
            Console.ReadLine();
        }

        public static async Task Speak(string speech)
        {
            string accessToken;

            // The Authentication class used below is defined alongside the TTSClient class,
            // which is available in the attached source code.
            // Paste the key which we got from the Azure portal
            Authentication auth = new Authentication("**********");
            accessToken = auth.GetAccessToken();

            // Endpoint URL
            string uri = "https://speech.platform.bing.com/synthesize";

            var speaker = new Synthesize();

            // Subscribe to the OnAudioAvailable event
            speaker.OnAudioAvailable += Speaker_OnAudioAvailable;

            // Options that control the voice and the audio format
            var options = new Synthesize.InputOptions
            {
                RequestUri = new Uri(uri),
                Text = speech,
                VoiceType = Gender.Female,
                Locale = "en-US",
                VoiceName = "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)",
                OutputFormat = AudioOutputFormat.Riff16Khz16BitMonoPcm,
                AuthorizationToken = "Bearer " + accessToken
            };

            await speaker.Speak(CancellationToken.None, options);
        }

        // Triggered once the audio response is available
        private static void Speaker_OnAudioAvailable(object sender, GenericEventArgs<System.IO.Stream> e)
        {
            // Play the returned audio stream synchronously, then release it
            SoundPlayer player = new SoundPlayer(e.EventData);
            player.PlaySync();
            e.EventData.Dispose();
        }
    }
}
14. Execute the code. A console window will open and prompt for the text to speak.
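For readers who want to see roughly what the Authentication and Synthesize helpers do behind the scenes, here is a minimal sketch. It is not the code from the attached source. The class and method names (TtsRequestSketch, BuildSsml, BuildRequest, FetchTokenAsync) are hypothetical, and the token URL and the header names are assumptions based on the Bing Speech REST documentation of that period, so please verify them against your own subscription before relying on them. The idea is: exchange the key for a short-lived token, describe the text and the voice as an SSML document, and POST that document to the endpoint from step 13.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;

// Hypothetical helper for illustration only; not part of the TTSClient sample.
public static class TtsRequestSketch
{
    // Assumed token endpoint for the Bing Speech API.
    public const string TokenUri = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken";

    // Builds the SSML document that carries the text and the voice selection.
    public static string BuildSsml(string locale, string gender, string voiceName, string text)
    {
        var speak = new XElement("speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", locale),
            new XElement("voice",
                new XAttribute(XNamespace.Xml + "lang", locale),
                new XAttribute(XNamespace.Xml + "gender", gender),
                new XAttribute("name", voiceName),
                text)); // XElement escapes characters such as & and < in the text

        return speak.ToString(SaveOptions.DisableFormatting);
    }

    // Wraps the SSML in a POST request; the header names are assumptions.
    public static HttpRequestMessage BuildRequest(Uri endpoint, string accessToken, string ssml)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, endpoint)
        {
            Content = new StringContent(ssml, Encoding.UTF8, "application/ssml+xml")
        };
        request.Headers.TryAddWithoutValidation("Authorization", "Bearer " + accessToken);
        request.Headers.TryAddWithoutValidation("X-Microsoft-OutputFormat", "riff-16khz-16bit-mono-pcm");
        request.Headers.TryAddWithoutValidation("User-Agent", "TtsRequestSketch");
        return request;
    }

    // Exchanges the subscription key for a short-lived access token.
    public static async Task<string> FetchTokenAsync(HttpClient http, string subscriptionKey)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, TokenUri);
        request.Headers.TryAddWithoutValidation("Ocp-Apim-Subscription-Key", subscriptionKey);

        using (var response = await http.SendAsync(request))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A caller would first await FetchTokenAsync with the key from the portal, pass the result to BuildRequest together with the output of BuildSsml, send the request with HttpClient, and hand the response stream to SoundPlayer in the same way as Speaker_OnAudioAvailable does above. Building the SSML with XElement rather than string concatenation keeps user input such as "Tom & Jerry" from producing invalid XML.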
Happy Coding,
Sathish Nadarajan.