Monday 16 April 2018

Speech Recognition on Office 365 SharePoint with Azure Cognitive Service Speech API

In this post, let us see how speech recognition can be implemented on SharePoint portals using the Microsoft Speech API and its JavaScript client-side libraries/SDKs. The example shows how speech can be converted to text on Office 365 SharePoint using Azure Cognitive Services.

In our previous post, we saw how to implement speech recognition on SharePoint portals using the browser's SpeechRecognition object. This post is for those who prefer to implement speech recognition using the Microsoft Speech API (Azure Cognitive Services Speech API).

Note: If you are interested only in the implementation, scroll down to the bottom section. :)


Azure Cognitive Service - Bing Speech API


The Microsoft Speech API supports both speech-to-text and text-to-speech conversion. Here we focus only on speech-to-text conversion. The Microsoft Speech API provides two approaches to speech-to-text conversion: one uses the REST API, and the other uses the client libraries. We will use the client libraries, which are shipped as speech SDK bundles.

Microsoft Azure provides various services and components that are available as part of the cloud. These services can be used on any portal, but here we focus on using them on SharePoint. Within Azure Cognitive Services, the speech service for speech recognition is available as the Bing Speech API, which we will use for this POC.

As stated in the previous post, speech recognition captures real-time audio from the microphone and converts it to the corresponding text. In this case, the speech-to-text conversion happens on the cloud platform with the help of the Azure Bing Speech API. This kind of interface helps in building voice-triggered apps such as chat bots.


Advantages of using Microsoft Speech API? 


There are several advantages of using Microsoft Speech API.


Implementation



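Let us look at a basic example of integrating the Bing Speech API into SharePoint portals.

Create the Azure Cognitive Services Bing Speech API resource from the Azure portal, and copy one of its subscription keys, which the script needs for authentication.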



Extracting the Azure Speech SDK bundle file: First, the SDK bundle JavaScript file needs to be extracted from the microsoft-speech-browser-sdk npm package (if you already have the speech.sdk.bundle.js file downloaded, skip this step).
  • Navigate to the required folder and install the speech browser SDK package using the npm command "npm install microsoft-speech-browser-sdk --save"
  • Then copy the speech.sdk.bundle.js file from the folder (<path_to_speech_SDK>/node_modules/microsoft-speech-browser-sdk/distrib/)


HTML snippet embedded into the SharePoint page: Next, reference the above bundle file (speech.sdk.bundle.js) from the page. The HTML snippet below shows the markup being rendered.
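
The markup is not reproduced here, so the following is only a minimal sketch of what such a panel could look like, written as a JavaScript function that returns the HTML so it can be injected from a script web part. The function name and all element ids (startBtn, stopBtn, statusDiv, hypothesisDiv, phraseDiv) are hypothetical; only the Start/Stop buttons and the status, current hypothesis, and results areas come from the description in this post.

```javascript
// Sketch only: returns the HTML for the speech panel as a string.
// All element ids below are hypothetical; rename them to match your own script.
function speechPanelMarkup() {
  return [
    '<button id="startBtn" type="button">Start</button>',
    '<button id="stopBtn" type="button" disabled>Stop</button>',
    '<div>Status: <span id="statusDiv">Idle</span></div>',
    '<div>Current hypothesis: <span id="hypothesisDiv"></span></div>',
    '<div>Results:</div>',
    '<textarea id="phraseDiv" rows="10" cols="60" readonly></textarea>'
  ].join("\n");
}
```

On the page, this could be placed into a container element with something like container.innerHTML = speechPanelMarkup(), with speech.sdk.bundle.js referenced through a regular script tag before the custom script runs.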


Custom JavaScript code snippet implementing the Azure Speech Service: The JavaScript snippet below shows the custom script written for processing the speech. The RecognizerStart function handles the multiple events raised during recognition. The recognizer object is bound to the Azure Cognitive Services Bing Speech service through the SDK.
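
As a rough illustration of the flow described above, here is a minimal sketch modelled on the sample shipped with the microsoft-speech-browser-sdk package. It assumes the bundle exposes a global SDK object (window.SDK), that the language is "en-US", and that a hypothetical ui object with setStatus, setHypothesis, and setResults methods writes to the page. The SDK class and event names are taken from that sample and should be checked against the SDK version you install.

```javascript
// Sketch only: SDK is assumed to be the object exposed by speech.sdk.bundle.js.
// userAgent is passed in (e.g. navigator.userAgent) to keep the function testable.
function recognizerSetup(SDK, subscriptionKey, userAgent) {
  const config = new SDK.RecognizerConfig(
    new SDK.SpeechConfig(
      new SDK.Context(
        new SDK.OS(userAgent, "Browser", null),
        new SDK.Device("SpeechSample", "SpeechSample", "1.0.00000"))),
    SDK.RecognitionMode.Interactive,   // short, single-utterance requests
    "en-US",                           // assumed language
    SDK.SpeechResultFormat.Detailed);  // ask for detailed results
  const authentication =
    new SDK.CognitiveSubscriptionKeyAuthentication(subscriptionKey);
  return SDK.CreateRecognizer(config, authentication);
}

// ui is a hypothetical object that updates the status, hypothesis and results areas.
function recognizerStart(recognizer, ui) {
  recognizer.Recognize(function (event) {
    switch (event.Name) {
      case "RecognitionTriggeredEvent":
        ui.setStatus("Initializing");
        break;
      case "ListeningStartedEvent":
        ui.setStatus("Listening");
        break;
      case "SpeechHypothesisEvent":
        // Partial text while the user is still speaking.
        ui.setHypothesis(event.Result.Text);
        break;
      case "SpeechSimplePhraseEvent":
      case "SpeechDetailedPhraseEvent":
        // Final result returned by the service.
        ui.setResults(JSON.stringify(event.Result, null, 3));
        break;
      case "RecognitionEndedEvent":
        ui.setStatus("Idle");
        break;
    }
  }).On(
    function () { /* request succeeded; nothing more to do */ },
    function (error) { ui.setStatus("Error"); console.error(error); });
}

// Stops recording by turning off the microphone audio source.
function recognizerStop(recognizer) {
  recognizer.AudioSource.TurnOff();
}
```

In the page script, the Start button's click handler would call recognizerStart with a recognizer created by recognizerSetup(window.SDK, "<your subscription key>", navigator.userAgent), and the Stop button's handler would call recognizerStop.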



The snapshot below shows the Azure Cognitive Services implementation on a SharePoint portal.




Note: The Start and Stop buttons can be used to start/stop recording speech. In the snapshot, Current hypothesis displays the running speech, Results shows the detailed results returned from the Azure Cognitive Service, and Status shows the current state (idle/listening).