Speech

Description

In this first episode of 'Context', we look at speech technologies: how we can make use of speech in modern Universal Windows Apps, and how we can use some of the additional Microsoft Cognitive Services for cross-platform work and beyond.

Here's a breakdown of the show;

Please note that the recording refers to 'Project Oxford'. These services were moved under 'Microsoft Cognitive Services' shortly after the episode was recorded, but at the time of publication the renaming has not yet reached the SDKs, so all the hyperlinks and code samples should work fine.

    The Discussion

    • Ian2

      Jolly good show chaps!

      Could this be used in conjunction with Cortana such that we might trap and act on our own words before Cortana gets a chance to act on them? 

      Or maybe could we have our own listener such that if we started 'Hey Bob' then our own "Bob App" gets exclusive use of the subsequent speech? 

      BTW The last few minutes are missing (video cuts off half way through discussion of LUIS)

    • mtaulty

      Thanks Ian - we will come back and talk about Cortana in a follow-on show but you can (to some extent) mix and match Cortana with speech recognition and synthesis. Generally, Cortana is about the Windows shell either launching your application with some parameters OR it can be about the Windows shell interacting with your application via calling your background tasks without your UI.

      The speech bits that we look at here are about what happens INSIDE your app.

      We'll look into the video problem ASAP but I've noticed that the HIGH quality is fine, the LOW quality is fine but the MEDIUM quality is missing the last 5 minutes of video or so.

      Mike.

    • Ian2

      @mtaulty: Thanks

    • Mandava

      Nice show!

      Does Text to Speech work in background tasks on the lock screen?

      I am having a hard time getting it to work when the app is tombstoned.

      My idea is simple: I want the app's background tasks to announce something whenever it happens, instead of buzzing and asking me to open the device to see what it is.

    • FrankLaVigne

      This is great!

    • mtaulty

       The video issue should be fixed - let us know if you encounter it again.

    • bruce

      Skynet is live!
      Seriously - you guys and this technology ROCK!

    • mtaulty

      @bruce: Thanks Bruce, glad you enjoyed it. Hopefully, another show coming next week.

      Mike

       

    • Spooner

      Thanks all. Do let us know what you want to see more of, in the comments here or on Twitter - @andspo and @mtaulty

    • PeterNann

      Hey - Really excited to see SRGS constrained recognition in the cloud API!

      I've been working in IVR Speech Reco for over 20 years (yes, really), and have been frustrated in the mobile space that all we could run are open/dictation models. (Well, there was AT&T until they pulled the pin recently...)

      This rocks. I'm off to develop a multi-platform speech app...

      Please tell me it supports Australian English... EDIT: Yes it does. Love your work!

    • PauloAboimPinto

      Nice ... finally a show about UWP development.

      That is great!

      Best of luck,
      Paulo Aboim Pinto

    • issam

      Hi,

      Any chance of using this stuff in WPF or Xamarin?

    • mtaulty

      @issam:

      Can you let us know which of the pieces that we talked about here you were interested in?

      In the video, we showed some UWP APIs like SpeechRecognizer and SpeechSynthesizer and we also showed some RESTful APIs from Microsoft's Cognitive Services. We also showed some pieces on Android.

      The RESTful services can be used from anywhere, including WPF and 'Xamarin' code (i.e. I would expect you to be able to code this into a portable class library, although recording audio might be a challenge that you have to overcome).

      For the UWP APIs: in general, there are a lot of UWP APIs that you can use from 'the desktop', and MSDN has a guide to that here and some work has been done here to make that more accessible.

      I'm not sure that UWP's SpeechRecognizer and SpeechSynthesizer are part of that list of APIs, but the .NET Framework already has speech APIs within System.Speech.Synthesis and System.Speech.Recognition, so I'd expect that you can get equivalent functionality that way.

      In terms of using UWP APIs 'from Xamarin', I'd expect that you would define an interface to abstract out the functionality that you want and then implement it once for UWP, once for iOS and once for Android. As long as all 3 platforms have what you need in terms of speech, you would be good to go. That work may well already exist out there somewhere.
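
      As a rough illustration of that interface idea, here is a minimal sketch. It is written in Python purely to show the shape; in a Xamarin project this would be a C# interface with one implementation per platform. The names SpeechService, FakeSpeechService and echo_once are hypothetical and do not come from any SDK.

```python
from abc import ABC, abstractmethod


class SpeechService(ABC):
    """Hypothetical abstraction over the speech features the app needs."""

    @abstractmethod
    def speak(self, text: str) -> None:
        """Synthesize the given text as speech."""

    @abstractmethod
    def recognize(self) -> str:
        """Return the next recognized phrase as text."""


class FakeSpeechService(SpeechService):
    """Stand-in implementation. A real one would wrap a platform's speech API
    (one class for UWP, one for iOS, one for Android)."""

    def __init__(self, canned_phrases):
        self._phrases = list(canned_phrases)
        self.spoken = []

    def speak(self, text: str) -> None:
        # Record the text instead of producing audio.
        self.spoken.append(text)

    def recognize(self) -> str:
        # Hand back the next canned phrase, or an empty string when exhausted.
        return self._phrases.pop(0) if self._phrases else ""


def echo_once(service: SpeechService) -> str:
    """Shared app logic that depends only on the abstraction."""
    heard = service.recognize()
    service.speak(f"You said: {heard}")
    return heard
```

      The shared code only ever sees SpeechService, so each platform can supply its own implementation, and a fake like the one above can be used for testing without a microphone.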

      Thanks

      Mike
