Speech

Description

In this first episode of 'Context', we look at speech technologies: how to make use of speech in modern Universal Windows apps, and how to use some of the additional Microsoft Cognitive Services for cross-platform work and beyond.

Here's a breakdown of the show;

Please note that the recording refers to 'Project Oxford'. These services were moved under 'Microsoft Cognitive Services' shortly after the episode was recorded, but at the time of publication the renaming has not reached the SDKs, so all the hyperlinks and code samples should work fine.

The Discussion

  • Ian2

    Jolly good show chaps!

    Could this be used in conjunction with Cortana such that we might trap and act on our own words before Cortana gets a chance to act on them? 

    Or maybe could we have our own listener such that if we started 'Hey Bob' then our own "Bob App" gets exclusive use of the subsequent speech? 

    BTW The last few minutes are missing (video cuts off half way through discussion of LUIS)

  • mtaulty

    Thanks Ian - we will come back and talk about Cortana in a follow-on show but you can (to some extent) mix and match Cortana with speech recognition and synthesis. Generally, Cortana is about the Windows shell either launching your application with some parameters OR it can be about the Windows shell interacting with your application via calling your background tasks without your UI.

    The speech bits that we look at here are about what happens INSIDE your app.

    We'll look into the video problem ASAP but I've noticed that the HIGH quality is fine, the LOW quality is fine but the MEDIUM quality is missing the last 5 minutes of video or so.

    Mike.

  • Ian2

    @mtaulty: Thanks

  • Mandava

    Nice show!

    Does Text to Speech work in background tasks on the lock screen?

    I am having a hard time getting it working when the app is tombstoned.

    My idea is simple: I want the app's background tasks to announce something whenever it happens, instead of buzzing and asking me to open the device to see what it is.

  • FrankLaVigne

    This is great!

  • mtaulty

    The video issue should be fixed - let us know if you encounter it again.

  • bruce

    Skynet is live!
    Seriously - you guys and this technology ROCK!

  • mtaulty

    @bruce: Thanks Bruce, glad you enjoyed it. Hopefully, another show coming next week.

    Mike

  • Spooner

    Thanks all. Do let us know what you want to see more of, in the comments here or on Twitter - @andspo and @mtaulty

  • PeterNann

    Hey - Really excited to see SRGS constrained recognition in the cloud API!

    I've been working in IVR Speech Reco for over 20 years (yes, really), and have been frustrated in the mobile space that all we could run are open/dictation models. (Well, there was AT&T until they pulled the pin recently...)

    This rocks. I'm off to develop a multi-platform speech app...

    Please tell me it supports Australian English... EDIT: Yes it does. Love your work!

  • PauloAboimPinto

    Nice ... finally a show about UWP development.

    That is great!

    Best of luck,
    Paulo Aboim Pinto

  • issam

    Hi,

    Any chance of using this stuff in WPF or Xamarin?

  • mtaulty

    @issam:

    Can you let us know which of the pieces that we talked about here you were interested in?

    In the video, we showed some UWP APIs like SpeechRecognizer and SpeechSynthesizer and we also showed some RESTful APIs from Microsoft's Cognitive Services. We also showed some pieces on Android.

    You can use the RESTful services from anywhere, including WPF and 'Xamarin' code (i.e. I would expect you to be able to code this into a portable class library, although recording audio might be a challenge that you have to overcome).

    For the UWP APIs...In general, there are a lot of UWP APIs that you can use from 'the desktop' and MSDN has a guide to that here and some work has been done here to make that more accessible.

    I'm not sure that UWP's SpeechRecognizer and SpeechSynthesizer are part of that list of APIs but then the .NET Framework already has speech APIs within System.Speech.Synthesis and System.Speech.Recognition so I'd expect that you can get equivalent functionality that way.

    In terms of using UWP APIs 'from Xamarin' - I'd expect that you would define an interface to abstract out the functionality that you want, and then you'd implement it once for UWP, once for iOS and once for Android. As long as all 3 platforms have what you need in terms of speech, you would be good to go. That work may well already exist out there somewhere.

    Thanks

    Mike
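
The interface-per-platform approach described in the reply above can be sketched roughly as follows. This is only an illustration of the pattern, written in Python rather than the C# a Xamarin project would use. Every name in it (SpeechService, FakeSpeechService, greet_user) is hypothetical and not taken from the show or from any SDK. A real implementation would wrap each platform's own speech APIs behind the same interface; the fake one here just records calls so that the shared logic can run.

```python
from abc import ABC, abstractmethod


class SpeechService(ABC):
    """Hypothetical shared interface; names are illustrative only."""

    @abstractmethod
    def speak(self, text: str) -> None:
        """Synthesize the given text as audio."""

    @abstractmethod
    def recognize(self) -> str:
        """Listen once and return the recognized text."""


class FakeSpeechService(SpeechService):
    """Stand-in for one platform implementation (UWP, iOS or Android).

    A real class would call that platform's speech APIs. This one
    returns scripted input and records what it was asked to say.
    """

    def __init__(self, scripted_input: str) -> None:
        self.scripted_input = scripted_input
        self.spoken = []  # every string passed to speak(), in order

    def speak(self, text: str) -> None:
        self.spoken.append(text)

    def recognize(self) -> str:
        return self.scripted_input


def greet_user(speech: SpeechService) -> str:
    """Shared logic that depends only on the interface, not the platform."""
    speech.speak("What is your name?")
    name = speech.recognize()
    speech.speak(f"Hello, {name}")
    return name
```

With this shape, the shared code (greet_user) is written once, and each platform project supplies its own SpeechService implementation at startup.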
