Face it... Windows 10 Vision and Face Detection Round-up
- Posted: Dec 23, 2015 at 6:00AM
With all the pictures we'll be taking, I thought it would be cool to do a round-up of vision APIs, facial detection and the like. We've got a ton of posts from three different authors...
A musical coding project that turns users into musical instruments using new media APIs.
At the end of this project you will have a simple Windows 10 app that uses face detection on a live video stream from a camera to trigger musical events. Users will be able to make music by moving their faces to different positions on the screen. This is accomplished by creating a grid of cells and assigning a unique sound to each cell. When a face is detected within a cell, the sound is triggered. The FaceTracker API makes it possible for multiple people to join as it is able to track several faces at one time.
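The grid idea described above can be sketched in a few lines. This is only an illustration of the mapping from a face position to a cell and its sound, not the project's actual code (which is a Windows 10 app built on FaceTracker). The `FaceBox` type, the function names and the sound labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FaceBox:
    """Bounding box of one detected face, in pixels (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def cell_for_face(face, frame_width, frame_height, columns, rows):
    """Return the (column, row) of the grid cell that contains the face centre."""
    center_x = face.x + face.width / 2
    center_y = face.y + face.height / 2
    # Scale the centre into grid units and clamp to the valid cell range,
    # so a face touching the frame edge still maps to an edge cell.
    column = min(int(center_x / frame_width * columns), columns - 1)
    row = min(int(center_y / frame_height * rows), rows - 1)
    return max(column, 0), max(row, 0)


def sounds_to_trigger(faces, frame_width, frame_height, columns, rows, sounds):
    """Look up the sound assigned to the cell of every detected face."""
    triggered = []
    for face in faces:
        column, row = cell_for_face(face, frame_width, frame_height, columns, rows)
        triggered.append(sounds[row][column])
    return triggered
```

Because every face in the frame is mapped independently, several people in front of the camera simply produce several entries in the returned list.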
Showcasing new APIs for image processing and low-latency audio.
Estimated time commitment: 2 hours
Watch this video and then complete the activity. Most of all don't forget to have fun!
- Visual Studio 2015 and Windows developer tooling
- Ensure you are using Windows 10 or better
- Source code
- A device that supports video capture.
Note: Features in this app are subject to change.
- FaceTracker – Detects faces in VideoFrame objects and tracks faces across subsequent video frames.
- Part of Windows.Media.FaceAnalysis namespace which provides APIs for face detection in bitmaps or video frames.
- AudioGraph class - Parent of all nodes that make up the graph.
- MediaCapture class – Provides functionality for capturing photos, audio, and videos from a capture device, such as a webcam.
I’ve got quite an old compact camera. It’s probably around 5 years old at this point and I often think that I should update it but, frankly, it does the job and so I continue to make use of it when I’m not just photographing things on my phone.
I remember that at the time that I bought the camera it was one of the first that I’d had which did face detection in the sense that it would draw a little box around the people that it saw in the photos and it would even try and identify them if you did some set up and gave it some names for familiar faces.
I’ve written quite a lot in the past around doing face detection with different technologies. For instance, there was this post which made use of the Kinect for Windows V2 in order to locate a face within video frames and monitor it for facial features such as eyes open/closed and so on. I also wrote this post around working with Intel’s RealSense SDK and analysing facial positions and features there.
But both of those approaches require special hardware and, in the RealSense case, can do quite high-fidelity facial recognition in the sense that depth data can be used to differentiate between a real face and a photograph of a face.
What if you don’t have that hardware? What if you just have a web cam?
I was looking into this in the light of two sets of APIs.
I wanted to experiment and so I thought that I’d combine both. I started on the device.
More posts from Mike:
After doing some samples using Face API for face detection and Emotion API for emotion detection, now is the time to review the capabilities provided by Vision API.
This Project Oxford service lets you analyze images. The result of the analysis includes information such as the categories associated with the image, a score for pornographic content, the dominant colours, etc.
For example, the following image is a collage with pictures of the London rugby world cup from a month ago. In addition to the analysis of faces and emotions, the 3rd column shows the results of the analysis with Vision API.
In it we can see that the detected category is outdoor, and in addition it has also detected faces.
Name : outdoor_sportsfield; Score : 0.7890625
Age : 17; Gender : Female
Age : 41; Gender : Male
Age : 10; Gender : Female
Age : 6; Gender : Female
In the case of my image in a Ford Mustang, again faces, emotions, and the car category are detected.
In upcoming posts I will go into the details of using this API. One interesting point is that there are already some NuGet packages for working with these APIs. They are not PCLs yet, so they can only be used in desktop projects, but with 10 minutes of work you can create your own PCL implementation...
In previous posts I wrote about one of the Project Oxford features: Vision API.
Using the API you can analyze an image and get different properties of it. For example:
- Image type: JPG, PNG, TIF, etc.
- Image size
- Type of image: clipart, line drawing, etc.
- Adult-content score
- Categories, which describe the content, such as “photo on the beach”, “sports field”, “mountain”. Each category comes with a Score between 0 and 1
- Dominant colors in the image
- Result of face detection. It uses FaceAPI for the face detection and for the analysis of the age and sex of each face
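As a small illustration of how the category scores might be used, the sketch below picks the best-scoring category above a threshold. The list-of-dicts shape and the `best_category` helper are assumptions made for this example, not the actual Vision API client types. The first entry reuses the score shown earlier for the rugby collage; the second entry is made up.

```python
def best_category(categories, minimum_score=0.5):
    """Return the highest-scoring category at or above minimum_score, or None."""
    candidates = [c for c in categories if c["score"] >= minimum_score]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c["score"])


categories = [
    {"name": "outdoor_sportsfield", "score": 0.7890625},
    {"name": "people_crowd", "score": 0.12},  # hypothetical entry for illustration
]

winner = best_category(categories)
if winner is not None:
    print(f"Name : {winner['name']}; Score : {winner['score']}")
```

The 0.5 threshold is arbitrary; a stricter or looser value may suit a given set of images better.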
For example, if we analyze the classic Windows XP background, we get this result
Vision API in Project Oxford also gives us the ability to perform optical character recognition on an image, which is usually known as OCR.
The result of the OCR process gives us the following information:
- the language of the detected text
- the area where the text has been detected
- the angle of the text
- a collection of lines within each area of detected text
- a collection of words per line
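The nesting described above (areas containing lines, lines containing words) can be flattened back into plain text with a short loop. This is a sketch that assumes the result has already been parsed into a dictionary; the key names `regions`, `lines`, `words`, `text`, `language` and `textAngle` are assumptions about that shape, and `ocr_text` is a hypothetical helper.

```python
def ocr_text(result):
    """Rebuild plain text from a nested OCR result (regions > lines > words)."""
    regions = []
    for region in result.get("regions", []):
        # Words of one line are joined with spaces, lines with newlines.
        lines = [
            " ".join(word["text"] for word in line.get("words", []))
            for line in region.get("lines", [])
        ]
        regions.append("\n".join(lines))
    # A blank line separates the detected text areas.
    return "\n\n".join(regions)


sample = {
    "language": "en",   # language of the detected text
    "textAngle": 0.0,   # angle of the text
    "regions": [
        {
            "lines": [
                {"words": [{"text": "Hello"}, {"text": "world"}]},
                {"words": [{"text": "again"}]},
            ]
        }
    ],
}

print(ocr_text(sample))
```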
... [Entire Post]
In yesterday’s post I wrote about how to create thumbnails using Vision API in Project Oxford. While there are several options in the .NET community for creating thumbnails, this one stands out because it creates the thumbnail with the main content of the original image in mind.
To do this, the process uses Vision API capabilities to detect “the main areas” in the image and, using these as the source, creates the required thumbnail. This matters when the thumbnail we ask for does not keep the aspect ratio of the original, for example when moving from a 4:3 image to a 16:9 format.
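To make the aspect-ratio point concrete, here is a minimal sketch of cropping around a point of interest. It is not the algorithm the service uses. It only assumes that some detector has already given us a focus point for the main area, and `crop_for_aspect` is a hypothetical helper.

```python
def crop_for_aspect(image_w, image_h, target_w, target_h, focus_x, focus_y):
    """Return (left, top, width, height) of the largest crop that has the
    target aspect ratio, centred on the focus point as far as the image
    bounds allow."""
    target_ratio = target_w / target_h
    if image_w / image_h > target_ratio:
        # Image is wider than the target shape: keep full height, trim width.
        crop_h = image_h
        crop_w = round(image_h * target_ratio)
    else:
        # Image is taller than the target shape: keep full width, trim height.
        crop_w = image_w
        crop_h = round(image_w / target_ratio)
    # Centre the crop on the focus point, then clamp it inside the image.
    left = min(max(round(focus_x - crop_w / 2), 0), image_w - crop_w)
    top = min(max(round(focus_y - crop_h / 2), 0), image_h - crop_h)
    return left, top, crop_w, crop_h


# A 400 x 300 (4:3) image cropped to 16:9 around a focus near the bottom edge.
print(crop_for_aspect(400, 300, 16, 9, 200, 280))
```

A plain centre crop would ignore the focus point entirely, which is exactly the case where the main content ends up cut off.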
In yesterday’s sample, I used as the original image a picture of Rapunzel and Martina at Disney in portrait mode, with a size of 346 x 518 pixels. Then I generated several thumbnails with the following sizes