Developing Neural Networks Using Visual Studio

Duration: 1 hour, 13 minutes, 55 seconds


Slides (view online)

A neural network is an artificial intelligence technique that is based on biological synapses and neurons. Neural networks can be used to solve difficult or impossible problems such as predicting which sports team will win in a contest such as the Super Bowl. In a short and informal session, Dr. James McCaffrey, from Microsoft Research in Redmond, WA, will describe exactly what neural networks are, explain the types of problems that can be solved using neural networks, and demonstrate how to create neural networks using Visual Studio. You will leave this session with an in-depth understanding of neural networks and how they can be used to extract valuable intelligence from data.

Download sample code (demo): http://research.microsoft.com/NeuralNetworks/BackPropDemo.aspx
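The abstract promises a demonstration of creating neural networks from scratch. As a rough illustration of the core feed-forward computation such a demo centers on (a minimal Python sketch, not the speaker's C# demo code; all names and values are my own), a single-hidden-layer network is just weighted sums passed through an activation function:

```python
import math

def feed_forward(x, w_hidden, b_hidden, w_output, b_output):
    """Compute the outputs of a single-hidden-layer neural network.
    x: list of input values.
    w_hidden[j]: weights from all inputs to hidden node j; b_hidden[j]: its bias.
    w_output[k]: weights from all hidden nodes to output node k; b_output[k]: its bias."""
    # Hidden layer: weighted sum plus bias, squashed by tanh.
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    # Output layer: weighted sum plus bias (softmax omitted for brevity).
    return [sum(w * h for w, h in zip(ws, hidden)) + b
            for ws, b in zip(w_output, b_output)]

# Tiny example: 2 inputs, 2 hidden nodes, 1 output.
out = feed_forward([1.0, 2.0],
                   [[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0],
                   [[0.5, 0.6]], [0.1])
```

Training then consists of adjusting the weights and biases so the computed outputs match known target values.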

For more information, check out this course on Microsoft Virtual Academy:

Follow the discussion

  • Now that sounds like a really interesting topic. Can't wait!

  • (I am the speaker) After attending this talk, I believe you'll understand exactly what neural networks are, and have all the knowledge needed to create a neural network using C# and Visual Studio. At the end of the talk I will give attendees the link to a "secret" URL that has the complete source code to a high-quality neural network demo. James M

  • I enjoyed the topics and presentations.

  • (from the speaker) I'd like to thank all the Build 2013 attendees who sat in on the Neural Networks talk today. First, I'm grateful for the overflow turnout (300+). Second, I appreciate the patience of all those who had to stand through the entire talk. And third, my appreciation to everyone who stayed around after the talk and asked very useful and interesting questions during the Q&A.

  • Absolutely fantastic presentation! I learnt more from your talk in 60 minutes than I did in a semester studying machine learning (albeit 20-odd years ago).  Congratulations on the turnout!

  • I absolutely loved the presentation. I did leave with one question though (for now): how do neural networks compare to recommendations (such as mahout)? Are they alternative means of accomplishing the same thing or are they related somehow?

  • (speaker response to devMentalMadness' question) I've never used Mahout before, but I have worked with a similar internal-Microsoft system called Cosmos. The idea is that most traditional machine learning techniques (which includes things such as neural network classification, logistic regression, k-means clustering) assume a relatively small set of data -- typically hundreds or a few thousand lines of data -- and so many of these classical techniques assume that all data can be stored in RAM, often in a matrix data structure. But with Big Data, those classical techniques do not work. So, there are several efforts to adapt well-known machine learning algorithms to situations where data does not nicely fit into RAM. Mahout is one such project. JM

  • The talk was confusing, especially starting with the title. You could have replaced "Visual Studio" with anything you want in the title, like "Android Studio" or "RADX4" etc. I thought VS provided some API for that. Plus, the URL with the source code at the end has a copyright notice, so technically you can't use it.

  • Debiprasad Ghosh

    I believe the intersection of
    (a) people who know neural networks and
    (b) people who know software development
    is larger than you stated in your talk.

  • lesderid

    I loved watching the talk and the questions! It's a topic that's new to me and you really got me interested.

    Could you give some more pointers on how to get started with neural networks, without using the example source code? Is this feasible in a limited time-frame? If so, what do I need to know to write something similar to this from scratch?

    Again, thank you for the talk! :)

  • This was a *great* talk!  I would like to have seen a little more explanation of how the activation functions work and what actually happens during training.  Still, this presentation was WORLDS above a machine learning class I took in grad school, and I hope we continue to see more of this type of content at future conferences.

    I still remember my grad school professor launching into content 5-10 minutes into the start of the course with a slide full of various symbols and terms, without any explanation of what they meant.  Like you said, when so many different terms and symbols are used to represent the same concept (or even multiple symbols used to represent the same concept), learning and understanding this stuff is extremely difficult.  We didn't even get any real-world examples.  As you allude to, as a math-oriented researcher, the professor felt examples, code and practical considerations infringed on his beautiful theory!  The best part was when mid-semester he announced that we were entering the "theoretical" part of the course (all I remember was trying to do proofs surrounding the kernel trick).

    Bottom line: as you might imagine, as someone interested in Actually Building Stuff That Works, I can't express how much I appreciate a great explanation of some of these concepts.  More, please!

  • This was an awesome talk... I have just discovered your articles on Visual Studio Magazine yesterday, and now this talk - these were by far the most clear explanations of neural networks I have seen. I'm going to have to play with the code now.


  • (speaker reply to WeCarpoolCOM) Sorry the title confused you. The intent of the title was to indicate that the talk would describe developing neural networks from scratch, as opposed to the more common approach of using a tool like Weka. And good catch on the copyright notice - it was just boilerplate. I removed it and added a clarifying statement in the source code comments. 

  • (speaker reply to Debiprasad Ghosh) I wouldn't be at all surprised if you are correct and the number is greater than the order-of-magnitude guesstimate of 100 from three years ago. As I pointed out, it's not a question of technical ability, it's more about weeding out all the incorrect information that you'll find online. For example, I could not find a single example of a NN implementation that correctly uses cross-entropy error. Now that doesn't mean that thousands of developers haven't figured out how to code it. A good example of the confusion is at http://stackoverflow.com/questions/2930299/how-does-the-cross-entropy-error-function-work-in-an-ordinary-back-propagation-a
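For readers wondering what cross-entropy error actually computes, here is a minimal sketch (my own illustration, not code from the talk or the Stack Overflow thread; the function name and sample values are hypothetical), assuming a classifier that emits probability vectors and targets that are one-hot encoded:

```python
import math

def cross_entropy_error(targets, outputs):
    """Mean cross-entropy error over a data set.
    targets: list of one-hot target vectors.
    outputs: matching list of predicted probability vectors."""
    total = 0.0
    for t_vec, o_vec in zip(targets, outputs):
        # With one-hot targets, only the log-probability assigned to
        # the true class contributes to the sum.
        total += -sum(t * math.log(o) for t, o in zip(t_vec, o_vec) if t > 0)
    return total / len(targets)

# Two training items, three classes each.
err = cross_entropy_error([[0, 1, 0], [1, 0, 0]],
                          [[0.2, 0.7, 0.1], [0.6, 0.3, 0.1]])
```

The error shrinks toward zero as the predicted probability of each true class approaches 1.0, which is why it is a natural fit for classification with softmax outputs.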

  • (speaker reply to lesderid) Thank you for the nice words. Getting started with neural networks is somewhat difficult, not because there isn't enough information, but rather because there are too many resources. Additionally, the best place to start depends a lot upon your particular background. That said, I'll say that my favorite reference is the one I mentioned at the end of the talk: ftp://ftp.sas.com/pub/neural/FAQ.html

    And it's hard to say how feasible it is to write a NN from scratch - again it depends on your background.

  • (speaker reply to compupc1 [James]) As a former college professor myself, I can't tell you how many times I saw exactly what you described: a professor launches off, really talking more to himself, and not placing himself in the shoes of someone who wants to learn. The most brilliant people I've ever met have the ability to keep things simple. Anyway, glad you liked the presentation.

  • (speaker reply to mdpopescu [Marcel]) Thank you for the compliments. I think many of us in the software development community share the same love for learning and exploration.

  • Best presentation on NNs. I cannot express how helpful it will be for me. Thanks for such a wonderful talk.

  • (speaker reply to vinodhkumarm​) Thank you for the kind words. As I mentioned near the beginning of my talk, when I originally proposed the idea of a talk on neural networks at Build to the conference organizers and to my Microsoft managers, I think there was a little bit of skepticism on their part: I got the feeling that some people felt that neural networks might be too advanced a topic for the general software developer audience. But the turnout at Build showed there's quite a bit of interest in NNs.

  • Konstantin Tarkus, web and cloud solutions architect, programmer, hacker, entrepreneur

    Very nice presentation. It inspired a desire to learn more about this subject and write some code.

  • Geek

    Can you comment on the built-in NN in SQL Server 2012?

  • (speaker reply to Geek) I'm not the best person to give an objective comment on the neural network functionality in SQL Server 2012, but I'll toss out my personal opinion. I experimented a bit with the SQL Server NN feature, but to me it had somewhat of the feel of a square peg in a round hole. By that I mean I'm much more comfortable working with procedural code (especially C#) than with set-based statements like SQL, and NNs are inherently procedural to me.
    Additionally, as a developer, I like having total control over my code -- with a neural network developed using Visual Studio I can customize the NN in any way I like. But using canned NN functionality as in SQL Server or Weka isn't satisfying to me because I don't have that total control. A couple of my SQL-guru friends love the NN feature of SQL however. So I guess what I'm trying to say is that the NN feature in SQL just didn't feel natural to me because of my background, but people with more SQL experience might find the NN feature a nice addition. 

  • (speaker reply to Konstantin Tarkus) Thank you; glad you liked the talk.

  • Nigel Findlater

    This was a great session, very well presented. I hope we can have the pleasure of more of these types of presentations at future Build events.

  • (speaker response to Nigel Findlater) Thank you for the compliment. I've been an advocate for talks on machine learning and artificial intelligence at Build for several years and am glad the conference organizers decided to give this topic a chance. I'm already thinking about possible ML topics for Build 2014.

  • Thanks so much for presenting such a great session.

    For me this was hands down the best session that I saw at the conference.

    I really hope to see more content like this at future build conferences. 

    Also, giving us access to a working prototype to play with is a huge plus! (Can't wait to download and play with this. I have several problems where this may be of use.)

    Thanks again for making a complex topic like this understandable.

  • Fred Scefi

    A very good and informative session. Just watched this online and loved it. I will try to make good use of your code.
    Keep up the good work.

  • (speaker reply to motslots) Thank you for the compliment about the talk. MS Research is working on some incredible things and I hope to a.) be back at Build 2014 with another machine learning talk, and b.) see an additional talk or two from other MS Research speakers.

  • (speaker reply to Fred Scefi) Thank you for the nice words, and I hope you enjoy experimenting with the demo code.

  • James, thank you, very nice presentation.

    Can you share the results about the equivalency of two-layer and one-layer NNs that you mentioned in the Q&A session?

  • (speaker reply to sergey_tihon) Thank you for the compliment.

    The research paper that mathematically proved that a NN with a single hidden layer can approximate any function (with a few conditions attached) is "Multilayer Feedforward Networks are Universal Approximators" by Hornik, Stinchcombe, and White. See http://weber.ucsd.edu/~hwhite/pub_files/hwcv-028.pdf. Their paper was a follow-up to an earlier paper that proved a narrower result that applied to NNs with logistic-sigmoid activation functions ("Approximation by Superpositions of a Sigmoidal Function" by George Cybenko). A related theorem about activation functions is called Cover's Theorem.

    To be honest, there is a lot of uncertainty in my mind about how these ideas are connected to deep learning, which typically uses more than one hidden layer.
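For reference, the result the speaker cites can be stated informally as follows (my paraphrase of the Cybenko/Hornik theorem, not wording from the talk):

```latex
% Universal approximation (informal): for any continuous target function
% and any tolerance, some single-hidden-layer network comes that close.
\text{For any continuous } f : [0,1]^n \to \mathbb{R}
\text{ and any } \varepsilon > 0, \text{ there exist }
N \in \mathbb{N},\ \alpha_j, b_j \in \mathbb{R},\ w_j \in \mathbb{R}^n
\text{ such that}
\left| f(x) - \sum_{j=1}^{N} \alpha_j\, \sigma\!\left(w_j^{\top} x + b_j\right) \right|
< \varepsilon \quad \text{for all } x \in [0,1]^n,
```

where sigma is a sigmoidal activation function. Note that the theorem guarantees such a network exists; it says nothing about how to find its weights by training, which is a separate (and hard) problem.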

  • Ryan B

    This was an excellent presentation, thank you :D

    Recently I learned how to program genetic algorithms, but I didn't understand neural networks until now.

    The only part I don't understand: why are genetic algorithms used instead of back-propagation when they're more accurate?

  • (speaker reply to Ryan B) Your question is a bit difficult to answer. It turns out that training a NN is an "NP-complete" problem which, loosely speaking, means that the only way to find the best set of weights and bias values is to try every possible combination of values -- which is impossible. Therefore no training algorithm (back-propagation, particle swarm optimization, genetic algorithm) is guaranteed to find the best set of weights and bias values.

    Back-propagation is used most often because it is reasonably fast and has nice math properties. But back-propagation is extremely sensitive to the values of the initial weights, the learning rate, the (optional) momentum value, and the (optional) weight decay values. In short, no NN training algorithm is most accurate -- they all have strengths and weaknesses.
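To make the speaker's point about initial weights, learning rate, momentum, and weight decay concrete, a back-propagation update for a single weight typically looks something like this (a generic sketch under common textbook conventions, not the talk's demo code; all names and values are my own):

```python
def update_weight(weight, gradient, prev_delta, learn_rate=0.05,
                  momentum=0.01, weight_decay=0.0001):
    """One back-propagation update for a single weight.
    gradient: dError/dWeight, computed during the backward pass.
    prev_delta: the delta applied on the previous iteration (for momentum)."""
    delta = -learn_rate * gradient    # step downhill on the error surface
    delta += momentum * prev_delta    # optional momentum term smooths the path
    delta -= weight_decay * weight    # optional decay nudges weights toward zero
    return weight + delta, delta      # new weight, plus delta saved for next pass

# One update: weight 0.5, gradient 2.0, previous delta 0.1.
w, d = update_weight(weight=0.5, gradient=2.0, prev_delta=0.1)
```

Every value here (learning rate, momentum, decay, and the initial weights that produce the first gradients) changes the trajectory of training, which is why back-propagation is so sensitive to these free parameters.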

  • (comment from the speaker) I just reviewed this presentation and noticed I made a few mistakes while speaking. Most of the mistakes were relatively minor but I want to correct one mistake. At the end of the talk, while answering the very last question from the audience, I said that Cover's Theorem established that NNs are universal function approximators. I should have said Cybenko's Theorem (which was later extended by Hornik) instead. Cover's Theorem is related to Cybenko's Theorem but they are really two different things. JM

  • James, I am very disappointed by your omissions re the NN environment. You comment above about your experience with SQL's NN, yet your talk made no mention of that as an alternative.
    That work & related DM algorithms were developed by MSR. The papers they wrote on this topic were seen by the global DM research community as significant breakthroughs. They worked closely with the dev teams to incorporate all those learnings into the SQL Data Mining engine.
    Unlike the core algorithm you present here, they scale massively & solved many of the "limited by memory" & model-training issues most DM algorithms / products have.
    They also provide / solve a ton of other issues you'd need to think about when embedding a NN system into your code. E.g.: they have a pluggable interface that works with most standards (ADO, OLE DB); a language, DMX, that makes it easy to enhance, configure & use with no code change; tools that automate the training & evaluation of your model; the ability to tweak your model's parameters.
    And as it is a platform, it is easy to extend the DM experience by embedding your own algorithm into that platform to create new mining models.
    (Given you prefer your own algorithm, consider writing it as a DM plug-in & comparing your perf with what already exists. It frees you from the plumbing & allows you to focus on the bit you do best.)

    Yet you turned your back on it all, suggesting NN wasn't well documented (which it is) & talking about "the only thing going was a Java app" (there are heaps of products). Now you're encouraging attendees to completely reinvent the wheel. Instead of a project that could take hours/days (6 lines of code to embed into their app, a SQL report, etc.), they will take weeks or months doing everything from scratch & are still unlikely to get close to the multi-core performance, the parallel model training, the scale, or the benefit of the insight outlined in the many research papers published on this subject by the MSR folks.

    In isolation, it was a nice talk; you covered the technical details & background of NN well.

    But as a representative of MSR &/or Microsoft you failed to accurately brief these folks. And did your audience a disservice by implying that their only option was to start at square one.

    You are a smart man & do write great articles. You have great influence. Please be more responsible in the future.

  • (author reply to David_Lean) I would have loved to discuss many additional topics, including SQL, but the point of the talk was to discuss how to implement NNs in Visual Studio -- not to give an overview of existing NN tools. My role at Microsoft is purely technical; I leave marketing to our Marketing people.

  • (author comment on David_Lean's comment) I spent some time looking at DMX (http://technet.microsoft.com/en-us/library/ms132058.aspx), which you mention as a possibility for customizing SQL Server's DM functionality. Pretty interesting stuff, but I found it fairly difficult to grasp because of my lack of deep SQL knowledge, and I suspect I'd have a pretty steep learning curve (I'm imagining a scenario where I'd want to use some alternative numerical optimization algorithm, or maybe implement NN drop-out, or something like that). What really comes to mind here is that, for me at least, I prefer working in a mostly imperative programming language environment, where I use SQL just for storage, rather than working in a meta-SQL environment where most things are SQL-centric.



Comments closed

Comments have been closed since this content was published more than 30 days ago, but if you'd like to continue the conversation, please create a new thread in our Forums, or Contact Us and let us know.