A Tour of Deep Learning With C++
Deep learning is a subfield of artificial intelligence that employs deep neural network architectures and novel learning algorithms to achieve state-of-the-art results in image classification, speech recognition, motion planning and other domains. While all machine learning algorithms are initially formulated as mathematical equations (the only programming language where single-letter variable names are encouraged), they must eventually be translated into a computer program. Moreover, because deep neural networks are often composed of many hundreds of millions of trainable parameters and operate on gigabytes of data, these programs have to be fast, lean and often distributed, and must squeeze every last ounce of performance out of modern CPUs, GPUs and even specialized hardware. In practice, this means machine learning algorithms are usually implemented in C or C++ under the hood, even though libraries like TensorFlow, Torch or Caffe expose APIs in Python or Lua to ease the process of research and speed up iteration.

This talk aims to break the single responsibility principle and do three things at once:
1. Give a sweeping introduction to the state of the art in deep learning,
2. Give examples of what it means to implement neural networks in C++, from an implementer's perspective,
3. Give examples of building deep learning models in C++, from a researcher's perspective.

Here, the distinction between building and implementing is that building means stacking together high-level modules to accomplish some machine learning task, while implementing means actually writing the CPU or GPU kernels that make the magic happen. The goal of the talk is for every attendee to walk away with a general understanding of the state and challenges of the field and, hopefully, to be in a position to implement and build their own deep learning models.
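To make the distinction between implementing and building a little more concrete, here is a minimal sketch in plain C++. It is not taken from the talk or from any particular library: the names `dense_forward`, `relu_forward`, `Module`, `Dense`, `ReLU` and `Sequential` are all hypothetical, and the "kernels" are naive CPU loops rather than anything tuned for performance. The two free functions stand in for the implementer's work, while the module classes show the kind of stacking a researcher would do.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Vector = std::vector<float>;

// "Implementing": a naive CPU kernel computing y = W x + b.
// W is stored row-major with bias.size() rows and input.size() columns.
Vector dense_forward(const Vector& weights, const Vector& bias,
                     const Vector& input) {
  const std::size_t rows = bias.size();
  const std::size_t cols = input.size();
  assert(weights.size() == rows * cols);
  Vector output(rows);
  for (std::size_t row = 0; row < rows; ++row) {
    float sum = bias[row];
    for (std::size_t col = 0; col < cols; ++col) {
      sum += weights[row * cols + col] * input[col];
    }
    output[row] = sum;
  }
  return output;
}

// "Implementing": an element-wise ReLU kernel, max(0, x).
Vector relu_forward(const Vector& input) {
  Vector output(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    output[i] = input[i] > 0.0f ? input[i] : 0.0f;
  }
  return output;
}

// "Building": a hypothetical module interface that hides the kernels.
struct Module {
  virtual ~Module() = default;
  virtual Vector forward(const Vector& input) const = 0;
};

// Fully connected layer that delegates to dense_forward.
class Dense : public Module {
 public:
  Dense(Vector weights, Vector bias)
      : weights_(std::move(weights)), bias_(std::move(bias)) {}
  Vector forward(const Vector& input) const override {
    return dense_forward(weights_, bias_, input);
  }

 private:
  Vector weights_;
  Vector bias_;
};

// Activation layer that delegates to relu_forward.
class ReLU : public Module {
 public:
  Vector forward(const Vector& input) const override {
    return relu_forward(input);
  }
};

// Container that applies its layers in the order they were added.
class Sequential : public Module {
 public:
  void add(std::unique_ptr<Module> layer) {
    layers_.push_back(std::move(layer));
  }
  Vector forward(const Vector& input) const override {
    Vector activation = input;
    for (const auto& layer : layers_) {
      activation = layer->forward(activation);
    }
    return activation;
  }

 private:
  std::vector<std::unique_ptr<Module>> layers_;
};
```

From the researcher's side, a model is then assembled with calls such as `model.add(std::make_unique<Dense>(weights, bias))` followed by `model.add(std::make_unique<ReLU>())`, and evaluated with `model.forward(input)`, without ever touching the loops inside the kernels. A real framework would replace those loops with vectorized, multi-threaded or GPU code behind the same kind of interface.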