Synergy among deep learning, security, and programming languages

Description

Deep learning has achieved great success in many application scenarios, such as image classification and autonomous driving, and many of these applications are security-sensitive. In this talk, we introduce some of our recent work at the intersection of deep learning and security, with applications to program analysis and program synthesis.

The talk has two parts. The first part covers our work on applying security techniques to analyze the security of deep learning systems. We demonstrate that it is possible to construct so-called adversarial examples that fool a machine learning model even when the underlying model is unknown, and that our approach can successfully construct adversarial examples against a commercial image classification system, clarifai.com. We further demonstrate that adversarial examples exist not only for image classification systems, but also for other models, such as generative models and reinforcement learning systems.
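To make the idea of attacking an unknown model concrete, here is a minimal sketch of one common transfer-style attack: craft the perturbation on a local surrogate model, then submit it to the target. This is not the method presented in the talk. The two linear classifiers, their weights, the input, and the step size `eps` are all made-up values chosen for illustration, and the fast gradient sign step is swapped in as a simple stand-in.

```python
import math

def score(weights, bias, x):
    """Linear score w.x + b of a toy binary classifier."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict(weights, bias, x):
    """Class 1 if the score is positive, else class 0."""
    return 1 if score(weights, bias, x) > 0 else 0

def sign(value):
    return (value > 0) - (value < 0)

def fgsm(weights, bias, x, label, eps):
    """One fast-gradient-sign step against a logistic model.

    For cross-entropy loss, the gradient with respect to the input
    is (sigmoid(score) - label) * weights.
    """
    prob = 1.0 / (1.0 + math.exp(-score(weights, bias, x)))
    grad = [(prob - label) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical target model: the attacker can query it but never reads these weights.
TARGET_W, TARGET_B = [1.0, -2.0, 0.5, 1.5], 0.0
# Hypothetical surrogate: a similar but not identical model the attacker trained locally.
SURROGATE_W, SURROGATE_B = [0.8, -1.5, 0.7, 1.0], 0.1

x = [0.5, -0.2, 0.3, 0.4]  # clean input, class 1 under both models
x_adv = fgsm(SURROGATE_W, SURROGATE_B, x, label=1, eps=0.5)

# The perturbation was computed only from the surrogate, yet it also
# changes the target's prediction.
print(predict(TARGET_W, TARGET_B, x), predict(TARGET_W, TARGET_B, x_adv))
```

In this toy setting the attack transfers because the two models have similar decision boundaries. Real image classifiers are nonlinear, and the perturbation budget is usually far smaller relative to the input range.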

The second part covers our work on applying deep learning techniques to enhance security. In particular, we study how to apply deep learning to analyzing the security of programs and to constructing secure programs automatically. For program analysis, we demonstrate that a structure2vec model can convert a binary program's control flow graph (CFG) into a vector representation, improving vulnerability search in terms of both efficiency (16,000x faster) and effectiveness (on average, 25 more vulnerabilities detected among the top-50 results). For program synthesis, we present a novel approach to input-output program synthesis. Our approach combines learning neural programs that operate a domain-specific non-differentiable machine with several improved reinforcement learning-based search techniques. Together, these enable synthesizing a program parser that is more complex than programs previously synthesized from input-output pairs, and our evaluation shows that the learned neural parsing program achieves 100% accuracy on test inputs that are 100x longer than the training samples. We view our work as an important step toward synthesizing programs that satisfy security properties.
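One way to see why a vector representation speeds up vulnerability search: once each function is a fixed-length vector, comparing a query against a corpus is a similarity computation instead of a graph-matching problem. The sketch below shows only that ranking step. It does not implement structure2vec; the function names and three-dimensional embeddings are invented for illustration, and cosine similarity is an assumed choice of metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, corpus):
    """Return (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings of functions found in firmware images.
# A real system would produce these from each function's CFG.
corpus = {
    "parse_header": [0.9, 0.1, 0.0],
    "copy_buffer": [0.1, 0.95, 0.2],
    "checksum": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a function with a known vulnerability.
query = [0.15, 0.9, 0.25]

ranking = rank_by_similarity(query, corpus)
for name, similarity in ranking:
    print(f"{name}: {similarity:.3f}")
```

With these made-up vectors, `copy_buffer` ranks first, so an analyst would inspect it before the other candidates. At scale, the same ranking can be served by an approximate nearest-neighbor index.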
