This was a *great* talk! I would like to have seen a little more explanation of how the activation functions work and what actually happens during training. Still, this presentation was WORLDS above a machine learning class I took in grad school, and I hope we continue to see more of this type of content at future conferences.
I still remember my grad school professor launching into the material within 5-10 minutes of the start of the course, with a slide full of symbols and terms and no explanation of what any of them meant. Like you said, when so many different terms and symbols are in play (or multiple symbols are used for the same concept), learning and understanding this stuff is extremely difficult. We didn't even get any real-world examples. As you alluded to, being a math-oriented researcher, the professor felt that examples, code, and practical considerations infringed on his beautiful theory! The best part was when, mid-semester, he announced that we were entering the "theoretical" part of the course (all I remember is trying to do proofs surrounding the kernel trick).
Bottom line: as you might imagine, as someone interested in Actually Building Stuff That Works, I can't express how much I appreciate a great explanation of some of these concepts. More, please!