A Linear Method for Non-Linear Work: Our Data Science Process

Description

All data scientists have some sort of process for moving a project from initial customer contact through to successful completion. Often this process is not formally documented, though we can tell when we are being asked to take shortcuts in it, even if we haven't described the steps to our customers.
In this talk, John Ehrlinger (Data Scientist, Algorithm & Data Science team) walks us through a 7-step process for data science, detailing how data science works along the way. These 7 steps were developed in response to a customer's request for insight into how a data science project would proceed, in an effort to estimate time to completion. The process closely parallels more formalized methods, with well-defined boundaries and goals and clear guidance on expected deliverables. We assume a linear approach toward project completion, similar to the waterfall method. However, we also understand that data science is still science, an exercise in discovery, so we allow returning to previously "completed" steps as new information is uncovered or becomes available.
 
TechNet blog post:

The Discussion

  • JulesK

    Nice. And I guess Feature extraction, kernel exploitation and PCA analysis etc in the mix of Data Analysis.

  • daniel

    Very nicely explained. My hobby is developing custom algorithms for the ML portion of the waterfall chart, and for my ML the steps are slightly different as the data exploration portion is mostly done in code as well. Overall an excellent video.
