A Linear Method for Non-Linear Work: Our Data Science Process

Description

All data scientists have some sort of process for moving a project from initial customer contact through to successful completion. Often this process is not formally documented, though we can point out when we are being asked to take shortcuts in it, even if we haven't described the steps to our customers.
In this talk, John Ehrlinger (Data Scientist, Algorithm & Data Science team) walks us through the 7-step process for data science, detailing how data science works along the way. These 7 steps were developed in response to a customer request for insight into how a data science project would proceed, in an effort to estimate time to completion. The process closely parallels more formalized methods, with well-defined boundaries and goals and clear guidance on expected deliverables. We assume a linear approach toward project completion, similar to the waterfall method. However, we also understand that data science is still science, an exercise in discovery, so we allow returning to previously "completed" steps as new information is uncovered or becomes available.
 
TechNet blog post:

    The Discussion

    • JulesK

      Nice. And I guess feature extraction, kernel exploitation, PCA analysis, etc. are in the mix of Data Analysis.

    • daniel

      Very nicely explained. My hobby is developing custom algorithms for the ML portion of the waterfall chart, and for my ML the steps are slightly different, as the data exploration portion is mostly done in code as well. Overall, an excellent video.
