Engineering Data Science at Automattic

Data for Breakfast

Most data scientists have to write code to analyze data or build products. While coding, data scientists act as software engineers. Adopting best practices from software engineering is key to ensuring the correctness, reproducibility, and maintainability of data science projects. This post describes some of our efforts in the area.

[Figure: One of many data science Venn diagrams. Source: Data Science Stack Exchange]

Different data scientists, different backgrounds

Data science is often defined as the intersection of many fields, including software engineering and statistics. However, as demonstrated by the above Venn diagram, viewing it as an intersection tends to be too exclusive – in reality, it’s a union of many fields. Hence, data scientists tend to come from various backgrounds, and it is common to encounter data scientists with no formal training in computer science or software engineering. According to Michael Hochster, data scientists can be classified into two types


Five Misconceptions About Data Science

Track 2 Analytics

Introduction

Getting Your Head Around Data Science

Data science has made its way into practically all facets of society – from retail and marketing, to travel and hospitality, to finance and insurance, to sports and entertainment, to defense, homeland security, cyber, and beyond. It is clear that data science has successfully sold its claim of “actionable insights from data,” and truth be told, it often delivers on that claim, adding value that would otherwise go untapped. As a result, data science is often looked to as a panacea, a Swiss army knife, a silver bullet, a must-have, [insert your own cliché here]. This has implications for both data scientists and the organizations they work with. On the one hand, data scientists now face a new kind of challenge, one that even the most advanced machine learning algorithms have yet to solve: managing expectations. And on the other…
