A leading fitness icon, during one of his now-famous workouts, stated that there are three cornerstones of ultimate fitness: speed, balance, and range of motion. Together they form the three pillars that support overall physical well-being.
During one of his grueling workouts, as I was convincing myself that today’s pain would be worth tomorrow’s strength, it dawned on me: much like fitness, data science also has foundational anchors that, when exercised in combination (pun intended), maximize analytic potential and results.
Cornerstone #1: Speed
In data science, we obviously don’t think about speed the same way as in fitness. Humans aren’t trying to calculate or predict faster by hand, but we do need the computers we rely on to perform computations ever more quickly, and we need our software to manage vast amounts of data seamlessly. In a world where complex models can be built in minutes (sometimes seconds), we must be prepared to leverage the tools at our disposal.
Consumers are creating more data points than ever before, and our ability to handle them with scalable solutions will shape our future success. The rise and adoption of machine learning techniques, artificial intelligence, and cloud environments will serve as the means to our end – equipping us to aggregate and synthesize all the data points into meaningful insights and recommendations. Our ability to provide expeditious results is critical to understanding the consumer behavioral ecosystem and simultaneously supplying marketers with the real-time information required to credibly navigate through it.
Cornerstone #2: Balance
Holding a one-arm plank for 45 seconds is quite an accomplishment, but getting there takes practice. You may start out wobbly and hold steady for 15 seconds. Gradually, you get to 30 seconds. Eventually, as your strength grows and your muscles adapt, you hit 45 seconds. The moves vary in difficulty, but each one targets distinct muscle groups.
In analytics, the problems we solve also vary in difficulty. Through practice and repeated exposure, we achieve analytical balance by recognizing that some problems require less complicated solutions (e.g. portraits), while others necessitate advanced techniques (e.g. machine learning algorithms). Knowing which solution to implement means knowing which of the three analytic phases the ask falls into.
Descriptive Analytics (Phase 1) helps us understand what has happened. We use portraits, reporting and cross-tabulations to understand patterns, behaviors and distributions. Simple summary statistics and aggregations can provide meaningful insight with minimal effort.
Predictive Analytics (Phase 2) helps us understand what will happen. Forecasting and predictive modeling identify likely future behaviors based on historical actions to help maximize spend and targeting efficiency.
Prescriptive Analytics (Phase 3) helps us understand what we should do. In this phase, we combine and leverage our learnings from the descriptive and predictive phases to help optimize recommendations.
Often, a business challenge cuts across all three phases, so it’s important to qualify each problem at the outset to identify the complexity involved in achieving the end deliverable.
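To make the three phases concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy purchase records, the segment names, the value and cost figures, and the function names `describe`, `predict` and `prescribe`. The "prediction" is deliberately naive (it reuses each segment's historical rate), so read it as a picture of how the phases build on each other, not as a modeling recommendation.

```python
from collections import defaultdict

# Hypothetical toy records (segment, purchased); made up for illustration only.
HISTORY = [
    ("loyal", 1), ("loyal", 1), ("loyal", 0), ("loyal", 1),
    ("new", 0), ("new", 1), ("new", 0), ("new", 0),
    ("lapsed", 0), ("lapsed", 0), ("lapsed", 0), ("lapsed", 1),
]


def describe(records):
    """Phase 1 (descriptive): cross-tabulate segment by purchase outcome."""
    table = defaultdict(lambda: {0: 0, 1: 0})
    for segment, purchased in records:
        table[segment][purchased] += 1
    return dict(table)


def predict(crosstab):
    """Phase 2 (predictive): treat each segment's historical purchase rate
    as a naive estimate of its future purchase probability."""
    return {
        segment: counts[1] / (counts[0] + counts[1])
        for segment, counts in crosstab.items()
    }


def prescribe(rates, value_per_sale, cost_per_contact):
    """Phase 3 (prescriptive): recommend contacting only the segments whose
    expected return per contact exceeds the cost of that contact."""
    return sorted(
        segment
        for segment, rate in rates.items()
        if rate * value_per_sale > cost_per_contact
    )


crosstab = describe(HISTORY)
rates = predict(crosstab)
# value_per_sale and cost_per_contact are assumed figures, not real economics.
targets = prescribe(rates, value_per_sale=10.0, cost_per_contact=4.0)

print("Cross-tab:", crosstab)
print("Estimated purchase rates:", rates)
print("Segments to target:", targets)
```

Each function consumes the output of the one before it, which is the point of the sketch: the prescriptive step is only as good as the descriptive and predictive work underneath it.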
Cornerstone #3: Range of Motion
Healthy joints are crucial to overall wellness. Working to increase mobility and flexibility conditions the body for a wide range of motion. Doing so also helps prevent injuries from simple, daily activities.
In data science, we must also be flexible and consider a wide range of techniques in our toolbox to analyze any given problem. For example, when we are working with a binary target (1/0), there are several techniques worth evaluating. Logistic regression is a robust approach that provides a significant amount of transparency. But sometimes we can get a better prediction from ensemble modeling algorithms such as Random Forests or Gradient Boosting, even though doing so may sacrifice interpretability. There is also often work to do before any analysis begins, because we frequently have large amounts of data that need to be condensed. Two very common methods of dimension reduction are Principal Component Analysis (PCA) and Factor Analysis (FA). Although the underlying algorithms differ, both condense many variables into a smaller set, and the results are often similar. (FA is often considered slightly easier to interpret.)
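As a rough sketch of what evaluating several techniques on a binary target can look like, the snippet below compares logistic regression, a random forest and gradient boosting with scikit-learn. The synthetic dataset, the 5-fold cross-validation, the AUC metric and the model settings are all assumptions chosen for illustration; your own data and evaluation criteria should drive the real comparison.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic binary-target data (an assumption; it stands in for real data).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Candidate techniques: one transparent model and two ensemble models.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Score every candidate the same way: mean AUC over 5 cross-validation folds.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst mean AUC.
for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean AUC = {auc:.3f}")
```

Scoring every model with the same folds and the same metric keeps the comparison fair. If the ensemble models beat logistic regression by only a small margin, the more transparent model may still be the better choice.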
Having the flexibility to compare multiple techniques that solve the same problem means we are no longer hamstrung (pun intended, again) by the limitations of each one individually. We are now able to take full advantage of all the tools at our disposal and boost the results for the organizations we support.
In data science, as in any discipline, being successful requires more than one skill or competency. It’s important to take an inventory of your current capabilities and ask yourself: Do you need to improve your computation speed and scalability? How balanced in complexity are the offerings you provide? Do you have a solid understanding of various data science techniques? As with any endeavor, being able to identify and bolster areas of weakness improves the overall value of data science in your organization. And hey, if you apply them to your fitness too, you may just improve your own physical well-being along the way.