The use of data in industry has exploded. Artificial Intelligence, Machine Learning, Big Data and Data Science are some of the most widely used, yet most confusing, terms in business today. However, in most companies, the scientific curation, analysis and utilization of data have taken two major forms.
The first major use of data comes from a product engineering perspective. It revolves around building product features such as recommendation engines and machine vision applications using iterative machine learning techniques such as deep learning. The focus in these kinds of applications is to learn from user behavior and consumption and adapt accordingly. I call these teams Machine Learning and AI teams.
The second type of team commonly found in companies works from a Decision Science perspective. I call these teams Decision Science teams because most of their applications focus on making a business decision rather than building iterative applications that provide a customer experience. These teams focus more on statistical and econometric analysis, experiment design, analytics, forecasting and simulation. The reason I love Decision Science so much is that it can be very easy to implement yet generate massive financial results for a company. In addition, Decision Science projects almost always provide the business case foundation for injecting more advanced ML and AI into a business process.
This post (or rather, this series of posts) focuses on Decision Science. I have seen a ton of articles and attention devoted to deep learning and other ML techniques. Most executives and business leaders find AI/ML projects exciting because that is where the buzz is. Decision Science is remarkably less “sexy”, but it is an important partner to AI/ML teams, because Decision Science is where the money is really made. My goal is to encourage readers of this post toward this side of the fence and hopefully provide a greater appreciation for what Decision Science can do for a business.
OK. Let's dive in!
Data Science problems should ideally start with a problem statement. Any data scientist worth their salt will tell you that this is the most important part of a data science project. In Decision Science, the problem statement IS the DECISION that needs to be made. While it sounds simple, you will be surprised at the number of times a client is unable to describe the exact decision that needs to be made. As with defining a problem statement for ML, it is hard (if not harder) to nail down the exact decision you are trying to optimize. I have found great success in using Design Sprint methods to extract this information. I will detail this method in an upcoming post.
The core of any Decision Science project is the use of statistics for two kinds of objectives: prediction and inference. Decision Science problems usually focus on inference rather than prediction. For example, a marketing executive might be more interested in how the different marketing levers influence a customer's purchase than in the actual prediction of whether a customer will buy or not. Similarly, a CFO might be interested in how macroeconomic factors affect the revenue of a company. All of these are inference problems. In any business, it's always hard to find all the data that can explain the variance in a response variable. However, there is usually enough data to make inferences about how known predictors influence a response variable.
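To make the inference idea concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the data is synthetic, the "discount" lever and "spend" response are hypothetical names, and the interval uses a simple normal approximation. The point is that even when the predictor explains only part of the variance, we can still estimate how strongly it moves the response and how uncertain that estimate is.

```python
import math
import random
import statistics


def slope_with_interval(x, y, z=1.96):
    """Fit y = a + b*x by ordinary least squares.

    Returns the slope b and an approximate 95% interval for it
    (normal approximation, reasonable for large samples).
    """
    n = len(x)
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance tells us how much of y the lever does NOT explain.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_var = sum(r ** 2 for r in residuals) / (n - 2)
    std_err = math.sqrt(residual_var / sxx)
    return slope, (slope - z * std_err, slope + z * std_err)


# Synthetic, hypothetical data: a discount percentage (the marketing lever)
# and customer spend. The noise is deliberately large, so the lever
# explains only part of the variation in spend.
rng = random.Random(0)
discount = [rng.uniform(0, 30) for _ in range(500)]
spend = [40 + 1.5 * d + rng.gauss(0, 20) for d in discount]

slope, (low, high) = slope_with_interval(discount, spend)
print(f"Each extra point of discount moves spend by about {slope:.2f} "
      f"(approx. 95% interval: {low:.2f} to {high:.2f})")
```

The question answered here is not "will this customer buy?" but "how much does this lever matter, and how sure are we?". In a real project you would typically use a statistics library and more than one predictor; the hand-rolled fit above just keeps the mechanics visible.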
The second big difference between Decision Science and AI/ML is that Decision Science projects may not involve predictive analysis at all. Decision Science focuses on providing information in a clear and comprehensible manner that allows a client to make a decision with confidence.
Let me explain with an example.
Most transactional data these days displays a long-tail distribution, so there are usually some outliers. Let's take the case where a company buys a new sales intelligence tool and tests it with some of its sellers. An analyst with a limited understanding of statistics pulls the average of the data and declares the tool a success. The executive in charge sees the results and expands the program across the org. However, sales drop, and the executive reaches out to the data scientists to help decide whether to continue with the software or roll it back. In no time, the data scientists do a descriptive analysis, look at the distribution and pull the median (which, unlike the mean, is not distorted by a handful of outliers). As a result, the program is shown to be a failure and is rolled back.
This is a prime example where no algorithm was involved, yet the impact was huge relative to the small effort it takes to draw a histogram and pull the right metric. All it involved was selecting the right KPI for the initiative.
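The scenario can be sketched in a few lines of Python. The numbers below are made up purely for illustration (hypothetical sales per seller, in thousands of dollars), with a single outlier in the pilot group standing in for the long tail.

```python
import statistics

# Hypothetical sales per seller, in thousands of dollars.
# The pilot group used the new tool; one seller closed an unusually large deal.
control = [10, 12, 11, 13, 12, 11, 10, 14, 12, 11]
pilot = [9, 10, 9, 11, 10, 9, 10, 11, 10, 95]


def summarize(values):
    """Return the two candidate KPIs for a group of sellers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


control_kpis = summarize(control)
pilot_kpis = summarize(pilot)

# The mean is pulled up by the single outlier, so the pilot looks like a win.
print("mean:  ", control_kpis["mean"], "vs", pilot_kpis["mean"])
# The median reflects the typical seller, who actually did slightly worse.
print("median:", control_kpis["median"], "vs", pilot_kpis["median"])
```

With this made-up data, the mean favors the pilot group while the median favors the control group, which is exactly the reversal described above. Plotting a quick histogram of each group before picking a metric would make the outlier obvious.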
I firmly believe in the importance of Decision Science as a component of any data science initiative. Sure, it's not about TensorFlow or Keras, and it deals more with simple regression and classification. However, the results can be huge, as is evident in the example above.
So how do you build good skills in decision analysis? The answer is simple: good knowledge of statistics, good business acumen, good programming and visualization skills and, most importantly, common sense.
In the next few posts in this series, I will equip you with the tools you need to help your organization make better decisions. These posts will include:
How to functionally decompose a business scenario to a decision that needs to be made
A case study on descriptive analysis
How to deal with the bias-variance trade-off in decision analysis
A case study on regression
A case study on classification
Building clear visualizations
Making clear and lucid presentations for statistical analysis