Book Review: Data Science for Business
What is data science? What does a data science agency do? What problems can a company face when hiring a data scientist?
The authors of Data Science for Business, Foster Provost (Professor of Information Systems at New York University’s Leonard N. Stern School of Business, PhD in Computer Science from the University of Pittsburgh) and Tom Fawcett (PhD in Machine Learning), do not just answer these and many other questions. They lay out fundamental principles that are tied not to a specific data analysis method or algorithm, but to looking at a problem as a whole and applying the analysis to the real world.
Data as an asset
Many companies today face an urgent need to collect and process information. However, some make the mistake of treating data as something abstract. Data is an asset that requires investment: you need to invest in its collection, preparation, processing, and analysis.
On the other hand, if you already have a data set, treat it as an asset that already has value: you can increase that value by processing the data and extracting important facts from it.
When you view data as an asset, you have the right approach to the problem of analysis as a whole.
Expected value
Most business problems can be viewed through the lens of expected value. The idea is that there is a set of possible outcomes, each with a probability and a value (a gain or a loss), and the expected value is the probability-weighted sum of those values. Consider a mass mailing: a company has to decide whether sending it makes sense. To answer that, it’s not enough to simply predict how many people will respond. You need to consider customers who:
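- will receive the mailing and respond;
- will receive the mailing but won’t respond;
- will not receive the mailing but will respond anyway (for example, by buying on their own);
- will not receive the mailing and will not respond.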
Only by taking all four groups into account, each with its probability and its value, can you come to the right conclusion.
In a nutshell, the idea goes like this: many data analysis problems can be reduced to a single financial metric using the concept of expected value.
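To make the mailing example concrete, here is a minimal sketch of the expected value calculation. All figures (mailing cost, profit per sale, and the two response probabilities) are hypothetical numbers chosen for illustration; they do not come from the book.

```python
def expected_value(outcomes):
    """Return the sum of probability * value over mutually exclusive outcomes."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * value for p, value in outcomes)


# Hypothetical figures, for illustration only.
MAILING_COST = 1.0          # cost of mailing one customer
PROFIT_PER_SALE = 100.0     # profit if the customer buys
P_BUY_IF_MAILED = 0.05      # chance of a purchase after receiving the mailing
P_BUY_IF_NOT_MAILED = 0.01  # chance of a purchase without the mailing

# Customer receives the mailing: responds or does not respond.
ev_mailed = expected_value([
    (P_BUY_IF_MAILED, PROFIT_PER_SALE - MAILING_COST),
    (1 - P_BUY_IF_MAILED, -MAILING_COST),
])

# Customer does not receive the mailing: buys anyway or does not buy.
ev_not_mailed = expected_value([
    (P_BUY_IF_NOT_MAILED, PROFIT_PER_SALE),
    (1 - P_BUY_IF_NOT_MAILED, 0.0),
])

# The mailing is worth sending only if it raises the expected value.
gain_from_mailing = ev_mailed - ev_not_mailed

print(f"Expected value if mailed:     {ev_mailed:.2f}")
print(f"Expected value if not mailed: {ev_not_mailed:.2f}")
print(f"Expected gain from mailing:   {gain_from_mailing:.2f}")
```

With these assumed numbers the mailing adds about 3.00 per customer. Change the probabilities or the cost and the answer can flip, which is why all four groups of customers have to enter the calculation.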
Ask the right questions
To better understand how the process of collecting, processing, and analyzing data will go, you need to ask the professional offering their services a few key questions:
- What exactly is the business problem to be solved?
- What business specifics is it related to?
- What data will the model be trained on?
The last question deserves the businessperson’s special attention. A model can adapt too closely to the data it was trained on (overfitting) and then perform poorly on new data. That’s why it’s important to train on one set of data and test on another.
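The following sketch illustrates why testing on separate data matters. It uses a deliberately extreme, hypothetical "model" that simply memorizes the label of every customer it has seen, and synthetic labels that are pure noise. The customer IDs, the 80/20 split, and the function names are assumptions made for the example.

```python
import random


def train_memorizer(rows):
    """'Train' by memorizing every (customer_id, label) pair in the data."""
    memory = dict(rows)
    labels = [label for _, label in rows]
    # For customers never seen before, fall back to the most common label.
    default = max(set(labels), key=labels.count)
    return lambda customer_id: memory.get(customer_id, default)


def accuracy(model, rows):
    """Share of rows for which the model's prediction matches the label."""
    return sum(model(cid) == label for cid, label in rows) / len(rows)


# Synthetic data: 1,000 customers with random labels, so there is nothing to learn.
rng = random.Random(0)
rows = [(customer_id, rng.random() < 0.5) for customer_id in range(1000)]
rng.shuffle(rows)

# Train on one part of the data, test on the part the model has never seen.
train_rows, test_rows = rows[:800], rows[800:]
model = train_memorizer(train_rows)

train_accuracy = accuracy(model, train_rows)
test_accuracy = accuracy(model, test_rows)

print(f"Accuracy on training data: {train_accuracy:.2f}")
print(f"Accuracy on held-out data: {test_accuracy:.2f}")
```

On the data it was trained on, the memorizer looks perfect. On the held-out customers it can only guess, so its accuracy drops to roughly that of a coin flip. A score reported on the training data alone would hide this completely.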
Don’t be afraid of the numbers – be afraid of being fooled
Among other things, the book contains material that, at first glance, may seem useful only to technical experts. Business readers can indeed skim such chapters, but there are a few key points worth knowing in order not to be misled.
Let’s imagine a situation: you hire a team of analysts without understanding the algorithms and models they use. Suppose your task is to identify fraudulent bank customers. Let’s assume that 3% of customers are fraudsters, while the remaining 97% are law-abiding. The analytics team offers you a model that is 85% accurate. At first glance, this may seem like a high percentage. However, even a trivial model that always says the customer is honest will be more accurate. Yes, it will be wrong in 3% of cases and will never flag a single fraudster, but it will be 97% accurate, not 85%. So you need to understand what data you are dealing with and what it already contains, in particular how common each class is.
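The arithmetic behind this example can be checked with a short sketch. It assumes a hypothetical portfolio of 1,000 customers with the 3% fraud rate from the text; the helper functions are written for this illustration only.

```python
def accuracy(predictions, labels):
    """Share of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def recall(predictions, labels):
    """Share of actual fraudsters (label True) that the model flags."""
    flagged = [p for p, y in zip(predictions, labels) if y]
    if not flagged:
        return 0.0
    return sum(flagged) / len(flagged)


# Hypothetical portfolio: 1,000 customers, 3% of them fraudsters (True).
labels = [True] * 30 + [False] * 970

# Trivial "model" that declares every customer honest.
always_honest = [False] * len(labels)

baseline_accuracy = accuracy(always_honest, labels)
baseline_recall = recall(always_honest, labels)

print(f"Accuracy of the always-honest model: {baseline_accuracy:.2f}")  # 0.97
print(f"Fraudsters caught by it:             {baseline_recall:.2f}")    # 0.00
```

The trivial model scores 97% accuracy while catching no fraud at all. Any proposed model should therefore be compared against this kind of baseline, and accuracy alone should not be the only number you ask for when one class is rare.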