Machine Learning: Replacing Traditional Prediction Models?

3 February 2017

The term crops up everywhere, and no analytics event seems complete without a workshop on this technology. Of course, we’re talking about Machine Learning. Everyone in the data science world is talking about it, but certainly not everyone is applying it (yet). What is Machine Learning really? And as a Marketing Intelligence Data Scientist, are you lagging behind if you are not using it yet? This blog provides an introduction to the world of self-learning algorithms and their application in a Data Scientist’s day-to-day work.

Artificial Intelligence

Machine Learning (ML) has its origins in the field of Artificial Intelligence, whose foundations were laid by Alan Turing in his article Computing Machinery and Intelligence (1950). In this article, Turing asked whether machines can think and quickly concluded that the question could not be answered without first defining the terms “machine” and “think”. Rather than defining these terms, he devised the Imitation Game, which has since become known as the Turing Test.

The Imitation Game is played by a man (A), a woman (B) and an interrogator (C). The interrogator sits in a different room from the man and the woman and can send messages to put questions to the two other players, whom he knows only as X and Y. The aim of the game is for the interrogator to work out which of the two is the man and which is the woman. A tries to mislead the interrogator (i.e. the man pretends to be the woman), while B tries to help the interrogator. Turing then asks what happens when a machine (computer) takes on the role of A, the imposter. Would the interrogator be mistaken as often as when the role of A is played by a man?

With this, Alan Turing established a basic principle of artificial intelligence, and to this day the Turing Test remains one of the best-known tests in this domain. We still see the principle at work in chatbots and in popular applications such as Siri on the iPhone. These applications are becoming increasingly intelligent, which continues to make our lives easier. But what technology is actually behind them and how does it work? Welcome to the world of ML.

Machine Learning

ML is a sub-area of artificial intelligence. The other sub-areas include Natural Language Processing (a computer understanding a language), Knowledge Representation (a computer relying on knowledge it acquired earlier in the “conversation”) and Automated Reasoning (a computer linking acquired knowledge to reach new conclusions). ML goes one step further and concentrates on the ability to adapt to new situations. It enables computers to adapt and to learn without being explicitly programmed for each situation. ML algorithms are designed to process new input continuously and to improve themselves as they do so. This means the algorithms are constantly evolving, which gives the technology major advantages over traditional techniques.

But what actually is ML and how different is it really from traditional data mining techniques? ML is a system that uses algorithms to turn large volumes of messy data into accessible, useful information. Instead of analysts extracting those insights from the data, ML uses the data to improve and develop the system itself. That is the major difference between ML and traditional techniques. A data scientist will normally develop a model based on insights from the data, but in ML this process is carried out mostly by the technology itself. This doesn’t make the data scientist redundant: the quality of the algorithms still depends on the quality of the input, and ensuring that quality remains a major task for a data scientist. The data scientist is just less intensively involved in developing the model itself. With ML, the application of the data is much less dependent on (although not independent of) the data scientist.

In traditional predictive modelling, the data scientist develops a model that maps a certain input to a specific output (e.g. the probability that the customer will cancel their contract with X). This means that the data scientist collects data, prepares it for model-building, builds several models, validates and evaluates them, selects the best one and writes code to score new records with it, so that it can be used for a marketing campaign, for example.
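
The traditional workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the customer data is synthetic, and the names `make_customer`, `fit_stump` and `churn_prediction`, as well as the single-feature threshold rules used as "models", are hypothetical choices made for this example, not a method taken from the sources.

```python
import random

random.seed(0)

def make_customer():
    """Return a synthetic customer as (complaints, months_inactive, churned)."""
    complaints = random.randint(0, 5)
    months_inactive = random.randint(0, 12)
    # Assumed ground truth for this toy data: inactivity drives churn, plus noise.
    churned = int(months_inactive + random.gauss(0, 2) > 6)
    return complaints, months_inactive, churned

# 1. Collect and prepare the data, then hold part of it back for validation.
data = [make_customer() for _ in range(400)]
train, validation = data[:300], data[300:]

def accuracy(rows, feature, threshold):
    """Share of rows where the rule 'feature > threshold' matches the churn label."""
    return sum(int(row[feature] > threshold) == row[2] for row in rows) / len(rows)

def fit_stump(rows, feature):
    """Find the threshold on one feature that maximises training accuracy."""
    thresholds = sorted({row[feature] for row in rows})
    return max(thresholds, key=lambda t: accuracy(rows, feature, t))

# 2. Build several candidate models (one single-feature rule each).
features = {"complaints": 0, "months_inactive": 1}
models = {name: fit_stump(train, index) for name, index in features.items()}

# 3. Validate on the held-out data and select the best model.
scores = {name: accuracy(validation, features[name], models[name]) for name in models}
best_name = max(scores, key=scores.get)

# 4. Write scoring code so the selected model can be applied to new customers.
def churn_prediction(customer):
    """Predict churn (1) or no churn (0) for a (complaints, months_inactive) pair."""
    return int(customer[features[best_name]] > models[best_name])

print("selected model:", best_name, "validation accuracy:", scores)
print("prediction for a long-inactive customer:", churn_prediction((0, 12)))
```

Every step here (which candidates to build, how to validate, which model to keep) is decided and coded by the analyst, which is exactly the part that ML shifts to the algorithm.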

This process is different with ML. The model is developed and improved by the ML algorithms themselves, not by the data scientist. Both input data and output data are supplied to the algorithms, which then form hypotheses about the relationship between the two in order to achieve a certain model performance. The algorithms capture the feedback from this performance and use it to generate and test new hypotheses or decision rules that improve the performance further.
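
As a rough illustration of this feedback loop, the sketch below fits a one-feature logistic regression with plain gradient descent. The data is synthetic, and the variable names, learning rate and number of iterations are assumptions chosen for the example. The "hypothesis" is the current pair of weights `(w, b)`; the "feedback" is the prediction error, which the loop uses to adjust the weights.

```python
import math
import random

random.seed(1)

# Synthetic input (x) and known output (y): y tends to be 1 when x is large.
xs = [random.uniform(0, 1) for _ in range(200)]
ys = [int(x + random.gauss(0, 0.1) > 0.5) for x in xs]

def predict(w, b, x):
    """Current hypothesis: estimated probability that y is 1 for input x."""
    return 1 / (1 + math.exp(-(w * x + b)))

def log_loss(w, b):
    """Model performance: average log loss over the data (lower is better)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(w, b, x)
        total += y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return -total / len(xs)

w, b = 0.0, 0.0          # initial hypothesis: no relationship between x and y
learning_rate = 0.5
history = [log_loss(w, b)]

for _ in range(200):
    # Feedback: how wrong is the current hypothesis, and in which direction?
    grad_w = sum((predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(predict(w, b, x) - y for x, y in zip(xs, ys)) / len(xs)
    # New hypothesis: adjust the weights against the error.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    history.append(log_loss(w, b))

print("loss before training:", history[0])
print("loss after training:", history[-1])
```

Nobody tells the loop where the decision boundary lies; it only receives inputs, outputs and a performance measure, and improves its own rule from there.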

It is a continuous process in which the algorithms are constantly checked for possible improvement based on new data. This also highlights the main advantages and disadvantages of ML.

One major advantage is that ML can offer a solution for complex data (e.g. unstructured or very dynamic data) and complex applications, e.g. fine-tuning your marketing message based on specific online customer behaviour. Another is that you always have access to the most up-to-date algorithms, which maximises their predictive power. Traditional models lose predictive power over time and eventually have to be updated by hand; ML does that constantly. Even so, it remains extremely important to review the calculation rules regularly, particularly if management relies strongly on the outcomes of the algorithms.

A major disadvantage is that ML places heavy demands on the data infrastructure if you want to get maximum efficiency out of the technology: improving the algorithms continuously requires a constant stream of online and offline data. That’s why ML, at least as far as Marketing Intelligence applications are concerned, is still mainly applied in e-commerce companies, as they have dynamic online data and websites that lend themselves well to ML. Finally, there is a risk that by automating so much you lose sight of the “why”: insights that might have been spotted if the data had been studied in more depth, and which could have been game-changing.
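
The difference between a model that is fitted once and one that keeps updating on new data can be made concrete with a small simulation. The sketch below is a toy example under stated assumptions: the data stream is synthetic, its true decision boundary is made to drift from 0.3 to 0.7, and the update rule (nudging a threshold after each mistake) is a deliberately simple stand-in for a real online learning algorithm.

```python
import random

random.seed(2)

def stream(n):
    """Yield (x, y) pairs whose true decision boundary drifts from 0.3 to 0.7."""
    for i in range(n):
        boundary = 0.3 + 0.4 * i / (n - 1)
        x = random.random()
        yield x, int(x > boundary)

frozen_threshold = 0.3   # model fitted once at the start and never updated
online_threshold = 0.3   # starts out identical, but is adjusted after every observation
step = 0.02              # size of each adjustment (an assumption for this sketch)

n = 5000
frozen_hits = 0
online_hits = 0
for x, y in stream(n):
    # Predict first, then learn from the outcome.
    frozen_pred = int(x > frozen_threshold)
    online_pred = int(x > online_threshold)
    frozen_hits += int(frozen_pred == y)
    online_hits += int(online_pred == y)
    # Feedback: a false positive raises the threshold, a false negative lowers it.
    online_threshold += step * (online_pred - y)

print("frozen model accuracy:", frozen_hits / n)
print("online model accuracy:", online_hits / n)
print("final online threshold:", online_threshold)
```

The frozen model gradually loses accuracy as customer behaviour shifts, while the updating model follows the drift. The simulation also shows the infrastructure cost mentioned above: the updating model only works if fresh labelled data keeps arriving.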

So is ML only suitable for dynamic data? Not necessarily, but its advantage is often greater there than for more static data. With traditional techniques it can be time-consuming to identify specific customer behaviour from a bank’s transaction data, for example. ML can be much more efficient and effective at this because it handles large quantities of complex data with relative ease and identifies patterns in them by itself. On the other hand, ML may offer less added value than traditional techniques when a prediction model needs to be developed on purely static data in a saturated market. So in many cases ML may be the data scientist’s next step in predictive modelling, but you are not necessarily lagging behind if you are not working with it yet.

In addition to its commercial applications, ML can also be especially interesting and fun for creative data scientists who also want to put their skills to the test outside the Marketing Intelligence domain. See a great example of this below.

Video Source: The Analytics Lab
Other sources:
Turing, A.M. (1950). Computing Machinery and Intelligence. Mind 59: 433-460.
Delsing, K. (2014). Onmenselijke voorspellingen doen met ‘machine learning’ [Making Inhuman Predictions with ‘Machine Learning’]. Accessed on 7 January 2017.
