Supervised Learning: Clearly Explained

By Patrick Kinter | 10 August 2022 | Basics

What is Supervised Learning?

Supervised learning is a learning approach in which algorithms learn to make predictions or classifications. For this purpose, the algorithm builds a model that can best solve the task given to it, such as a decision tree or a regression analysis.
An artificial intelligence (AI) trained through supervised learning is capable of independently classifying texts or objects, for example, or making predictions (e.g. about price developments or the weather). Beforehand, it is trained with a large quantity of labelled training data. This means that supervised learning requires a lot of time and effort to collect and prepare the necessary data sets.
It is called "supervised learning" because the algorithm learns from the training data as if it were being supervised by a teacher who already knows the correct answers. Supervised learning is an essential part of machine learning and is therefore also called supervised machine learning.

How does supervised learning work?

For supervised learning, an algorithm is trained with a large set of training data. The data set consists of input data and the correct output (solutions). Each input value is labelled, that is, paired with its desired solution. This enables the learning algorithm to create a model (for example, a Random Forest or a decision tree) by recognising the relationships between the input and output data. On this basis, it creates forecasts for a new data set.
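The cycle of training on labelled data and then predicting for new inputs can be sketched in a few lines of plain Python. The example below is only an illustration under simplifying assumptions: the data (animal weights in kg) is made up, and the "model" is a decision stump, i.e. a decision tree with a single split, for exactly two classes. The names `train_stump` and `predict` are hypothetical helpers, not part of any library.

```python
def train_stump(inputs, labels):
    """Learn one threshold on a single numeric feature (a one-split decision tree).

    Assumes exactly two classes and at least two distinct input values.
    """
    values = sorted(set(inputs))
    # Candidate thresholds lie halfway between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    first, second = sorted(set(labels))
    best = None
    for threshold in candidates:
        # Try both ways of assigning the two classes to the two sides.
        for below, above in ((first, second), (second, first)):
            errors = sum(
                (below if x <= threshold else above) != y
                for x, y in zip(inputs, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below, above)
    _, threshold, below, above = best
    return threshold, below, above


def predict(model, x):
    """Apply the learned threshold to a new, unlabelled input."""
    threshold, below, above = model
    return below if x <= threshold else above


# Made-up labelled training data: input (weight in kg) and correct output.
weights_kg = [3.5, 4.0, 5.0, 20.0, 25.0, 30.0]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

model = train_stump(weights_kg, labels)
print(model)
print(predict(model, 4.2), predict(model, 27.0))
```

The learned threshold is the "relationship" between input and output that the algorithm extracted from the labelled examples; real models such as a full decision tree or a Random Forest learn many such splits over many features.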

The quality of the model is determined using test procedures and metrics such as cross validation, confidence probability, accuracy or hit rate. As a rule, the more training data is available, the better the results the algorithm can deliver. The learning process is repeated until the model provides satisfactory solutions. Once the training phase is complete, the model can apply what it has learned to unknown input data and make a prediction or classification.
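As a rough illustration of how cross validation and accuracy fit together, the following sketch splits a small made-up data set into three folds, trains on two of them and tests on the third, and averages the resulting accuracies. The classifier is a simple 1-nearest-neighbour rule chosen for brevity; the function names and the data are assumptions made for this example only.

```python
def nearest_neighbour_predict(train_x, train_y, x):
    """Predict the label of the closest training example (1-nearest-neighbour)."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[closest]


def cross_validated_accuracy(xs, ys, folds):
    """Average accuracy over `folds` splits; every example is tested exactly once.

    Assumes len(xs) >= folds so that no test fold is empty.
    """
    accuracies = []
    for fold in range(folds):
        test_idx = [i for i in range(len(xs)) if i % folds == fold]
        train_idx = [i for i in range(len(xs)) if i % folds != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(
            nearest_neighbour_predict(train_x, train_y, xs[i]) == ys[i]
            for i in test_idx
        )
        # Accuracy = share of test examples that were classified correctly.
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / folds


# Made-up data: weight in kg and label; the 12 kg animal is a small dog.
xs = [3.5, 20.0, 4.0, 25.0, 5.0, 30.0, 4.5, 22.0, 12.0]
ys = ["cat", "dog", "cat", "dog", "cat", "dog", "cat", "dog", "dog"]

accuracy = cross_validated_accuracy(xs, ys, folds=3)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Because each example is held out once, the averaged accuracy says more about performance on unseen data than the accuracy on the training set itself.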

Supervised Learning Algorithms

In supervised learning, algorithms are divided according to the two types of problem they solve: classification and regression.


In classification, the algorithm aims to assign the input data to specific categories. To do this, it recognises certain features and patterns within the data set and tries to find similarities and differences in order to make a corresponding classification. Categories can be, for example, "cat" and "dog" or "green" and "orange".

Examples of classification algorithms:

- Decision trees
- Random Forest
- Linear classifiers
- Naive Bayes classifier
- k-nearest-neighbour classification
- Support Vector Machine


In regression, the algorithm tries to identify relationships between dependent and independent variables, and it deals with continuous data. Regression algorithms are primarily used for forecasts, such as election and purchase forecasts or the prediction of the price development of a property.

Examples of regression algorithms:

- Linear regression
- Logistic regression (despite its name, mostly used for classification)
- Polynomial regression
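
To make the idea of regression concrete, here is a minimal sketch of simple linear regression fitted with the ordinary least-squares formulas. The property data (floor area in square metres, price in thousands) is invented for illustration and deliberately lies on a straight line; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented training data: independent variable (area) and dependent variable (price).
areas_m2 = [50.0, 70.0, 90.0, 110.0]
prices_k = [150.0, 210.0, 270.0, 330.0]

slope, intercept = fit_line(areas_m2, prices_k)

# Forecast the (continuous) price for an area that was not in the training data.
predicted_price = slope * 100.0 + intercept
print(slope, intercept, predicted_price)
```

Unlike a classifier, the model returns a continuous value rather than a category.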

Why is supervised learning important?

Supervised learning is the most frequently used approach in machine learning. This is because it can solve many tasks efficiently and without complications. However, not all areas of machine learning can be covered with it.
The great advantage of supervised learning is that the output of the model can be specified very concretely. Thus, it is always known what the goal of the model is. For example, a classification algorithm can be specifically trained to recognise a certain type of street sign in images. The disadvantage is the large amount of work involved in collecting data and preparing it for training.

What are the advantages and disadvantages of supervised learning?


Advantages:

- The goal, or rather the output, of the algorithm is fixed from the beginning and can be directly determined and influenced.
- The categories can be very specific, and it is known in advance how many of them there are.
- Compared to the other types of machine learning, supervised learning is relatively easy to understand.
- Once the model is trained, it needs no further training to produce results. It simply works according to the learned formula.
- Supervised learning is generally very good at solving classification problems.


Disadvantages:

- The volume of training data must be very large in order to achieve good results.
- The model only knows the classes it was trained on. For example, if a classification algorithm is trained with only the classes "bird" and "mouse" and is later asked to classify a picture of a cat, it will inevitably classify the picture incorrectly.
- The data must be labelled; otherwise the algorithm cannot learn to classify it.
- Since supervised learning works in a comparatively simple way, it is less suitable for some complex machine learning tasks.

Areas of application of supervised learning

Supervised Machine Learning is used for classifications or forecasts, depending on the area of application. It is particularly popular for:

- Text classification
- Image classification
- Spam detection
- Face recognition
- Tumour detection
- Drug detection
- Predictive Maintenance
- Forecasts, for example of house prices or stock market prices
- Customer sentiment analysis
- Weather forecasts

Examples of supervised learning in practice

Supervised learning in medicine

Machine learning is being used in more and more areas of the medical sector. For some years now, supervised learning has also been applied in the early detection of cancer and its prognosis. It is used to develop models that can predict the course and treatment of cancer. In addition, trained algorithms can recognise important features in complex data sets and thus facilitate the work of human specialists.

Supervised learning in industry

In industry, supervised learning is used, among other things, in the field of predictive maintenance. The aim is to determine when maintenance is necessary in order to prevent failures. For example, motors can be monitored with sensor data and condition indicators in such a way that wear is measured and the remaining useful life is calculated. In addition to the data collected, the model used for this purpose then provides information on possible maintenance work in order to increase the service life and prevent failures.

What are the differences between supervised learning and unsupervised learning?


The crucial difference between supervised and unsupervised learning lies in the training data. In supervised learning, the input data is labelled and paired with matching output data. In unsupervised learning, there is only input data, without labels and without the matching solutions. Thus, a model must be developed by recognising patterns in the data itself. This is why it is called unsupervised learning: there is no "teacher" who has the right answers.


The objectives of the two learning approaches also differ: in supervised learning, the type of output is already known and must be predicted for new, unknown input data. In unsupervised learning, the goal is to gain knowledge from a large amount of new data. No specific output is predicted, which often makes the training procedure very complex.


The application areas of supervised and unsupervised learning also differ. Supervised learning is used for classification and regression on labelled data sets. Applications such as text and image recognition or price and weather prediction are among its most common uses.
Unsupervised learning, on the other hand, works with clustering and associations, for example to detect anomalies, predict customer behaviour or remove noise from a data set.


Inevitably, the algorithms used in supervised and unsupervised learning differ because different tasks need to be accomplished with them. The most commonly used algorithms for classification in supervised learning are decision trees, Random Forest, linear classifiers, the naive Bayes classifier, the k-nearest-neighbour classifier and the support vector machine. Linear and polynomial regression are used for regression analyses, while logistic regression, despite its name, is mostly used for classification.
In unsupervised learning, K-Means clustering and hierarchical clustering, for example, are used for clustering. For association problems, the Apriori and the Eclat algorithm, among others, are used.
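For contrast with the supervised case, the following is a minimal sketch of K-Means clustering on one-dimensional, made-up data. Note that no labels are passed in: the algorithm only sees the input values and groups them itself. The starting centroids and the fixed number of iterations are assumptions made to keep the example short.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and updating centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabelled input data only; there are no "correct answers" to learn from.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```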

What is Semi-Supervised Learning?

Semi-supervised learning is a mixture of supervised and unsupervised learning and combines labelled and unlabelled data sets for training. It is used when a large amount of data is available but only a small part of it is labelled. The algorithm is first trained with the labelled data, as in supervised learning. Once the model performs well, it is used to predict the outputs for the remaining unlabelled data and to label that data with the predicted solutions. After that, training with the complete data set of labelled and "pseudo-labelled" data is possible.
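
The pseudo-labelling procedure described above can be sketched as three steps. This is only an illustration: the data is made up, and the model is a very simple nearest-centroid classifier (it predicts the class whose mean feature value is closest), chosen here for brevity. `train_centroids` and `predict` are hypothetical helper names.

```python
def train_centroids(xs, ys):
    """Return the mean feature value per class (a nearest-centroid model)."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


# A small labelled set and a set of unlabelled inputs (weights in kg, made up).
labelled_x = [4.0, 5.0, 25.0, 30.0]
labelled_y = ["cat", "cat", "dog", "dog"]
unlabelled_x = [3.5, 4.5, 22.0, 28.0]

# Step 1: train on the labelled data only, as in supervised learning.
model = train_centroids(labelled_x, labelled_y)

# Step 2: use that model to assign pseudo-labels to the unlabelled data.
pseudo_y = [predict(model, x) for x in unlabelled_x]

# Step 3: retrain on labelled and pseudo-labelled data together.
final_model = train_centroids(labelled_x + unlabelled_x, labelled_y + pseudo_y)
print(pseudo_y)
print(final_model)
```

In practice, pseudo-labels can be wrong, so it is common to keep only predictions the model is confident about; that filtering step is omitted here.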

What are the differences between supervised learning and reinforcement learning?

Learning principle and data

The principle of reinforcement learning is fundamentally different from that of supervised learning. While in supervised learning the training data already contains the answer, in reinforcement learning there is no predetermined correct answer. The agent trained by reinforcement learning decides for itself how to proceed and learns only through its own experience. Accordingly, it must find suitable actions to maximise its reward and solve the given task. In principle, the agent learns through trial and error: by making mistakes and not repeating them, it steadily improves and finds an appropriate solution.
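
The trial-and-error principle can be illustrated with a very small example: an epsilon-greedy agent on a multi-armed bandit, one of the simplest reinforcement learning settings. The agent is never told which action is correct; it only observes the reward of the action it tried. The reward values, the fixed (noise-free) rewards and all names below are assumptions made for this sketch.

```python
import random


def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Learn action values by trial and error with an epsilon-greedy strategy."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # the agent's current value estimate per action
    counts = [0] * len(rewards)       # how often each action has been tried
    for _ in range(steps):
        untried = [a for a, c in enumerate(counts) if c == 0]
        if untried:
            action = untried[0]                      # try every action at least once
        elif rng.random() < epsilon:
            action = rng.randrange(len(rewards))     # explore: pick a random action
        else:
            action = max(range(len(rewards)), key=lambda a: estimates[a])  # exploit
        reward = rewards[action]  # simplification: rewards are fixed, without noise
        counts[action] += 1
        # Incremental mean: move the estimate towards the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical environment: three actions with rewards unknown to the agent.
estimates, counts = run_bandit([0.2, 1.0, 0.5])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(estimates, counts, best_action)
```

The agent ends up preferring the action with the highest reward purely from its own experience, without ever having seen a labelled example.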

Aim and applications

As already described, supervised learning can be used for classification and prediction. The goal is clearly defined, for example in the detection of spam emails.
Reinforcement learning can be used for much more complex tasks, for example when the agent can only learn through interactions with its environment. This is the case, for example, when learning board games such as chess, Go and shogi. Well-known AI systems that have mastered such games, in part through reinforcement learning, are AlphaGo and AlphaZero from Google DeepMind.

Patrick Kinter

Pat has been responsible for Web Analysis & Web Publishing at Alexander Thamm GmbH since the end of 2021 and oversees a large part of our online presence. In doing so, he works his way through every Google or WordPress update and is happy to give the team tips on how to make their articles or websites even more comprehensible for readers as well as search engines.

