Transfer Learning in Machine Learning: Simply explained

2 October 2024 | Basics

Transfer learning has developed into a highly valuable method in machine learning: it makes it possible to transfer knowledge from an already learnt task to a new, often related task. This offers a significant advantage, especially when only limited data or computational resources are available for the target task. Instead of training a model from scratch, transfer learning uses pre-trained models to achieve better results faster and more efficiently. This technique has proven itself in various fields and is revolutionising the way machine learning models are developed and applied.

What is Transfer Learning?

Transfer learning describes a machine learning technique in which a model that has been pre-trained on a specific task is applied to a new but related task. The aim is to reuse the knowledge the model has learnt in the source task for the target task and thus accelerate and improve the learning process. This not only saves computing time, but also enables use in areas where data is scarce or difficult to collect. Instead of training a model from scratch, an existing model is built upon in order to achieve better results more quickly.

Transfer learning vs. fine-tuning

Fine-tuning can be distinguished from transfer learning, but it can also be considered a special type of transfer learning in which a pre-trained model is adapted to a specific task. While transfer learning often reuses the entire source model (or parts of it) unchanged, fine-tuning continues training and optimising the model, typically by updating only certain layers (e.g. the last layers of a neural network) for the target task. The model is "tuned" to the new task, which is often necessary if the nature of the target task or its data differs slightly from the original task.
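To make the idea concrete, the following minimal sketch shows what this could look like in PyTorch with a torchvision model; the choice of ResNet-50 and the two-class target task are only illustrative placeholders, not a prescribed recipe:

```python
# Minimal fine-tuning sketch (assumes PyTorch and torchvision are installed).
import torch
import torch.nn as nn
from torchvision import models

# Load a model pre-trained on ImageNet (the source task).
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze all pre-trained layers ...
for param in model.parameters():
    param.requires_grad = False

# ... and replace the final layer with a new head for the (hypothetical) target task.
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head is updated during training - the model is "tuned" to the new task.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```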


In our article, we analyse how the CLIP model performs after fine-tuning to specific datasets compared to traditional models such as ResNet50.

Fine-Tuning the CLIP Foundation Model for Image Classification

Transfer Learning vs. Domain Adaptation

As with the fine-tuning described above, domain adaptation shows certain differences from transfer learning, but also similarities and possible combinations with it. In domain adaptation, the model is used in a scenario in which the source and target domains (i.e. the data distributions) differ. In contrast to traditional transfer learning, where the tasks are similar or related, the characteristics of the data can differ greatly between the two domains. The aim is for a model that has been trained on the source domain to adapt to the target domain without extensive retraining.
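One simple illustration of this idea, sketched below under the assumption of PyTorch/torchvision, is to keep the weights of a source-trained network and merely re-estimate its batch-normalisation statistics on unlabelled target-domain data (a trick often referred to as AdaBN); the random tensors only stand in for a real target-domain data loader:

```python
# Hedged sketch of a simple domain-adaptation step: keep the source-trained
# weights, re-estimate only the BatchNorm statistics on target-domain data.
import torch
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # source-trained model

# Placeholder for unlabelled batches from the target domain (e.g. a new camera or clinic).
target_batches = [torch.randn(8, 3, 224, 224) for _ in range(4)]

model.train()                      # BatchNorm layers update their running statistics
with torch.no_grad():              # no gradient step: the learnt weights stay unchanged
    for images in target_batches:
        model(images)

model.eval()                       # inference now uses target-domain statistics
```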

Types of transfer learning

There are several types of transfer learning that differ in their approach, application, type of data, tasks and models:

Supervised transfer learning refers to the transfer of knowledge between two tasks (source and target task), both of which use labelled data, i.e. data assigned to a category or class. A model is first trained on a large amount of labelled data for a source task. This pre-trained model is then applied to a target task that also has labelled data. The advantage of this method is that the model has already learnt generalised features that are also relevant for the target task.

A classic example is the use of pre-trained image classification models that have been trained on large data sets and then transferred to specific image processing tasks such as the detection of medical anomalies.
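A common way to implement this, sketched here under the assumption of PyTorch, torchvision and scikit-learn, is to reuse the pre-trained network as a feature extractor and train a small classifier on the labelled target data; the random tensors merely stand in for a real labelled medical image set:

```python
# Hedged sketch: supervised transfer via feature extraction plus a small classifier.
import torch
from torchvision import models
from sklearn.linear_model import LogisticRegression

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()          # expose the 512-dimensional feature vector
backbone.eval()

# Placeholder for a labelled target dataset, e.g. "anomaly" vs. "normal" X-ray crops.
images = torch.randn(20, 3, 224, 224)
labels = torch.randint(0, 2, (20,))

with torch.no_grad():
    features = backbone(images).numpy()    # generalised features learnt on the source task

clf = LogisticRegression(max_iter=1000).fit(features, labels.numpy())
```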


For more information on supervised machine learning, read our basic article for beginners and experts:

Supervised Learning: Clearly Explained

In unsupervised transfer learning, a model is trained on unlabelled data for a source task and then applied to a target task that also contains unlabelled data, i.e. data not assigned to a category or class. This approach is used when no or very little labelled data is available. The model can learn general patterns or structures in the data, which are then useful in the target task.

One example is the application of word embeddings, which can be trained on huge amounts of unlabelled text and then used for various tasks in natural language processing.
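As a rough sketch of this pattern, assuming the gensim library, embeddings can be learnt from unlabelled, tokenised text and then reused as features elsewhere; the two-sentence corpus below is only a placeholder for a large text collection:

```python
# Hedged sketch: learn word embeddings on unlabelled text, reuse them downstream.
from gensim.models import Word2Vec

# Placeholder for a large unlabelled, tokenised corpus.
unlabelled_corpus = [
    ["transfer", "learning", "reuses", "knowledge"],
    ["embeddings", "capture", "word", "meaning"],
]
emb = Word2Vec(sentences=unlabelled_corpus, vector_size=50, min_count=1, epochs=50)

# The learnt vectors can now serve as input features for a target task.
vector = emb.wv["transfer"]                      # 50-dimensional representation
neighbours = emb.wv.most_similar("learning", topn=2)
```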


Unsupervised machine learning is a powerful tool for gaining valuable insights from data. Read our basic article to find out which algorithms are used to perform tasks such as anomaly detection or data generation.

Unsupervised Learning: Clearly Explained

Semi-supervised transfer learning combines labelled and unlabelled data. Here, a model is first trained on a large amount of unlabelled data and then fine-tuned on a smaller amount of labelled data. This approach is useful when only a limited amount of labelled data is available for the target task, but large amounts of unlabelled data are available.

One example is image classification, where unlabelled images are used to learn general visual patterns, which are then refined using a small amount of labelled data for the target task.
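One possible shape of this two-step procedure, sketched in PyTorch with placeholder dimensions (784-dimensional inputs, a 10-class head) and with the training loops elided, could look like this:

```python
# Hedged sketch: unsupervised pre-training (autoencoder), then supervised fine-tuning.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
decoder = nn.Sequential(nn.Linear(128, 784))

# Step 1: pre-train on a large unlabelled set via a reconstruction loss.
autoencoder = nn.Sequential(encoder, decoder)
reconstruction_loss = nn.MSELoss()
# ... train `autoencoder` on the unlabelled images ...

# Step 2: reuse the encoder and fine-tune with a small labelled set.
classifier = nn.Sequential(encoder, nn.Linear(128, 10))
classification_loss = nn.CrossEntropyLoss()
# ... train `classifier` on the few labelled images ...
```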

In multi-task transfer learning, a model is trained simultaneously on several related tasks. The model divides its capacity between the different tasks, learning from each of them and combining this knowledge to improve overall performance. The transfer of knowledge between these tasks often leads to better generalisation.

A typical example is the simultaneous implementation of object recognition and image segmentation in computer vision, where the model can benefit from the overlapping visual information.
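A minimal sketch of such a shared model in PyTorch, with a hypothetical five-class classification head and a per-pixel segmentation-style head, might look as follows:

```python
# Hedged sketch: one shared backbone, two task-specific heads.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(16, 5))       # hypothetical object classes
        self.seg_head = nn.Conv2d(16, 1, 1)                   # per-pixel mask logits

    def forward(self, x):
        features = self.backbone(x)                            # shared representation
        return self.cls_head(features), self.seg_head(features)

model = MultiTaskNet()
class_logits, mask_logits = model(torch.randn(1, 3, 64, 64))
# The training loss would combine both tasks, e.g. loss_cls + loss_seg.
```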

Zero-shot transfer learning is a method in which a model is applied to tasks for which it has not seen any specific data during training. The model uses the general understanding it has learnt from the source task to master the target task, even though no direct examples are available. This technique is mainly used in natural language processing and computer vision.

An example would be a model trained to generate general image descriptions that is nevertheless able to recognise objects it has never seen during training.
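A hedged sketch of zero-shot classification with the CLIP model, using the Hugging Face transformers library; the blank placeholder image and the candidate labels are purely illustrative:

```python
# Hedged sketch: zero-shot image classification with CLIP (no task-specific training).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["a photo of a cat", "a photo of a dog", "a photo of a zebra"]
image = Image.new("RGB", (224, 224))   # placeholder for a real input image

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
logits = model(**inputs).logits_per_image      # similarity of the image to each label
probs = logits.softmax(dim=-1)                 # highest probability = predicted label
```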

Related to zero-shot transfer learning is few-shot transfer learning. This method aims to adapt a model with only a few examples of the target task. It is useful when only a small amount of training data is available. Thanks to pre-training on a large source task, the model can generalise well even from a few examples of the target task.

One example is the classification of rare diseases where only a few medical images are available, but the model is supported by networks pre-trained on other medical images.
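One simple few-shot recipe, sketched here with PyTorch/torchvision, is to use a frozen pre-trained backbone as a feature extractor and classify new images by their nearest class centroid; the random tensors stand in for a tiny support set of three classes with five images each:

```python
# Hedged sketch: few-shot classification via class prototypes in feature space.
import torch
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()              # keep the 512-d features, drop the head
backbone.eval()

support_images = torch.randn(3, 5, 3, 224, 224)   # 3 classes x 5 labelled examples
query_image = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    features = torch.stack([backbone(c) for c in support_images])     # (3, 5, 512)
    prototypes = features.mean(dim=1)                                  # one centroid per class
    query_feature = backbone(query_image)
    prediction = torch.cdist(query_feature, prototypes).argmin(dim=1)  # nearest prototype
```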

Benefits and challenges

Benefits                                    | Challenges
Less data required                          | Relevance of the source and target tasks is critical
Faster training times                       | Adapting the model to the target task can be difficult
Utilisation of knowledge already gained     | Possible overfitting to the source task
Better performance with limited resources   | Lack of flexibility with large data differences
Comparison of the benefits and challenges of transfer learning

Benefits

  • By using a pre-trained model, less data is often required to achieve good results.
  • Transfer learning speeds up the training process considerably, as pre-trained layers are reused.
  • It enables the use of domain knowledge learnt in similar tasks and often leads to better results with limited resources.

Challenges

  • One of the biggest challenges is to ensure sufficient similarity between the source and target tasks.
  • In some cases, the model needs to be heavily customised to fit the new task, leading to a more complex fine-tuning process.
  • Models that are too closely customised to the source task may have difficulty transferring to a new task.

Large Language Models (LLMs) increase efficiency and productivity. Discover in our blog post how LLMs can optimise processes and offer your company real added value:

Large Language Models: Use Cases for Businesses

Areas of application of Transfer Learning

Transfer learning has become one of the most important tools in machine learning, particularly where large amounts of data or long training times pose a challenge. It is often used to improve the performance of models and speed up the training process by transferring previously learnt knowledge structures to new, related tasks. Two of the most prominent application areas of transfer learning are computer vision and natural language processing (NLP).

Computer Vision

Typical computer vision tasks are image classification, object recognition and image segmentation. Training deep neural networks from scratch for these tasks would often require huge amounts of labelled data and extensive computational resources, which can be a major challenge in practice. Transfer learning is therefore often used in computer vision by applying pre-trained models to specific image processing tasks. These models have already been trained on huge image datasets containing millions of images and thousands of classes. The models have already learnt common visual features, such as edges, shapes and textures, which are also useful in many other visual tasks.

The process of transfer learning in computer vision is usually as follows: a pre-trained model is used as the basis. For the target task, such as the recognition of certain medical anomalies in X-ray images, the last layers of the network are replaced by new layers that are specially adapted to the target task. These new layers are trained while the remaining layers of the pre-trained model remain unchanged or are only minimally adapted. If the target task differs greatly from the original task, some of the deeper layers of the model can be further trained as part of fine-tuning to better fit the new data.
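Sketched in PyTorch/torchvision, and with the two-class X-ray task used only as a placeholder, this workflow could look roughly like this; the separate learning rates reflect the idea that new layers learn quickly while pre-trained layers are adapted cautiously:

```python
# Hedged sketch of the workflow described above: new head, frozen backbone,
# optional cautious fine-tuning of the deepest block.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

for param in model.parameters():
    param.requires_grad = False                   # keep general visual features fixed

model.fc = nn.Linear(model.fc.in_features, 2)     # hypothetical task: anomaly vs. normal

# If the target task differs strongly, also adapt the deepest block.
for param in model.layer4.parameters():
    param.requires_grad = True

optimizer = torch.optim.Adam([
    {"params": model.fc.parameters(), "lr": 1e-3},       # new layers learn fast
    {"params": model.layer4.parameters(), "lr": 1e-5},   # pre-trained layers change slowly
])
```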

The main advantages of using transfer learning in computer vision are the savings in time and resources as well as good generalisation. By using pre-trained models, training time can be shortened and general visual knowledge can be applied to specialised areas.

Natural Language Processing

Another area of application is natural language processing. NLP deals with the processing and analysis of natural language by machines, and typical tasks include text classification, machine translation, text summarisation and sentiment analysis. Traditionally, NLP models required extensive training data sets to work well. However, the introduction of pre-trained language models has fundamentally changed the approach to NLP tasks. These models are trained on huge text corpora to learn general language patterns that can then be applied to specific tasks.

The process of transfer learning in NLP is similar to that in computer vision, but includes some NLP-specific techniques. The starting point is a model that has been trained on extensive text corpora. The pre-trained model is then adapted to a specific NLP task. During fine-tuning, the language representations the model has already learnt are reused and further optimised for the specific task. This is usually done by training on a data set that is relevant to the target task.
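As a hedged sketch of this adaptation step, using the Hugging Face transformers library with a placeholder model name and a two-example toy dataset:

```python
# Hedged sketch: fine-tuning a pre-trained language model for text classification.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2      # pre-trained encoder + new classification head
)

texts = ["great product", "very disappointing"]  # tiny placeholder for the target dataset
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, return_tensors="pt")
outputs = model(**batch, labels=labels)          # forward pass on the target task
outputs.loss.backward()                          # gradients adapt both encoder and new head
```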

The advantages result from the efficient use of the language knowledge that the pre-trained models have already acquired, as well as from the ability to work with smaller, task-specific data sets.


The natural, spoken language of humans is the most direct and easiest way to communicate. Learn how machines and algorithms use NLP in innovative ways:

Natural Language Processing (NLP): Natural language for machines

Achieve better results faster with transfer learning

Transfer learning makes it possible to transfer previously learnt knowledge structures of a model to new tasks, which means that less training data and computing resources are required. Different approaches, such as supervised, unsupervised and semi-supervised transfer learning, offer flexible application options for different tasks and data availability. Despite the many advantages, such as time savings and improved model performance, there are challenges such as adapting to different domains or data distributions. Transfer learning is widely used, especially in areas such as computer vision and NLP, where it accelerates the development of powerful models.

Author

Patrick

Pat has been responsible for Web Analysis & Web Publishing at Alexander Thamm GmbH since the end of 2021 and oversees a large part of our online presence. In doing so, he battles his way through every Google or WordPress update and is happy to give the team tips on how to make their articles or their own websites even more comprehensible for readers as well as search engines.
