An Introduction to Shot-based Learning 

16 September 2024 | Basics

If you have studied artificial intelligence and machine learning, you probably know that obtaining good training data is one of the main problems in creating good models. And in many cases, we need a lot of training data to create models that generalise well, i.e. models that work well both during training and on new examples once the model is deployed.

But what if you could train a model with just a few examples? Or even with just one example? Or without any new training examples at all? Under the right conditions, you can.

In this article, I explain a set of techniques called shot-based learning that allow us to create accurate models with minimal training data.

What is shot-based learning? 

To put it simply, shot-based learning is an approach to training ML and AI models in which we use a very small number of examples - which we call "shots" - to train the model (more precisely, the number of "shots" is the number of training examples per class).

Within this approach to model training, there are specific types of shot-based learning, such as zero-shot, one-shot and few-shot. These names refer to the exact number of examples we use to train the model, although in zero-shot learning we don't actually use any new examples at all. Instead, we rely on the model's existing capabilities to generalise to new, previously unseen classes. I will write more about zero-shot, one-shot and few-shot learning a little later in this article.

Why do we need shot-based learning?

In traditional machine learning applications, we often need a large quantity of training data. Even very simple methods such as linear regression and logistic regression frequently require substantial amounts of it, and the more complex the problem, the more training data is usually required.

With an insufficient amount of data, machine learning models often run into problems such as overfitting, where a model performs very well during training but poorly on new test examples or in production (i.e. a lack of data usually leads to a lack of generalisation).

Obtaining a sufficient amount of high-quality, well-labelled data is therefore often one of the main problems in the development of ML and AI systems.

But what if you could train a model with just a few examples?

Or with just a single example?

Or even without new examples? 

No new training examples?!

Well, it is possible under the right conditions.

... if you use shot-based learning.

Shot-based learning allows you to train a model with limited data. Essentially, shot-based learning enables the creation of models that generalise well even when trained on minimal data.

Shot-based learning is therefore particularly useful in situations where high-quality, labelled data is expensive, scarce or time-consuming to obtain. In medical diagnostics, for example, we can use shot-based learning to train models to recognise rare diseases for which we only have data from a handful of cases. Or in natural language processing (NLP), we can imagine a situation where we are trying to analyse a rare language or dialect for which there are only a few training examples. In such cases, we can use shot-based learning to train models that perform accurately even with very limited data.


Types of shot-based learning 

As already described, there are three main types of shot-based learning:

  1. Few-shot learning
  2. One-shot learning
  3. Zero-shot learning
Framework | Features
Few-shot  | Usually requires 2-10 examples per class. Best used in situations where there are few data examples, but we may be able to get a handful of high-quality labelled ones.
One-shot  | Requires 1 training example per class. Best suited for tasks where it is extremely difficult to obtain multiple examples and we only have a single example on which to train the model.
Zero-shot | Requires 0 new training examples. Used when a model is applied to tasks for which it has not been explicitly trained.

Overview of the three main types of shot-based learning

Few-shot learning

With few-shot learning, a model is trained with a small number of examples for each class.

Few-shot learning typically uses between 2 and 10 examples per class. To make this work, we usually need to rely on meta-learning techniques or prior knowledge within the model so that it can generalise effectively.

Few-shot learning is best suited to situations where there are only a few examples of data, but perhaps a handful of high-quality labelled examples, e.g. in medical diagnostics for a rare disease.
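
To make this concrete, here is a minimal sketch of one common few-shot approach: nearest-prototype classification in the style of prototypical networks. The `embed` function is a hypothetical stand-in for a pretrained embedding model; in a real system it would be a network trained with meta-learning:

```python
import numpy as np

def embed(x: np.ndarray) -> np.ndarray:
    # Stand-in for a pretrained embedding model (e.g. a CNN or a
    # sentence encoder); here just a fixed random projection.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((x.shape[-1], 16))
    return x @ W

def prototypes(support_x, support_y):
    # One prototype per class: the mean embedding of its "shots".
    embs = embed(support_x)
    return {c: embs[support_y == c].mean(axis=0) for c in np.unique(support_y)}

def classify(query_x, protos):
    # Assign each query to the class with the nearest prototype.
    embs = embed(query_x)
    classes = list(protos)
    dists = np.stack([np.linalg.norm(embs - protos[c], axis=1) for c in classes])
    return [classes[i] for i in dists.argmin(axis=0)]

# 3-way, 2-shot toy problem: two labelled examples per class.
rng = np.random.default_rng(1)
support_x = rng.standard_normal((6, 8)) + 3 * np.repeat(np.arange(3), 2)[:, None]
support_y = np.repeat(np.arange(3), 2)
query_x = rng.standard_normal((3, 8)) + 3 * np.arange(3)[:, None]
print(classify(query_x, prototypes(support_x, support_y)))
```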

One-shot learning

One-shot learning goes one step further and trains the model on only one example per class.

Despite the limited amount of data, in one-shot learning the model still needs to be able to correctly (i.e. accurately) classify new examples, and we often need to use specialised techniques to achieve this (which I will explain later in this article).

One-shot learning is important for tasks where it is extremely difficult to obtain multiple examples and where we only have a single example to train the model on. Face recognition is an example of a task where we can use one-shot learning.
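
As a rough illustration, the sketch below enrols each person with a single reference photo and identifies new faces by embedding similarity, in the spirit of siamese-network-based face recognition. The `embed` function is a toy stand-in for a real face-embedding network:

```python
import numpy as np

def embed(image: np.ndarray) -> np.ndarray:
    # Toy stand-in for a face-embedding network (e.g. one trained with
    # a siamese/triplet objective); real systems map a face image to a
    # vector where same-identity faces lie close together.
    v = image.flatten().astype(float)
    return v / (np.linalg.norm(v) + 1e-9)

def identify(query, gallery, threshold=0.9):
    # gallery: name -> embedding of that person's single reference image.
    q = embed(query)
    best_name, best_sim = "unknown", threshold
    for name, ref in gallery.items():
        sim = float(q @ ref)  # cosine similarity (embeddings are unit-norm)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name

# One reference image (one "shot") per person is enough to enrol them.
rng = np.random.default_rng(0)
alice, bob = rng.standard_normal((2, 8, 8))
gallery = {"alice": embed(alice), "bob": embed(bob)}
print(identify(alice + 0.05 * rng.standard_normal((8, 8)), gallery))  # alice
```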

Zero-shot learning

Finally, zero-shot learning takes the concept of shot-based learning to the extreme: no instances of the target class are used to train the model at all.

In other words, in zero-shot learning the model first learns classes and the relationships between them from an initial data set, using techniques such as pre-training (to learn general features, attributes and relationships in the data space) or knowledge transfer from related tasks. These pre-training and knowledge transfer techniques then help the model to generalise to new classes that were not included in the training data.

For example, we could train a computer vision model on dogs, cats and wolves and then try to use the common features learned by this model (such as legs, ears, fur, etc.) to generalise to foxes. This can work because the features learned from the training data (legs, ears, fur) are present in both the training classes (dogs, cats, wolves) and the new, previously unseen class (foxes). In terms of applications, we often see zero-shot learning in tasks such as NLP, where it enables a model to handle words or concepts without explicit training examples for them.
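
In practice, a popular way to try zero-shot classification in NLP is the Hugging Face transformers zero-shot pipeline, which reuses a model fine-tuned on natural language inference to score arbitrary candidate labels. A minimal sketch (the example text and labels are made up for illustration):

```python
# pip install transformers torch
from transformers import pipeline

# An NLI model fine-tuned on MNLI can score candidate labels it was
# never explicitly trained to classify.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The new update drains my battery within two hours.",
    candidate_labels=["battery life", "screen quality", "shipping"],
)
print(result["labels"][0])  # the label with the highest score
```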

Applications of shot-based learning

Now that we have discussed what shot-based learning is and what techniques we can use to implement it, let's look at some high-level applications.

There are a variety of ways to use shot-based learning, but we'll look at a few specific applications in business and industry, namely in:

  • Healthcare
  • Marketing
  • Customer service
  • Industry and manufacturing
  • Natural language processing
  • Finance

Healthcare: Identification of rare diseases

One of the most important applications of shot-based learning is in the field of medical imaging and diagnostics.

In particular, we can use shot-based learning to identify rare diseases for which there is little data and few examples on which to train a model. In this area, we can use few-shot techniques to train classification models to accurately identify specific medical conditions. This use of few-shot learning improves diagnostics and our ability to recognise rare diseases when little data is available.

Marketing: Targeting niche groups and new trends

In marketing, we can use shot-based learning to address niche customer groups or to initiate new marketing campaigns for new trends.

Let's consider the case where we use market segmentation to target smaller customer subgroups. If a particular segment is very small, a lack of data may make it difficult to create predictive models that help us market to it. In such a situation, shot-based learning allows us to create predictive models or apply other artificial intelligence (AI) techniques even with limited data.

In the case of a new market trend, there may be very little data available due to the novelty of the trend. In this case too, we can use shot-based learning to make predictions about how we can market this new trend to customers, even with limited data.

Customer service: Adapting to new service requests

Since the advent of large language models (LLMs) a few years ago, there has been a trend towards automating customer service with LLM-based chatbots or digital assistants. These chatbots and digital assistants may encounter new or unique service requests from customers that were not included in the model's training data.

In such a scenario, zero-shot and few-shot techniques can help these models adapt to new problems and customer questions without the need for extensive retraining. This improves the responsiveness and flexibility of these automated customer support tools.

Industry and manufacturing: Detection of anomalies and defects

In industry, we can use shot-based learning to recognise anomalies and defects and to support predictive maintenance.

In particular, you can use few-shot techniques to detect critical events such as machine failures. Such a use case could facilitate the early detection of problems, which in turn could reduce factory downtime, increase safety and improve operational efficiency.

Natural language processing: translation and sentiment analysis

Shot-based learning has proven increasingly useful in natural language processing (NLP) applications such as sentiment analysis, text classification and translation.

In this environment, few-shot, one-shot and zero-shot techniques allow NLP systems to perform such tasks accurately with limited training data, for example for rare languages and dialects.

Finance: Fraud detection

In finance, we can use shot-based techniques for tasks such as fraud detection and risk analysis.

For example, some types of financial or transaction fraud may be extremely rare, making them difficult to detect using traditional analysis methods. In such a task with limited training data, we can use shot-based methods to create more accurate models that generalise from a very small set of training examples.


Challenges when using these frameworks

Finally, I would like to briefly discuss some of the challenges and considerations that need to be taken into account when using shot-based techniques.

The most important areas I will cover here are:

  • Data scarcity and quality
  • Model generalisation
  • Evaluation and test procedures

Data scarcity and quality

One of the biggest problems with shot-based learning lies in the nature of the problem itself: the scarcity and quality of the data.

Shot-based methods inherently work with limited data, and as mentioned earlier, this is because we use shot-based methods in situations where data examples are rare, expensive or difficult to obtain.

Therefore, at least in the case of one-shot or few-shot models, we need to ensure that we can obtain data samples of sufficient quality to build these models.

Model generalisation

With shot-based techniques, we often have considerable problems with model generalisation due to the small number of training examples.

With only a few examples per class to train on (or even zero examples), the model may have difficulty generalising. In other words, even if you can train the model with a few examples, it may perform poorly once deployed. This problem is usually referred to as "overfitting": the model performs well on the training data but much worse (i.e. it fails to generalise) on new data.

To better generalise a model, we can use techniques such as meta learning, data augmentation and transfer learning. With these techniques, models can learn and adapt better, even for tasks with limited data. 
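
As a small sketch of one of these techniques, transfer learning, the snippet below freezes a backbone pretrained on a large dataset and fine-tunes only a new classification head on the few available examples. The random tensors are hypothetical stand-ins for a real few-shot dataset:

```python
# pip install torch torchvision
import torch
import torch.nn as nn
from torchvision import models

# Reuse a backbone pretrained on ImageNet; freeze its general-purpose
# features and train only a small new head for our 3 classes.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in backbone.parameters():
    p.requires_grad = False
backbone.fc = nn.Linear(backbone.fc.in_features, 3)  # new trainable head

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy "few-shot" batch: 6 images, 2 per class.
x = torch.randn(6, 3, 224, 224)
y = torch.tensor([0, 0, 1, 1, 2, 2])
for _ in range(5):  # a few gradient steps on the head only
    optimizer.zero_grad()
    loss = loss_fn(backbone(x), y)
    loss.backward()
    optimizer.step()
```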

Evaluation and test procedures

Model validation and testing pose a particular challenge in shot-based learning due to the limited amount of data.

With traditional machine learning methods, we often have large data sets available for model validation and testing. With shot-based learning, however, the data is so limited that these traditional validation and testing tools are essentially unavailable.

With shot-based learning, it is therefore usually much more difficult to evaluate the performance of the model and recognise overfitting. To mitigate this problem during evaluation and testing, we often need to use techniques such as cross-validation or few-shot benchmarking.
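
For example, with only a dozen labelled examples, leave-one-out cross-validation squeezes the most evaluation signal out of the data: each example is held out once while the model trains on all the others. A small sketch with scikit-learn, where synthetic data stands in for a real few-shot dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Tiny synthetic dataset standing in for scarce real-world examples.
X, y = make_classification(n_samples=12, n_features=5, random_state=0)

# Each of the 12 examples is used once as the test set.
scores = cross_val_score(LogisticRegression(), X, y, cv=LeaveOneOut())
print(f"LOOCV accuracy: {scores.mean():.2f} over {len(scores)} folds")
```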


Maximum model performance with just a little data

In this article, I have tried to give you a comprehensive overview of shot-based learning: what shot-based learning is; different types of shot-based learning (such as few-shot, one-shot and zero-shot); different applications in areas such as marketing, healthcare and finance; and some of the challenges of shot-based learning.

Ultimately, shot-based learning is a powerful tool in our AI development toolkit that allows us to build accurate models even when data is scarce, and in turn helps us solve difficult problems when training examples are rare or expensive to obtain.

Author

Patrick

Pat has been responsible for web analysis and web publishing at Alexander Thamm GmbH since the end of 2021 and oversees a large part of our online presence. He battles his way through every Google or WordPress update and is happy to give the team tips on how to make articles and websites more comprehensible for readers and search engines alike.
