LLM Explainability: Why the "why" is so important

28 February 2024 | Tech Deep Dive

"Explainable AI" (XAI) is crucial for understanding the decision logic of highly complex AI models such as LLMs.  

Suppose you have applied for a loan from a bank. In a conversation with the bank representative, you are informed that your loan application has been rejected. The reason for this is that, based on his experience, the representative does not consider you to be a reliable borrower. Would you be satisfied with this explanation? Probably not. Especially in a context where decisions are made on the basis of your personal data, you would demand a reasonable explanation. 

Let us now assume that the representative makes the decision with the help of an algorithm that came to this conclusion on the basis of your data. This means that the bank representative himself does not even know the actual reason for your rejection. Would that be acceptable? Certainly not! Until recently, however, the use of AI algorithms has led to exactly such scenarios playing out in many areas. 

Complex AI algorithms are often fed with personal data and deliver important decisions without the user, or even the developer of the algorithm, being able to explain how those decisions were made: the artificial intelligence has simply come to a conclusion. The reasoning behind an AI algorithm's decision is often buried in an incredibly complex cascade of binary yes-no operations that is initially opaque to humans. Making this process understandable and describable is an important challenge for users and developers of AI models, because AI applications are becoming an increasingly important part of our everyday lives.  

LLMs, legislation and the need for explainability 

Large Language Models (LLMs), such as the famous ChatGPT from OpenAI or its German counterpart Luminous from Aleph Alpha, are very current and prominent examples of complex AI systems. They are based on gigantic mathematical models that are optimised with large amounts of data over weeks and months. The result is the breathtaking language understanding and production capabilities that have catapulted LLMs to the centre of public and commercial attention.  

Given the almost infinite possibilities that these new models open up, current AI systems are capable of disrupting previously established business processes. However, the lack of a clear explanation as to why an AI model gives a certain answer to a prompt can limit the number of possible applications.  

This has been a problem for years for AI applications in highly regulated industries such as finance and medicine. However, the problem will also apply to more and more industries as soon as the European "AI Act" comes into force.  

With the AI Act, the EU is responding to the technical achievements of these complex AI models and the socio-economic challenges they pose. At the same time, the regulation is intended to set the course for the future of digitalisation in Europe and become a driver of innovation for a globally active "AI made in Europe".  

An integral part of this new legislation is that AI algorithms used in connection with personal data must make comprehensible decisions. This is intended to ensure protection against the misuse of personal data. 

XAI - Explainable AI in LLMs 

This raises the question of how organisations can use cutting-edge AI technology in areas such as finance, medicine or human resources while complying with strict and important regulations. The field of research that seeks to answer this question is called "Explainable AI", or XAI for short. In XAI, researchers are trying to find ways to understand the decision-making logic of highly complex models such as LLMs. The methods used to achieve this are as varied as the different AI models. In general, they fall into two major approaches: 

  • Global explainability attempts to depict and interpret an entire model. The aim is to understand the general decision-making logic of the model. 
  • Local explainability analyses the decision of a model for a specific input in order to understand how this specific input leads to the corresponding output. 

In scenarios like the one outlined above, we try to explain why a model has given a certain response to a certain prompt. We are therefore interested in local explainability. How would we go about creating this for a large and complex language model? The most direct route would be to build models that are simple enough for us to understand their responses directly and avoid the problem altogether. While this is certainly possible for many simpler AI applications, the size and complexity of LLMs rule this approach out. An alternative way to create explainability for LLMs is the use of perturbation-based methods. The idea is that we first record the input to a model and its answers. We can then experiment with different variations of the input and observe the effects these have on the answers, as sketched below.
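As a rough illustration of this idea, the following sketch perturbs a prompt by leaving out individual parts and records how much the black-box model's answer changes. The `generate` callable and the simple word-overlap score are stand-ins for whatever model access and comparison metric are available; they are assumptions for the example, not part of any specific library.

```python
from typing import Callable, Dict, List

def word_overlap(a: str, b: str) -> float:
    """Crude similarity: fraction of words in `a` that also appear in `b`."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / max(len(words_a), 1)

def perturbation_influence(generate: Callable[[str], str], parts: List[str]) -> Dict[str, float]:
    """Estimate how much each part of the prompt influences the answer.

    `generate` is any black-box text model: prompt in, answer out.
    A part counts as influential if removing it changes the answer a lot.
    """
    baseline = generate(" ".join(parts))
    influence = {}
    for i, part in enumerate(parts):
        reduced_prompt = " ".join(parts[:i] + parts[i + 1:])
        perturbed_answer = generate(reduced_prompt)
        # Influence = how dissimilar the new answer is to the original one.
        influence[part] = 1.0 - word_overlap(baseline, perturbed_answer)
    return influence
```

Any comparison metric could take the place of the word overlap here; the essential point is only that the model is queried repeatedly with varied inputs and the answers are compared.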

In order to implement this strategy, it is not necessary to know the exact inner workings of the model; to a certain extent, we can treat it as a black box. For LLMs, this can be done by manipulating the salience of parts of the prompt, that is, by changing the attention that the model assigns to them. By making certain parts of the prompt stand out more or less, we can analyse the effects of this reinforcement or suppression on the generated response. The result is a clear visual representation of which parts of the prompt had the greatest influence on the response given. 
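To give a feeling for what "changing the attention the model assigns to a token" might look like mechanically, here is a toy sketch of scaled dot-product attention in which the attention weights of selected token positions are damped and then renormalised. The matrices are made up, and this is only a conceptual illustration of the idea, not the actual AtMan implementation inside Aleph Alpha's models.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_suppression(q, k, v, suppressed, factor=0.1):
    """Toy scaled dot-product attention where selected key positions are
    damped (factor < 1) or amplified (factor > 1), then renormalised."""
    scores = q @ k.T / np.sqrt(q.shape[-1])      # (n_queries, n_keys)
    weights = softmax(scores, axis=-1)
    weights[:, suppressed] *= factor             # manipulate salience of chosen tokens
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(4, 8))
_, w_base = attention_with_suppression(q, k, v, suppressed=[])
_, w_supp = attention_with_suppression(q, k, v, suppressed=[2])
print("attention on token 2 before vs. after suppression:")
print(w_base[:, 2], w_supp[:, 2])
```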

With this information, we can account for which parts of the prompt led the LLM to produce its response. To put it in practical terms:  

When an LLM is asked to continue the text "Hello, my name is Lucas. I like football and maths. I've been working on..." and the answer is "... my degree in computer science", we can empirically show that the male name "Lucas" and the word "maths" have a strong influence on the answer. As explained above, we do this by experimenting with the salience of the different words. For example, if we start taking "maths" out of the sentence, the answer might shift towards "football" and read "... improving my game, and I've been playing football for a while." If we repeat this many times in different combinations, we can draw the conclusions described above. This solution has been integrated into the Aleph Alpha ecosystem and is called AtMan (Attention Manipulation). 
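A minimal way to approximate this kind of analysis from the outside, without access to the model's attention, is to leave out candidate words in different combinations and watch how the continuation shifts. This is in the same spirit as the earlier perturbation sketch and is not AtMan itself; the `complete` function below stands for whatever LLM completion call is available and is assumed for the example.

```python
from itertools import combinations
from typing import Callable, Iterable

PROMPT = "Hello, my name is Lucas. I like football and maths. I've been working on"
CANDIDATES = ["Lucas", "maths", "football"]

def ablate(prompt: str, words: Iterable[str]) -> str:
    """Remove the given words from the prompt (a crude stand-in for suppression)."""
    for word in words:
        prompt = prompt.replace(word, "")
    return " ".join(prompt.split())  # tidy up leftover double spaces

def influence_report(complete: Callable[[str], str]) -> None:
    """Print how the continuation changes when word combinations are suppressed."""
    print("baseline:", complete(PROMPT))
    for r in range(1, len(CANDIDATES) + 1):
        for combo in combinations(CANDIDATES, r):
            continuation = complete(ablate(PROMPT, combo))
            print(f"without {combo}: {continuation}")
```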

Other XAI methods 

Methods that experiment with input into an AI model are not unique to LLMs. Going back to the credit scenario described above, we could also imagine a model that takes into account a number of characteristics of an applicant, such as age, salary, wealth and marital status, and predicts whether a loan should be granted.  

By using a method called SHAP (SHapley Additive exPlanations), we can derive the decision logic of the AI model without having to explicitly know the model itself. Here, too, we want to derive the influence that each characteristic of the loan applicant has on the model's prediction.  

SHAP works by permuting the characteristics of a person in order to analyse how important certain characteristics are for the AI model's prediction. Permutation in this case means that the values of some characteristics are replaced with the values of other people and the resulting predictions are analysed. By repeating this process many times in many different combinations, we can estimate the impact that each feature has on the response of the AI model, as the sketch below illustrates. 
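The sketch below illustrates this permutation idea on a toy loan model. The `predict_approval_probability` function and the applicant data are invented for the example, and the simple averaging over random replacements is a simplification of the idea behind SHAP rather than the exact Shapley value computation performed by the shap library.

```python
import random
from typing import Callable, Dict, List

Applicant = Dict[str, float]

def permutation_importance(
    predict: Callable[[Applicant], float],
    applicant: Applicant,
    background: List[Applicant],
    n_samples: int = 200,
) -> Dict[str, float]:
    """Estimate each feature's influence on the prediction for one applicant.

    For every feature, we repeatedly replace its value with the value of a
    randomly drawn other person and record how much the prediction moves.
    """
    baseline = predict(applicant)
    influence = {}
    for feature in applicant:
        shifts = []
        for _ in range(n_samples):
            perturbed = dict(applicant)
            perturbed[feature] = random.choice(background)[feature]
            shifts.append(abs(predict(perturbed) - baseline))
        influence[feature] = sum(shifts) / n_samples
    return influence

# Hypothetical black-box credit model: salary and wealth help, age matters a little.
def predict_approval_probability(a: Applicant) -> float:
    score = 0.00001 * a["salary"] + 0.000005 * a["wealth"] + 0.002 * a["age"]
    return min(1.0, max(0.0, score))

applicant = {"age": 35, "salary": 48_000, "wealth": 20_000, "marital_status": 1}
background = [
    {"age": 52, "salary": 95_000, "wealth": 300_000, "marital_status": 0},
    {"age": 23, "salary": 28_000, "wealth": 2_000, "marital_status": 1},
    {"age": 41, "salary": 61_000, "wealth": 80_000, "marital_status": 0},
]
print(permutation_importance(predict_approval_probability, applicant, background))
```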

However, a word of caution is in order. XAI methods like those described here explain the behaviour of AI models, but they do not describe how the characteristics influence the real world. If an AI model has learned an incorrect association between a person's characteristics and their creditworthiness from the data provided to it, this association will also show up in the XAI method's explanations. However, if a model performs well and learns correct associations, XAI methods can also help us to better understand business problems. 

The future of AI: explainability, trustworthiness and legal requirements 

In general, it can be said that there are ways to create explanations for the behaviour of even the most complex AI models, such as LLMs. These explanations can help to deploy AI solutions in compliance with current legislation, such as the GDPR, and future legislation, such as the AI Act. However, XAI can only be part of a broader response and must be considered in conjunction with other aspects such as data security, robustness and control.  

Trustworthy AI is the guiding principle of the AI Act. It would generally be desirable to establish processes that standardise the development of AI solutions in accordance with guidelines. In recent years, AI experts have increasingly pushed for platform solutions that guarantee this standardisation and possibly also include applications that provide the explanations generated by XAI approaches. To summarise, we strongly encourage the development of platforms for AI development, including XAI. In doing so, we want to facilitate compliance and help companies develop innovative AI applications that are not hindered by constant regulatory concerns. 

Author

Dr Luca Bruder

Dr Luca Bruder has been a Senior Data Scientist at Alexander Thamm GmbH since 2021. Luca completed his doctorate in the field of computational neuroscience and was able to gain experience in AI and data science consulting alongside his doctorate. He can draw on a wide range of experience in the fields of statistics, data analysis and artificial intelligence and leads a large project on the topic of Explainable AI and autonomous driving at Alexander Thamm GmbH. In addition, Luca is the author of several publications in the field of modelling and neuroscience.
