LLM Explainability: Why the "why" is so important

  • Author: Dr. Luca Bruder
  • Category: Deep Dive

    “Explainable AI” (XAI) is crucial for understanding the decision-making logic of highly complex AI models such as LLMs.

    Suppose you have applied for a loan from a bank. During your conversation with the bank representative, you are informed that your loan application has been rejected. The reason for this is that, based on their experience, the representative does not consider you to be a reliable borrower. Would you be satisfied with this explanation? Probably not. Especially in a context where decisions are made based on your personal data, you would demand a reasonable explanation.

    Now let's assume that the representative made the decision using an algorithm that arrived at this conclusion based on your data. This means that the bank representative does not even know the actual reason for your rejection. Would that be acceptable? Certainly not! Until recently, however, the use of AI algorithms in many areas led to exactly such scenarios.

    Complex AI algorithms are often fed with personal data and deliver important decisions without the user or even the developer of the algorithm being able to explain how the artificial intelligence arrived at a result. The information about the decision-making process of an AI algorithm is often buried in an immensely complex cascade of binary yes-no decisions that is initially opaque to humans. Understanding this process in such a way that we can comprehend and describe it is an important challenge for users and developers of AI models, because AI applications are becoming an increasingly important part of our everyday lives.

    LLMs, legislation, and the need for explainability

    Large language models (LLMs), such as OpenAI's famous ChatGPT or its German counterpart Luminous from Aleph Alpha, are very current and prominent examples of complex AI systems. They are based on gigantic mathematical models that are optimized with large amounts of data over weeks and months. The result is the breathtaking language comprehension and production capabilities that have catapulted LLMs into the center of public and commercial attention.

    With the vast range of possibilities these new models open up, current AI systems are capable of disrupting established business processes. However, the lack of a clear explanation as to why an AI model gives a particular response to a prompt can limit the number of possible applications.

    This has been a problem for AI applications in highly regulated industries such as finance and medicine for years. However, the problem will also apply to more and more industries once the European AI Act comes into force.

    With the AI Act, the EU is responding to the technical achievements of these complex AI models and the socio-economic challenges they pose. At the same time, the regulation is intended to set the course for the future of digitalization in Europe and become a driver of innovation for “AI made in Europe” on the global stage.

    An integral part of this new legislation is that AI algorithms used in connection with personal data must make traceable decisions. This is intended to ensure protection against misuse of personal data.

    XAI – Explainable AI in LLMs

    This raises the question of how companies can use state-of-the-art AI technology in areas such as finance, medicine, or human resources while complying with strict and important regulations. The field of research that attempts to answer this question is called “Explainable AI” (XAI). In XAI, researchers are trying to find a way to understand the decision-making logic of highly complex models such as LLMs. The methods used to achieve this are as diverse as the different AI models themselves. In general, they are divided into two broad approaches:

    • Global explainability attempts to map and interpret an entire model. The goal is to understand the general decision-making logic of a model.
    • Local explainability examines a model's decision for a specific input to understand how that specific input leads to the corresponding output.

    In scenarios such as the one outlined above, we try to explain why a model gave a particular response to a particular prompt. We are therefore interested in local explainability. How would we go about creating this for a large and complex language model? The first option would be to build models that are simple enough for us to understand their responses directly, avoiding the problem altogether. While this is certainly possible for many simpler AI applications, the size and complexity of LLMs rule this approach out. An alternative way to create explainability for LLMs is to use perturbation-based methods. The idea is that we first record the input to a model and its response. We can then experiment with different variations of the input and observe the effects these have on the response.

    To implement this strategy, it is not necessary to know the exact inner workings of the model; we can therefore treat it as a black box to a certain extent. For LLMs, this can be done by manipulating the salience of individual parts of the prompt, i.e. by changing the attention the model assigns to them. By making certain parts of the prompt stand out more or less, we can investigate the effects of this reinforcement or suppression on the generated response. The result is a clear visual representation of the parts of the prompt that had the greatest influence on the given response.
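
    Because the model is treated as a black box from the outside, a crude version of this suppression can be sketched with openly available tooling. The snippet below is only an illustrative assumption, not AtMan itself: it uses the small open GPT-2 model as a stand-in and zeroes the attention mask for a single token, which roughly corresponds to suppressing that part of the prompt.

    ```python
    # A crude imitation of prompt suppression, using GPT-2 as a stand-in model.
    # Zeroing the attention mask for a token is only an approximation of reducing
    # the attention the model pays to it; AtMan's actual manipulation inside
    # Aleph Alpha's models is not reproduced here.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    prompt = "Hello, my name is Lucas. I like soccer and math. I have been working on"
    inputs = tokenizer(prompt, return_tensors="pt")

    def greedy_completion(input_ids, attention_mask):
        output_ids = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            max_new_tokens=15,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,
        )
        # Return only the newly generated part, not the prompt.
        return tokenizer.decode(output_ids[0, input_ids.shape[1]:])

    baseline = greedy_completion(inputs["input_ids"], inputs["attention_mask"])

    # Suppress the token for " math" by masking it out, then compare completions.
    math_id = tokenizer(" math")["input_ids"][0]
    suppressed_mask = inputs["attention_mask"].clone()
    suppressed_mask[0, inputs["input_ids"][0] == math_id] = 0

    print("baseline:  ", baseline)
    print("suppressed:", greedy_completion(inputs["input_ids"], suppressed_mask))
    ```

    AtMan itself can amplify parts of the prompt as well as suppress them by manipulating the attention inside the model; the masking above only imitates the suppression case in the simplest possible way.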

    With this information, we can account for which parts of the prompt caused the LLM to generate its response. To put it in practical terms:

    If an LLM is asked to complete the text “Hello, my name is Lucas. I like soccer and math. I have been working on...” and the response is “...my computer science degree,” we can empirically show that the male name “Lucas” and the word “math” have a strong influence on the response. As explained above, we do this by experimenting with the salience of different words. For example, if we remove “math” from the sentence, the answer might shift to soccer and read “...improving my game, and I've been playing soccer for a while.” If we repeat this many times in different combinations, we can draw the conclusions described above. This solution has been integrated into the Aleph Alpha ecosystem and is called AtMan (Attention Manipulation).
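
    To make the leave-one-out logic of this example concrete, here is a minimal sketch under the same assumptions as above: GPT-2 as a stand-in model, simple word removal instead of attention manipulation, and a word-overlap count as a rough proxy for how strongly the completion changes.

    ```python
    # Minimal leave-one-word-out perturbation sketch (GPT-2 as a stand-in; the
    # overlap score is only a rough proxy for "how much the answer changed").
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    def complete(text: str) -> str:
        # Greedy completion; return only the newly generated part.
        result = generator(text, max_new_tokens=15, do_sample=False)
        return result[0]["generated_text"][len(text):]

    prompt = "Hello, my name is Lucas. I like soccer and math. I have been working on"
    baseline = complete(prompt)
    print("baseline:", baseline)

    # Remove one word at a time and check how far the completion drifts from the
    # baseline; words whose removal changes the answer most have the strongest
    # influence on it.
    words = prompt.split()
    for i, word in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        changed = complete(perturbed)
        overlap = len(set(baseline.split()) & set(changed.split()))
        print(f"without {word!r}: overlap with baseline = {overlap}")
    ```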

    Other XAI methods

    Methods that experiment with the input to an AI model are not unique to LLMs. Returning to the credit scenario described above, we could also imagine a model that takes into account a number of characteristics of an applicant, such as age, salary, assets, and marital status, and predicts whether a loan should be granted.

    By applying a method called SHAP (SHapley Additive exPlanations), we can derive the decision logic of the AI model without having to know the model itself explicitly. Here, too, we want to derive the influence of each characteristic of the loan applicant on the model's response.

    SHAP works by permuting a person's characteristics to analyze how important certain features are to the AI model's response. Permutation in this case means that the information for some features is replaced by the values of other people. By repeating this process many times in many different combinations, we can estimate the impact each feature has on the AI model's response.
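
    As a rough sketch of how this looks in practice, the example below trains a small stand-in loan model on synthetic data and explains a single applicant with the model-agnostic KernelExplainer from the shap library, which permutes feature values against a background sample much as described above. The feature set, the data, and the random-forest model are illustrative assumptions, not a real bank's system.

    ```python
    # A hedged sketch of model-agnostic SHAP on a hypothetical loan model.
    # Feature names, synthetic data, and the random-forest model are assumptions
    # made for illustration only.
    import numpy as np
    import pandas as pd
    import shap
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = pd.DataFrame({
        "age": rng.integers(21, 70, 500),
        "salary": rng.normal(50_000, 15_000, 500),
        "assets": rng.normal(100_000, 60_000, 500),
        "married": rng.integers(0, 2, 500),
    })
    # Synthetic label: loan granted if income and assets are high enough.
    y = ((X["salary"] + 0.1 * X["assets"]) > 55_000).astype(int)

    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # KernelExplainer treats the model as a black box: it only calls the prediction
    # function on permuted copies of the input, replacing feature values with
    # values drawn from the background sample.
    background = X.sample(50, random_state=0)
    explainer = shap.KernelExplainer(lambda d: model.predict_proba(d)[:, 1], background)

    applicant = X.iloc[[0]]
    shap_values = explainer.shap_values(applicant, nsamples=200)

    # Contribution of each feature to this applicant's predicted approval probability.
    for name, value in zip(X.columns, shap_values[0]):
        print(f"{name}: {value:+.3f}")
    ```

    The signed values printed at the end are the estimated contributions of each feature to this applicant's approval score and would be the starting point for the kind of explanation the applicant in the opening scenario never received.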

    However, a word of caution is in order. XAI methods such as those described here explain the behavior of AI models, but they do not describe how the characteristics influence the real world. If an AI model has drawn an incorrect correlation between a person's characteristics and their creditworthiness from the data provided to it, this correlation will also appear in the explanations provided by the XAI method. However, if a model performs well and has learned correct associations, XAI methods can also help us to better understand the underlying business problem.

    The future of AI: explainability, trustworthiness, and legal requirements

    In general, it can be said that there are ways to create explanations even for the behavior of the most complex AI models, such as LLMs. These explanations can help ensure that AI solutions are used in accordance with current legislation, such as the GDPR, and future laws, such as the AI Act. However, XAI can only be part of a broader response and must be considered in conjunction with other aspects such as data security, robustness, and control.

    Trustworthy AI is the guiding principle of the AI Act. It would therefore be desirable to establish processes that standardize the development of AI solutions in accordance with such guidelines. In recent years, AI experts have increasingly pushed for platform solutions that provide this standardization and, ideally, also include applications that surface the explanations generated by XAI approaches. In summary, we strongly support the development of platforms for AI development, including XAI. In doing so, we aim to facilitate compliance and help companies develop innovative AI applications that are not hampered by constant concerns about legislation.

    Author

    Dr. Luca Bruder

    Dr. Luca Bruder has been a Senior Data Scientist at Alexander Thamm GmbH since 2021. Luca completed his doctorate in computational neuroscience and, alongside his doctoral studies, gained experience in AI and data science consulting. He has a wealth of experience in statistics, data analysis, and artificial intelligence and is leading a large project on explainable AI and autonomous driving at Alexander Thamm GmbH. Luca is also the author of several specialist publications in the fields of modeling and neuroscience.
