Why we need to develop large AI models in Europe as well - a guest article in Handelsblatt

from | 14 June 2022 | [at] News

Large AI models like GPT-3 or PaLM are revolutionising the market for artificial intelligence (AI). Germany and Europe are missing out on this development because European developers lack the right framework conditions. In order not to become dependent on American solutions - as is already the case in other digital areas - business and politics must finally react. First and foremost, there is a need for a dedicated AI supercomputing centre where large AI models can be researched and developed. The LEAM initiative aims to create an ecosystem around such a supercomputer and to make large European AI models a reality.

Advancing into a new phase of AI development

In June 2020, the American company OpenAI presented the Generative Pre-trained Transformer 3, or GPT-3 for short. With 175 billion parameters, the AI model is more than a hundred times larger than its predecessor GPT-2 and was the largest AI language model at the time of its release. Its release was the starting signal for the race for large AI models.

Within a very short time, developers were using the model in a wide variety of applications. GPT-3 applications reliably compose emails and journalistic texts or function as chatbots. They summarise documents, fill them out automatically or recognise certain features. In one widely noted application, GPT-3 even turns text into program code.

Less than a year after its release, OpenAI announced that over 10,000 developers were already using GPT-3 and had published over 300 applications. The big beneficiary of this success story is Microsoft, which exclusively licensed GPT-3 in September 2020. Since then, the model itself has remained closed: developers cannot access the GPT-3 code directly, but have to work with it via an API.
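
What API-only access means in practice can be illustrated with a short example. The following is a minimal sketch using the OpenAI Python client as it existed around 2022; the model name and the summarisation prompt are illustrative assumptions, not details from this article.

    # Minimal sketch: using GPT-3 through the hosted API (OpenAI Python client, ca. 2022).
    # The engine name "text-davinci-002" and the prompt are illustrative assumptions.
    import openai

    openai.api_key = "YOUR_API_KEY"  # access requires a granted API key

    document = "Large AI models such as GPT-3 are changing the market for AI ..."
    response = openai.Completion.create(
        engine="text-davinci-002",
        prompt="Summarise the following text in one sentence:\n\n" + document,
        max_tokens=60,
        temperature=0.3,
    )
    print(response["choices"][0]["text"].strip())

The model weights never leave the provider's infrastructure; every request, including the data it contains, is processed on servers outside the user's control.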

GPT-3 was just the beginning

Microsoft, together with NVIDIA, took the next step in the development of large AI models at the end of 2021. The Megatron-Turing NLG model has 530 billion parameters and is thus around three times as large as GPT-3. However, the two American companies only held the record for the largest AI language model until a few weeks ago, when Google presented its Pathways Language Model, or PaLM for short, the largest language model to date with 540 billion parameters. Most recently, Facebook's parent company Meta also entered the race for large AI language models.

At the same time, the Beijing Academy of Artificial Intelligence in China was working on a solution to compete with the American models. The result is called WuDao 2.0, was trained on text as well as image data and comprises 1.75 trillion parameters. It thus far surpasses other multimodal models such as OpenAI's DALL-E or Google's MUM in size. The special feature of these multimodal models is that they combine text and image data and open up completely new application possibilities. For example, they create images from text or recognise the content of videos.

But the data basis of large AI models is not limited to image and text data. DeepMind, a Google subsidiary, for example, trains its AlphaFold programme on protein structures. It predicts the 3D structure of a protein and thus allows its function to be inferred, addressing one of the greatest challenges of modern biology.

In addition to the large American tech companies and Chinese state organisations, it is mainly developer communities such as Hugging Face and EleutherAI that are currently working on large AI models. A large European model, however, is nowhere to be found.

There is a lack of computing power

In order to train large AI models, three things are needed above all: data, well-trained developers and computing capacity.

With GPT-3, the data comes from Common Crawl, a publicly accessible crawl of the web, as well as other public text collections. WuDao 2.0 adds Chinese sources to this mix. European languages are barely represented in either. Yet there are also large data sets in Europe that reflect the linguistic diversity of the continent. These include, for example, the Europarl corpus, which contains the proceedings of the European Parliament since 1996.
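
To make the point about data concrete, here is a minimal sketch of how such a European corpus can be inspected. It assumes a locally downloaded file from the public Europarl release; the file name europarl-v7.de-en.de is an assumption based on the v7 parallel corpus published at statmt.org/europarl.

    # Minimal sketch: sizing up the German side of the Europarl v7 parallel corpus.
    # Assumes "europarl-v7.de-en.de" has been downloaded and extracted from
    # https://www.statmt.org/europarl/ (the file name is an assumption).
    from collections import Counter
    from pathlib import Path

    lines = Path("europarl-v7.de-en.de").read_text(encoding="utf-8").splitlines()
    tokens = Counter(tok for line in lines for tok in line.split())

    print(f"{len(lines):,} sentences")
    print(f"{sum(tokens.values()):,} running words, {len(tokens):,} distinct word forms")

Corpora of this kind exist for all official EU languages; what is missing is not the data itself but the infrastructure to train large models on it.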

In addition to data, skilled developers and AI researchers are needed to work on the algorithms. Many of the underlying methods are freely available, but new and improved algorithms are constantly being developed.

Germany and Europe have good training opportunities for future AI developers, and German AI research can hold its own in international comparison.

Despite sufficient data and well-trained specialists, there is currently no comparable large European AI model. This is mainly due to a lack of computing capacity in Germany and Europe. The training of GPT-3 took eleven days on NVIDIA's supercomputer Selene. Microsoft and NVIDIA even needed six weeks to train Megatron-Turing NLG. German AI developers have no access to computing time on this scale. The only comparable supercomputer, JUWELS in Jülich, is shared by physicists, climate researchers and biologists, and the development of large AI models plays only a minor role there. The OpenGPT-X project, which is funded by the Federal Ministry for Economic Affairs and Climate Action and aims to develop a European language model, recently had to go to great lengths to apply for computing power, an effort and a loss of time that Europe cannot afford.
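
A rough back-of-envelope calculation shows why computing capacity is the bottleneck. The sketch below uses the common estimate of roughly 6 x parameters x training tokens for total training compute; the GPU count, per-GPU throughput and utilisation are illustrative assumptions, not figures from this article.

    # Back-of-envelope: why GPT-3-scale training needs a dedicated supercomputer.
    # Uses the common ~6 * parameters * tokens estimate for training compute.
    # GPU count, peak throughput and utilisation are illustrative assumptions.
    params = 175e9          # GPT-3 parameters
    tokens = 300e9          # approximate number of training tokens for GPT-3
    train_flops = 6 * params * tokens            # ~3.2e23 floating-point operations

    gpus = 4_000            # assumed cluster size, on the order of Selene's A100 count
    peak_per_gpu = 312e12   # NVIDIA A100 peak FP16 throughput in FLOP/s (dense)
    utilisation = 0.30      # assumed sustained fraction of peak

    days = train_flops / (gpus * peak_per_gpu * utilisation) / 86_400
    print(f"roughly {days:.0f} days of continuous training")   # about 10 days

Even under optimistic assumptions, a single training run occupies thousands of top-end GPUs for weeks, which is exactly the kind of capacity that shared general-purpose systems cannot set aside for AI alone.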

Germany needs a dedicated AI supercomputing centre

This situation has consequences for the AI landscape, but also for businesses and citizens in Europe. Non-European models are often neither openly accessible nor transparent. Biases, which can be discriminatory, are not disclosed. In addition, these models support few or no European languages, and for smaller language communities in particular it is unlikely that non-European providers will ever cover their languages. Finally, it is not clear what happens to the data that European citizens and companies feed into these models.

If the development of large AI models in Germany and Europe does not pick up speed soon, users will be faced with a choice: either become dependent on American solutions and move valuable data abroad, or forgo the immense advantages of the technology.

To prevent this scenario, Germany and Europe must provide their own large AI models. These must be openly available and form a basis for applications by companies and start-ups. The models should take European languages and European ideas of data protection into account and be developed as transparently and as free of bias as possible. The development should also aim for climate neutrality.

Only if Europe succeeds in making its own competitive applications based on large AI models available to businesses and citizens will European values be reflected in these applications and digital sovereignty in the field of AI be guaranteed.

LEAM: The lighthouse project for the development of large European AI models

In order to realise the development of large AI models in Europe and to prevent dependence on non-European solutions, the KI Bundesverband launched its LEAM (Large European AI Models) initiative last year. Supported by numerous renowned research institutes, corporations, associations and start-ups, LEAM is a lighthouse project in the European AI landscape.

The core of the project is the establishment of a dedicated AI data centre where research and development can be carried out with minimal friction. A thriving, autonomous AI ecosystem of research, start-ups, SMEs and industry is to be built around this data centre. At the same time, the project serves as an accelerator for open data initiatives, which could put their data to efficient use within LEAM. Since the launch of LEAM, the network has been growing steadily, but only if the AI community, business and politics pull together can the idea of large European AI models be realised.

We must not repeat old mistakes

We are currently at the beginning of a major shift in the research and application of AI. In some respects, the situation is reminiscent of search engines and Google's position in the early 2000s. At the time, hardly anyone saw a viable business model in Google. A few years later, European countries tried to build an alternative; as we know today, without much success. Similar patterns can be seen in social media, smartphone operating systems and, currently, cloud providers. We must finally learn from these mistakes.

Therefore, it is important to act now and enable Germany and Europe to develop large AI models according to European standards. If this does not happen, we will lose another part of our digital sovereignty, and in a few years we will merely be building application-specific front ends for AI from other countries.

Disclaimer:

This article originally appeared in the Handelsblatt Journal "Künstliche Intelligenz - AI Experience" (June 2022) as a foreword by Jörg Bienert in his role as President of the KI Bundesverband e.V.

Author

JÖRG BIENERT

Jörg Bienert is partner and CPO of Alexander Thamm GmbH, Germany's leading company for data science and AI. At the same time, he is co-founder and chairman of the KI-Bundesverband e.V. and a member of the Advisory Board Young Digital Economy at the BMWi. Furthermore, he is a sought-after keynote speaker and is regularly featured in the press as a data and AI expert. After studying computer engineering and holding several positions in the IT industry, he founded ParStream, a big data start-up based in Silicon Valley that was acquired by Cisco in 2015.
