There is ever-increasing competitive pressure to develop innovative AI models. One year after OpenAI's GPT-3 made a huge leap in development and sent the world into turmoil, researchers from the Beijing Academy of Artificial Intelligence (BAAI) presented Wu Dao 2.0 at the beginning of June 2021: ten times larger than GPT-3 and now the world's largest neural network model.
From a tech perspective, this is fascinating news. For European and American politics, as well as for industry, it is a warning signal not to fall completely behind. Or, to put it another way: a signal of China's ambition to become the world leader in AI development.
Wu Dao 2.0 puts GPT-3 and Google's Switch Transformer in the shade
It was only in March 2021 that the BAAI released the predecessor model Wu Dao 1.0. Just one month later, the research group, together with industry partners such as Xiaomi, Meituan and Kuaishou, presented the updated version of the multimodal model.
Wu Dao 2.0, whose name roughly translates as "understanding of the laws of nature", has 1.75 trillion parameters. It thus surpasses GPT-3 (175 billion parameters) by a factor of ten and breaks the size record previously set by Google's Switch Transformer AI language model (1.6 trillion parameters) by 150 billion parameters.
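The comparisons above are simple arithmetic on the reported parameter counts; a quick sketch makes them explicit:

```python
# Quick check of the parameter-count comparisons, using the figures
# reported in the text (Wu Dao 2.0, GPT-3, Switch Transformer).
wu_dao = 1.75e12   # 1.75 trillion parameters
gpt3 = 175e9       # 175 billion parameters
switch = 1.6e12    # 1.6 trillion parameters

print(wu_dao / gpt3)              # 10.0 -> factor of ten over GPT-3
print((wu_dao - switch) / 1e9)    # 150.0 -> 150 billion more than Switch Transformer
```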
In line with the past year's trend towards multimodal AI systems, Wu Dao 2.0 also learns from image and text data and can flexibly process complex tasks based on both types of data. That is, it masters capabilities such as natural language processing, text generation, image recognition and image generation, and can even predict 3D structures of proteins, much like DeepMind's AlphaFold.
Special features of Wu Dao 2.0: size and robustness
The model was trained on 4.9 TB of text and image data, which makes the GPT-3 training set (570 GB of clean data filtered from 45 TB of curated data) look strikingly small in comparison. The data consists of 1.2 TB of Chinese text data, 2.5 TB of Chinese image data and 1.2 TB of English text data.
Comparable multimodal approaches include OpenAI's DALL-E and CLIP and Google's LaMDA and MUM. The Chinese model, however, is much larger in scale and achieves a robustness that, according to the researchers at the BAAI, outperforms the previous state of the art (SOTA, named after each benchmark) in nine widely used AI benchmarks:
- ImageNet (zero-shot): OpenAI CLIP
- LAMA (factual and commonsense knowledge): AutoPrompt
- LAMBADA (cloze tasks): Microsoft Turing NLG
- SuperGLUE (few-shot): OpenAI GPT-3
- UC Merced Land Use (zero-shot): OpenAI CLIP
- MS COCO (text-to-image generation): OpenAI DALL-E
- MS COCO (English image retrieval): OpenAI CLIP and Google ALIGN
- MS COCO (multilingual image retrieval): UC² (previously the best multilingual multimodal pre-trained model)
- Multi 30K (multilingual image retrieval): UC²
Wu Dao 2.0 and FastMoE
Anyone asking about usability and commercialisation options will probably be pointed to FastMoE. This open-source architecture follows the Mixture of Experts (MoE) approach that Google also used for its Switch Transformer. There, a given input is only ever routed to one expert network within the large model. This reduces the necessary computing power, since only certain sections of the model are active depending on the information being processed, and thereby enables hyperscaling, efficiency and high precision. In addition, FastMoE is more flexible than Google's system because it runs on supercomputers as well as on conventional GPUs and thus does not require proprietary hardware such as Google's TPUs.
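The routing idea described above can be illustrated with a minimal sketch of Switch-style top-1 expert gating. This is not code from Wu Dao 2.0 or FastMoE; all sizes, the single-layer gating network and the single-layer "experts" are illustrative assumptions chosen only to show why just one expert's weights are active per token:

```python
# Minimal sketch of top-1 (Switch-style) Mixture-of-Experts routing.
# All dimensions and networks here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, n_tokens = 8, 4, 6

# Gating network: one linear layer scoring each token against each expert.
w_gate = rng.normal(size=(d_model, n_experts))

# Each "expert" is reduced to a single linear layer for the sketch.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def switch_layer(tokens: np.ndarray) -> np.ndarray:
    """Route every token to exactly one expert (top-1 gating)."""
    logits = tokens @ w_gate                              # (n_tokens, n_experts)
    # Softmax over experts gives the gating probabilities.
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    choice = probs.argmax(axis=1)                         # top-1 expert per token

    out = np.empty_like(tokens)
    for e in range(n_experts):
        mask = choice == e
        if mask.any():
            # Only this expert's weights touch these tokens, so most of
            # the model's parameters stay inactive for any given token.
            out[mask] = (tokens[mask] @ experts[e]) * probs[mask, e:e+1]
    return out

tokens = rng.normal(size=(n_tokens, d_model))
y = switch_layer(tokens)
print(y.shape)  # (6, 8)
```

The compute saving comes from the loop body: each token multiplies against exactly one expert matrix, so total FLOPs per token stay roughly constant no matter how many experts (and thus parameters) the model holds.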
It should be noted that a scientific publication on Wu Dao 2.0 is still pending. However, it seems that Wu Dao 2.0 can generate noteworthy results in the most important benchmarks across tasks and modalities.
Application of Wu Dao 2.0 - on the way to the AI grid
One goal being pursued, according to Tang Jie, deputy director of the BAAI, is the development and implementation of cognitive abilities in machines, up to the point of passing Turing tests.
This was demonstrated during the presentation of Hua Zhibing, a virtual student based on Wu Dao 2.0 who has learned to compose music, write poetry, paint pictures and code. In contrast to GPT-3, Wu Dao 2.0 seems to approximate human memory and learning mechanisms: it can learn new tasks without forgetting what it learned before.
Beyond this playful avatarisation, however, Wu Dao 2.0 should rather be understood as the next milestone on the way to a comprehensive, transformative AI industrial infrastructure, similar to an electricity grid, which connects AI applications with each other and intelligently manages capacities. This will be reinforced by providers using the data that customers supply via the interfaces to expand the training set and thus contribute to the continuous improvement of the overall system.
Wu Dao demonstrates the status quo of China's AI strategy
The fact that the Chinese government has been leveraging the potential of AI as a strategic advantage in international competition for several years is certainly not a new insight. With Wu Dao 2.0, the first fruits of the AI and Innovation Plan, which envisaged the establishment of 50 new AI institutes by 2020, are being harvested. Whether this already constitutes the "big breakthrough", as China describes its strategic goal for 2025, remains open; dismissing the possibility would be hopeful from a European perspective, but also naïve.
In 2018 and 2019 alone, the Beijing government put over US$50 million into the Beijing Academy of Artificial Intelligence.
From a research perspective, China can now consider itself the world leader in AI publications and patents. Its share of global AI publications has risen from 4% in 1997 to 28% in 2017, and the trend is continuing upwards. This also points to the power China can unleash in the field of AI-enabled businesses, such as voice and image recognition applications.
Challenge for Europe
As a consequence of this development, offerings from Chinese providers that have already undergone the AI transformation will exert enormous market pressure on European companies and states. A prominent example that has recently sparked geopolitical dynamics is the Chinese social media platform TikTok.
Another effect that should not be underestimated is that AI models always also reflect the data and the biases of their developers. In concrete terms, this means that if development consolidates around English and Chinese language models, other cultures will have to fight to have their languages and values taken into account.
It is all the more important to underline that AI models are an informal indicator of continental or national progress and a central dimension of technological competition between China, the US and Europe.
According to a study by the European Investment Bank, around 80 per cent of investments in AI and blockchain technologies are made by the USA and China, while Europe accounts for only 7 per cent of the investment sum, around 1.75 billion euros.
The latest developments around Wu Dao 2.0 raise fears that Europe is at risk of losing its digital sovereignty in the field of AI.
Strengthening Europe's AI position needed
In April 2021, European AI industry associations from Germany, Austria, Sweden, Croatia, Slovenia, the Netherlands, France and Bulgaria approached the EU Commission to draw attention to the situation and propose measures for developing large-scale AI models in Europe.
If Europe does not react quickly, there is a danger that oligopoly or monopoly markets controlled by China and the USA will form. The forces and resources made available for AI at German and European level must be pooled and invested more heavily in moonshot projects; only in this way can Europe avoid losing out.