What is ChatGPT?
ChatGPT was published by its developer OpenAI in November 2022 and is considered the successor to the InstructGPT models. OpenAI is an American company that researches artificial intelligence and is supported by Elon Musk and Microsoft, among others. The non-profit organisation, founded in 2015, also published the GPT-2 and GPT-2 language modules, among others. GPT-3 and the programme DALL-E and its successor, DALL-E 2, which are capable of Machine learning to create images on the basis of text descriptions.
How does the language model work?
While ChatGPT is traded as a sister model of the aforementioned InstructGPT, the algorithm is built on a Model of GPT-3, specifically the GPT-3.5 series.
The language model uses what is known as "reinforcement learning from human feedback (RLHF)", whereby the foundations of the model are laid through supervised learning (supervised learning) are to be laid. For this purpose human trainers used to Training data by taking on the role of both the user and the AI assistant.
In the second step, they assisted in the creation of reward models for reinforcement learning (reinforcement learning) of the model by evaluating the responses generated by the trainers. Based on this, the Reward models through proximal policy optimisation be refined.
ChatGPT can currently be downloaded from the OpenAI website can be called up and used. After registration by means of an OpenAI account and successful login, the model can be currently be used and tested free of charge during the so-called "research preview.
OpenAI hopes to receive feedback from users at this stage, as well as user testing of the tool's strengths and weaknesses. The user agreements clarify that the language model may not be used for purposes that infringe on the rights of individuals to discover source code, develop other large-scale models that compete with OpenAI, or declare the data output to be human-generated when it is not.
The language model is designed to communicate with users in dialogue format. It should also be able to answer follow-up questions correctly within a conversation. This is possible because ChatGPT stateful is and is Reminds you of previous promptsThis allows the user to refer to it and it is understood by the language model.
ChatGPT should also be able to reject inappropriate and illegal requests and refuse replies. Limitations in the function the company states in that way by pointing out that the chatbot sometimes generates plausible-sounding but wrong and nonsensical answers. The causes of this behaviour are discussed and justified with the fact that during reinforcement learning there is no source of truth, in supervised learning the knowledge of the human trainer is decisive, and a conservative or more cautious answer policy leads to questions remaining unanswered although the system could answer them correctly. Furthermore, slight changes in the input can lead to a change in the output answer or, in the case of ambiguous queries, the model tries to guess and answer the intended question instead of asking a query.
ChatGPT often provides very extensive responses, as these have been preferred by the trainers and are therefore rewarded more highly. Although the language model is trained to prevent inappropriate requests, this cannot be entirely prevented.