What is Natural Language Generation?

Natural Language Generation (NLG), also called text generation, is the automatic production of natural-language text by a machine. NLG is a subfield of artificial intelligence and computational linguistics. Depending on the method and perspective, the generation process is described with different models and terminology. Whatever the approach, the generated statements should be free of contradictions. Following a proposal by Ehud Reiter, a common architecture consists of three modules: a text planner, a sentence planner and a surface realiser.
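The three-module architecture can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not Reiter's implementation: the function names, the `Message` type, the product record and the sentence patterns are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """One unit of content chosen by the text planner (hypothetical type)."""
    subject: str
    attribute: str
    value: str


def text_planner(record: dict) -> list[Message]:
    """Decide what to say: keep fields that have a value, in a fixed order."""
    order = ["colour", "material", "price"]
    return [
        Message(record["name"], key, str(record[key]))
        for key in order
        if record.get(key) is not None
    ]


def sentence_planner(messages: list[Message]) -> list[dict]:
    """Decide how to package the content: one sentence specification per
    message, with a pronoun after the first mention."""
    specs = []
    for i, msg in enumerate(messages):
        subject = f"The {msg.subject}" if i == 0 else "It"
        specs.append(
            {"subject": subject, "attribute": msg.attribute, "value": msg.value}
        )
    return specs


def surface_realiser(specs: list[dict]) -> str:
    """Turn sentence specifications into actual strings."""
    patterns = {
        "colour": "{subject} is {value}.",
        "material": "{subject} is made of {value}.",
        "price": "{subject} costs {value} euros.",
    }
    return " ".join(patterns[s["attribute"]].format(**s) for s in specs)


def generate(record: dict) -> str:
    """Run the three modules one after the other."""
    return surface_realiser(sentence_planner(text_planner(record)))


print(generate({"name": "backpack", "colour": "blue", "material": None, "price": 49}))
# prints: The backpack is blue. It costs 49 euros.
```

Each module only sees the output of the previous one, so the sentence patterns can be exchanged without touching the content selection.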

Discourse relations are modelled with Rhetorical Structure Theory (RST). A text is coherent if it can be represented consistently as a tree of rhetorical relations over elementary text units. The relations hold between main and subordinate clauses and correspond to the links that connect them, including:

  • LIST
  • and more
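A tree of rhetorical relations over elementary text units could be represented roughly as follows. This is a simplified sketch: the class names, the connective table and the coherence test are assumptions made for illustration. Only LIST comes from the list above; CAUSE is added here as a second example relation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Unit:
    """An elementary text unit (a leaf of the tree)."""
    text: str


@dataclass
class Relation:
    """A rhetorical relation linking two subtrees."""
    name: str
    left: "Node"
    right: "Node"


Node = Union[Unit, Relation]

# Connective used to verbalise each relation; the mapping is illustrative only.
CONNECTIVES = {"LIST": "and", "CAUSE": "because"}


def is_coherent(node: Node) -> bool:
    """Treat a tree as coherent if every relation is known and
    every leaf carries some text (a deliberately simple criterion)."""
    if isinstance(node, Unit):
        return bool(node.text.strip())
    return (
        node.name in CONNECTIVES
        and is_coherent(node.left)
        and is_coherent(node.right)
    )


def linearise(node: Node) -> str:
    """Flatten the tree into text, inserting one connective per relation."""
    if isinstance(node, Unit):
        return node.text
    return f"{linearise(node.left)} {CONNECTIVES[node.name]} {linearise(node.right)}"


tree = Relation(
    "CAUSE",
    Unit("the match was postponed"),
    Relation("LIST", Unit("the pitch was flooded"), Unit("the floodlights failed")),
)
print(is_coherent(tree))  # prints: True
print(linearise(tree))
# prints: the match was postponed because the pitch was flooded and the floodlights failed
```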

According to M. Hess, generation requires two components. The strategic component determines what is to be said: it handles information selection, content selection and range planning, and relies on search and planning strategies from artificial intelligence. The tactical component determines how it is to be said: it plans the linguistic form, using a grammar tailored to generation.
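The split between the two components might look like this in Python. The sketch rests on assumptions: the facts, the `importance` score and the toy grammar are invented for illustration, and a simple ranking stands in for real search and planning strategies.

```python
# Toy generation grammar: non-terminals expand to sequences of symbols,
# every other symbol is a slot filled from the fact (hypothetical rules).
GRAMMAR = {
    "S": ["NP", "VP"],
    "NP": ["team"],
    "VP": ["verb", "count", "noun"],
}


def strategic_component(facts: list[dict], max_facts: int = 2) -> list[dict]:
    """Decide WHAT to say: keep only the most important facts."""
    ranked = sorted(facts, key=lambda f: f["importance"], reverse=True)
    return ranked[:max_facts]


def lexicalise(slot: str, fact: dict) -> str:
    """Choose the word for a terminal slot, including number agreement."""
    if slot == "verb":
        return "scored"
    if slot == "noun":
        return "goal" if fact["count"] == 1 else "goals"
    return str(fact[slot])


def expand(symbol: str, fact: dict) -> list[str]:
    """Recursively expand a grammar symbol into a list of words."""
    if symbol in GRAMMAR:
        words = []
        for child in GRAMMAR[symbol]:
            words.extend(expand(child, fact))
        return words
    return [lexicalise(symbol, fact)]


def tactical_component(fact: dict) -> str:
    """Decide HOW to say it: realise one fact as a sentence."""
    return " ".join(expand("S", fact)) + "."


facts = [
    {"team": "Basel", "count": 1, "importance": 3},
    {"team": "Bern", "count": 3, "importance": 5},
    {"team": "Lugano", "count": 0, "importance": 1},
]
print(" ".join(tactical_component(f) for f in strategic_component(facts)))
# prints: Bern scored 3 goals. Basel scored 1 goal.
```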

Together with the Germanist and computational linguist Raimund Drewek, Ulrich Gaudenz Müller developed a text generation system called SARA (Satz-Random-Generator). Generation presupposes that the information is available in a formal, computational-linguistic representation and can be extracted, for example, from databases or knowledge representations. Areas of application include robot journalism, chatbots and content marketing.

What are the fields of application for Natural Language Generation?

Natural Language Generation (NLG) can be used wherever text needs to be generated from structured data, for example in e-commerce or on the stock exchange. NLG can also be used for sports, business and weather reporting. The aim is to create reader-friendly texts.
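As a small illustration of turning structured data into reader-friendly text, the following Python sketch produces a short weather report. The field names (`city`, `temp_max_c`, `rain_probability`) and the thresholds are assumptions chosen for the example.

```python
def describe_weather(obs: dict) -> str:
    """Render one day's observation as a short, reader-friendly report."""
    temp = obs["temp_max_c"]
    # Map the raw number to a word the reader can relate to.
    if temp >= 25:
        feel = "warm"
    elif temp >= 10:
        feel = "mild"
    else:
        feel = "cold"

    sentences = [f"{obs['city']} can expect a {feel} day with highs of {temp} °C."]
    # Mention rain only when the probability is clearly high or clearly low.
    if obs["rain_probability"] >= 0.6:
        sentences.append("Rain is likely, so take an umbrella.")
    elif obs["rain_probability"] <= 0.2:
        sentences.append("It should stay dry.")
    return " ".join(sentences)


print(describe_weather({"city": "Zurich", "temp_max_c": 27, "rain_probability": 0.1}))
# prints: Zurich can expect a warm day with highs of 27 °C. It should stay dry.
```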

What technologies are used for NLG?

"Big Data" can feed into text creation, so that facts can be presented and figures interpreted in detail. The output can be passed on to a content management system (CMS). The popular programming language Python can be used to generate, translate and edit texts, and deep learning can be used as well.

What are the levels of NLG?

NLG comprises six levels: content analysis, data understanding, document structuring, sentence composition, grammatical structuring and language presentation. In content analysis, the data is filtered to determine what should be included in the content. In data understanding, the data is interpreted, often with the help of machine learning. In document structuring, the document is planned and a narrative structure is selected.
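The first three levels can be sketched in Python as follows. The stock data, the one-percent threshold and the function names are assumptions, and a simple rule stands in for the machine learning step mentioned above.

```python
def content_analysis(rows: list[dict], min_change_pct: float = 1.0) -> list[dict]:
    """Filter the data: keep only changes large enough to report."""
    return [r for r in rows if abs(r["change_pct"]) >= min_change_pct]


def data_understanding(row: dict) -> dict:
    """Interpret the raw number; a rule stands in for a learned model."""
    direction = "rose" if row["change_pct"] > 0 else "fell"
    return {**row, "direction": direction}


def document_structuring(rows: list[dict]) -> list[dict]:
    """Choose a narrative order: biggest movers first."""
    return sorted(rows, key=lambda r: abs(r["change_pct"]), reverse=True)


def report(rows: list[dict]) -> str:
    """Run the three levels, then write one plain sentence per row."""
    selected = [data_understanding(r) for r in content_analysis(rows)]
    ordered = document_structuring(selected)
    return " ".join(
        f"{r['name']} {r['direction']} by {abs(r['change_pct']):.1f} percent."
        for r in ordered
    )


rows = [
    {"name": "Alpha AG", "change_pct": 2.4},
    {"name": "Beta SE", "change_pct": -0.3},
    {"name": "Gamma GmbH", "change_pct": -5.1},
]
print(report(rows))
# prints: Gamma GmbH fell by 5.1 percent. Alpha AG rose by 2.4 percent.
```

The remaining levels (sentence composition, grammatical structuring and language presentation) are collapsed into the single formatting step inside `report`.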