Classification procedure

What is a classification procedure?

Classification procedures are methods and criteria used to divide (classify) objects and situations into classes. Many of these methods can be implemented as algorithms and are then referred to as machine or automatic classification. Classification procedures are always application-related, and many different methods exist. They play a role in pattern recognition, artificial intelligence, documentation science and information retrieval.

What are types of classification procedures?

Classification methods differ in their properties: there are automatic and manual, numerical and non-numerical, statistical and distribution-free, supervised and unsupervised, fixed-dimension and learning, and parametric and non-parametric methods.

In data mining, decision trees, neural networks, Bayes classification and the nearest-neighbour method are used to classify objects. Most of these classification procedures have a two-stage structure: a learning phase with training data, followed by the classification phase.

Decision trees

In this procedure, data runs through a decision tree. At each node, a feature value of the object is checked, and this determines which branch of the tree is followed next. Eventually a leaf node is reached, and this leaf gives the class of the object. The decision tree itself is built from training objects using a recursive divide-and-conquer algorithm. The advantage is that the derived rules are easy to interpret. The classes identified by decision trees can also make the results of a cluster analysis easier to understand.
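As an illustration, the following minimal sketch trains and inspects such a tree; it assumes the scikit-learn library and its bundled Iris data set, neither of which is mentioned in the text.

    # Minimal sketch: build a decision tree from training objects and read its rules.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_iris(return_X_y=True)

    # The tree is grown recursively (divide and conquer) from the training objects.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    tree.fit(X, y)

    # Each inner node tests one feature value; each leaf corresponds to a class.
    print(export_text(tree, feature_names=["sepal_len", "sepal_wid", "petal_len", "petal_wid"]))

    # Classifying a new object means following one path from the root to a leaf.
    print(tree.predict([[5.1, 3.5, 1.4, 0.2]]))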

Neural networks

Neural networks consist of nodes (neurons) that are connected to each other and organised into several layers. The nodes of adjacent layers are connected at the layer transitions, and each connection has its own edge weight. At the start of training, these weights are initialised randomly. The edge weights determine how strongly an object activates the nodes of the next layer until it is finally assigned to an output node. Each output node in the output layer represents a class, and depending on the activation pattern of an object, a particular output node becomes active. Learning takes place by comparing the actual output with the target output for the training data: the error is fed back into the network and the edge weights are adjusted step by step. Outliers in the data are detected particularly well, but the way the classification results come about is hard to interpret.
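The following toy sketch, written with numpy, illustrates the idea of randomly initialised edge weights, layer transitions and one output node per class; the network size is arbitrary and no training step is shown.

    import numpy as np

    rng = np.random.default_rng(0)
    n_inputs, n_hidden, n_classes = 4, 5, 3

    # Edge weights between the layers are initialised randomly before training.
    W1 = rng.normal(size=(n_inputs, n_hidden))
    W2 = rng.normal(size=(n_hidden, n_classes))

    def forward(x):
        # Weighted sums at the layer transitions, followed by a non-linear activation.
        hidden = np.tanh(x @ W1)
        return hidden @ W2

    x = np.array([5.1, 3.5, 1.4, 0.2])   # feature values of one object
    scores = forward(x)

    # Each output node stands for one class; the most strongly activated node wins.
    print("predicted class:", int(np.argmax(scores)))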

Bayes classification

In Bayesian classification, a class is assigned on the basis of the probabilities of all features. Each object is assigned to a class by determining the probability of occurrence of its particular feature combination, and these probabilities are estimated approximately from the training data. The advantage is that high classification accuracy is achieved when the method is applied to large amounts of data. The disadvantage is that if the assumed distribution or the assumed feature independence is wrong, the results become inaccurate or completely distorted.
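The short sketch below fits a naive (Gaussian) Bayes classifier; scikit-learn and the Iris data set are assumptions for illustration only.

    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)

    # Class priors and per-feature likelihoods are estimated from the training data.
    model = GaussianNB()
    model.fit(X, y)

    # A new object is assigned to the class with the highest estimated probability.
    sample = [[6.0, 2.9, 4.5, 1.5]]
    print(model.predict(sample))
    print(model.predict_proba(sample).round(3))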

Nearest-neighbour method

With this method, an object is compared directly with similar training objects and then assigned to a class. The basis for comparison is a previously defined distance or similarity measure. The class that occurs most frequently among the most similar training objects becomes the result class. An advantage is that the method can be applied to both qualitative and quantitative object features. A disadvantage is the very time-consuming classification phase, because the entire training data set has to be consulted for every comparison.
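A brief sketch of the nearest-neighbour idea follows; scikit-learn, the Iris data, the Euclidean distance and the choice of five neighbours are all illustrative assumptions.

    from sklearn.datasets import load_iris
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)

    # There is no real learning phase: the classifier keeps the whole training set
    # and compares every new object against it at classification time.
    knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
    knn.fit(X, y)

    # The most frequent class among the five closest training objects is the result.
    print(knn.predict([[6.0, 2.9, 4.5, 1.5]]))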

Examples from the field of data science

In the area of data mining, analyses of big data are carried out. The aim is to process large amounts of data efficiently and to obtain reliable, easily interpretable results within a short processing time. It should be possible to handle different types of data and tasks, such as text analysis, image processing, numbers, coordinates and the like.

Text mining is used to extract interesting, non-trivial knowledge from unstructured or weakly structured texts. Information retrieval, data mining, machine learning, statistics and computational linguistics all play a role here. Typical text-mining tasks include cluster analysis, the classification of texts and the construction of question-answering systems.

What is the difference between classification and regression?

Regression is the prediction of continuous values. For neural networks, training is carried out with the help of backpropagation, an optimisation procedure that uses a gradient method to calculate the error of a forward pass and adjust the weights against that error. By repeatedly applying backpropagation, the "correct" weights are obtained. Classification, on the other hand, predicts group membership.
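The toy sketch below illustrates this idea for a single linear neuron on synthetic data: the forward pass, the comparison of actual and target output, and a gradient step that adjusts the weights against the error. It is a deliberate simplification, not full backpropagation through a multi-layer network.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    true_w = np.array([2.0, -1.0, 0.5])
    y = X @ true_w + 0.1 * rng.normal(size=100)   # continuous target values

    w = np.zeros(3)   # weights to be learned
    lr = 0.1          # learning rate

    for _ in range(200):
        y_hat = X @ w                   # forward propagation
        error = y_hat - y               # compare actual and target output
        grad = X.T @ error / len(y)     # gradient of the mean squared error
        w -= lr * grad                  # adjust the weights against the error

    print(w.round(2))   # close to the true weights [2.0, -1.0, 0.5]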

Mathematically, regression and classification do not differ too much from each other. In fact, many classification methods can also be used for regression with only a few adjustments, and vice versa.

Artificial neural networks, nearest-neighbour methods and decision trees are examples of methods used in practice for both classification and regression. What differs in any case is the purpose of the application: with regression one wants to predict continuous values (such as the temperature of a machine), and with classification one wants to distinguish classes (such as "machine overheats" versus "does not overheat").

The most common method for tackling classification problems in supervised machine learning is logistic regression.
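As a small illustration, the sketch below fits a logistic regression model to a binary classification problem; scikit-learn and its bundled breast-cancer data set are assumptions, not something the text prescribes.

    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Despite its name, logistic regression predicts group membership, not values.
    clf = LogisticRegression(max_iter=5000)
    clf.fit(X_train, y_train)

    print("accuracy:", round(clf.score(X_test, y_test), 3))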

Cognitive architecture

What is cognitive architecture?

As humans, we have many different cognitive abilities: memory, language, perception, problem solving, volition, attention and others. The goal of cognitive psychology is to explore the characteristics of these abilities and, where possible, to describe them in formal models.

Accordingly, a cognitive architecture is understood to be the representation of these cognitive abilities in a computer model. Artificial intelligence (AI) likewise pursues the goal of fully realising cognitive abilities in machines; in contrast to cognitive architectures, however, artificial agents may use strategies that are not used by humans.

What are criteria of cognitive architecture?

Cognitive architectures are judged by certain criteria. These include suitable structures for data representation, support for classification and support for the Frege principle (compositionality). Further criteria are productivity, performance, syntactic generalisation, robustness, adaptability, memory consumption, scalability and independent knowledge acquisition, such as logical reasoning and the detection of correlations.

Triangulation, the merging of data from different sources, is also one of the criteria for a cognitive architecture. Another important criterion is compactness, i.e. a basic structure that is as simple as possible. A high-performance system that fulfils these characteristics is IBM's DeepQA.

Cognitive systems are already indispensable in many areas today, and they will influence industrial and economic sectors to an ever greater extent in the future. They form the basis of future technologies such as autonomous driving and other autonomous systems, Industry 4.0 and the Internet of Things.

Cognitive systems are technical systems that can independently develop solutions and suitable strategies for tasks normally carried out by humans. They are equipped with cognitive abilities: they understand content in context, interact, adapt and learn. Within a cognitive architecture, it is important that flexible and adaptable software components work together in an overall system.

What theories can be found in cognitive science?

The SOAR (state, operator and result) architecture performs a problem-space search in which operators are applied to states in order to obtain results. This search takes place in a central working memory, where temporary knowledge is managed. To be usable, knowledge has to be retrieved from long-term memory into working memory; knowledge in long-term memory is stored entirely associatively in the form of productions. Matching knowledge units are written into working memory when a production is executed, and continuous experience-based learning (chunking) takes place in the process.

Marketable cognitive architectures, together with artificial intelligence and machine learning methods and algorithms, are also used in flight mechanics and flight guidance. Systematic further developments of these approaches are successfully applied in highly automated flight systems.

In addition to SOAR, there are other cognitive architectures. ACT-R, for example, simulates human behaviour step by step; the empirical data for this comes from experiments in cognitive psychology.

The CLARION cognitive architecture stores both action-oriented and non-action-oriented knowledge, in an implicit form using multilayer neural networks and in an explicit form using symbolic production rules. There are also the architectures LEABRA, LIDA, ART and ICARUS. Each architecture has its particular strengths but also its technical limitations.

Collaborative filtering

What is collaborative filtering?

In collaborative filtering, the behavioural patterns of user groups are evaluated in order to infer the interests of individuals. It is a special form of data mining that makes explicit user input superfluous.

Collaborative filtering is often used for particularly large amounts of data. Areas of application include the financial services sector, where different financial sources are integrated, as well as e-commerce and Web 2.0 applications.

The overall goal is to filter user interests automatically. To achieve this, information about behaviour and preferences is continuously collected from as many users as possible. An underlying assumption of collaborative filtering is that two people with the same preferences for certain products will behave in the same way towards other, similar products.

How does the algorithm work?

Collaborative filtering often takes place in two steps. First, users are identified who show the same behaviour patterns as the active user. Then their behaviour patterns are used to make a prediction for the active user.

Item-based (article-based) collaborative filtering is also popular; here, popular items can additionally be presented separately and prominently. A similarity matrix is created and the relationships between the items are determined. Based on this matrix, the preferences of the active user are then derived.
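The following sketch illustrates this with a tiny, made-up user-item rating matrix: an item-item similarity matrix is computed with cosine similarity and then used to score the items the active user has not rated yet.

    import numpy as np

    # Rows = users, columns = items; 0 means "not rated yet". Purely illustrative data.
    R = np.array([
        [5, 4, 0, 1],
        [4, 5, 1, 0],
        [1, 0, 5, 4],
        [0, 1, 4, 5],
    ], dtype=float)

    # Cosine similarity between item columns yields the similarity matrix.
    norms = np.linalg.norm(R, axis=0)
    sim = (R.T @ R) / (np.outer(norms, norms) + 1e-9)

    active_user = R[0]
    scores = sim @ active_user            # relate rated items to all items

    scores[active_user > 0] = -np.inf     # do not recommend items already rated
    print("recommended item index:", int(np.argmax(scores)))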

There are other forms of filtering based on implicit observation of user behaviour. Here, the behaviour of a single user is compared with the behaviour of all other users. This data can then be used to accurately predict the user's future behaviour. These and similar technologies are extremely practical for today's users who can no longer keep track of all the offers on the market.

What are forms of collaborative filtering?

There are many different approaches to creating an algorithm for collaborative filtering:

In user-based collaborative filtering, a user-item matrix is used and similar persons are assigned to the active user. Their characteristics are compared using suitable similarity functions, and the most similar persons are used for the further calculations.

Item-based collaborative filtering, on the other hand, is a process in which an independent model is created rather than working directly on the user-item matrix. Different items are then proposed based on the results of this model.

In addition to these two approaches, there is also content-based collaborative filtering. This refers to content-based recommender systems that focus primarily on the content of the object and process the attributes of the interacting entity; collaborative filters are then computed to enrich a generic frequency list with behavioural data.

With the help of Python, neural collaborative filtering can also be implemented; it replaces the product-based content of the filter with artificial neural networks and combines data-based filters with arbitrary self-learning algorithms.
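As an illustration, a neural collaborative filtering model could look like the sketch below; PyTorch, the embedding size, the layer widths and the example IDs are all assumptions made for this example.

    import torch
    import torch.nn as nn

    class NeuralCF(nn.Module):
        def __init__(self, n_users, n_items, dim=16):
            super().__init__()
            # Learned representations replace hand-crafted product content.
            self.user_emb = nn.Embedding(n_users, dim)
            self.item_emb = nn.Embedding(n_items, dim)
            self.mlp = nn.Sequential(
                nn.Linear(2 * dim, 32), nn.ReLU(),
                nn.Linear(32, 1),
            )

        def forward(self, user_ids, item_ids):
            # Concatenate user and item embeddings and predict a preference score.
            x = torch.cat([self.user_emb(user_ids), self.item_emb(item_ids)], dim=-1)
            return self.mlp(x).squeeze(-1)

    model = NeuralCF(n_users=100, n_items=50)
    users = torch.tensor([0, 1, 2])
    items = torch.tensor([10, 20, 30])
    print(model(users, items))   # untrained preference scores for three user-item pairs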

AI Accelerator

What is an AI accelerator?

AI accelerators are hardware components that speed up AI computing tasks. They are, in effect, turbo processors for specific tasks such as pattern recognition, the analysis of unstructured data, Monte Carlo simulations, streaming tasks or the construction of neural networks.

Conventional processors have long been insufficient for AI tasks in general, so significantly faster graphics processors (GPUs) are used in many data centres. The computing operations of image processing are similar to those of neural networks, which makes GPU use worthwhile. However, such GPUs are not specifically designed for deep learning tasks and therefore quickly reach their limits.

The hardware thus becomes a throughput bottleneck. In the meantime, however, many chip manufacturers are developing accelerators that can greatly increase a system's computing speed. AI accelerators are available mainly from Nvidia; Google, for example, uses "Tesla P100" and "Tesla K80" GPUs in its Google Cloud Platform. High-performance system units are coming onto the market, and there are "neuro-optimised" ASICs (application-specific integrated circuits) that are used in end devices such as smartphones, data glasses and IP cameras as well as in other small devices. Such chips are designed for, and only suitable for, specific functions. Specialised chips show their advantages in deep learning, and highly accelerated supercomputers help with extensive AI calculations. Google's Tensor Processing Unit (TPU) in particular stands out with its ASIC architecture for AI acceleration.

High-performance computing (HPC) and hyperscale infrastructures also bring more performance for AI calculations. Great hopes also rest on quantum computing, the computers of the future, and neuromorphic microchips are similarly promising.

AI accelerator with add-on card or GPU?

Kontron now offers a new concept for artificial intelligence applications. The "Kontron Industrial AI Platform" delivers high performance and accelerates calculations with an add-on card; the latest SMARC module, in turn, uses the GPU to gain more performance.

Artificial intelligence is gaining significant importance at the intelligent edge in industrial automation. The TPU (Tensor Processing Unit) supports small, low-power applications with only 1 watt for 2 TOPS. A simple USB camera without a TPU thus delivers only 6 frames per second, while one with a TPU reaches five times that speed, 30 frames per second.

Industry 4.0 applications require a lot of computing power. Object recognition, classification and quality inspection of objects as well as predictive maintenance are all based on AI algorithms. Artificial intelligence is also becoming increasingly important for point-of-sale applications, where advertising and relevant information should be placed in a more targeted manner. Add-on cards offer high performance and are ideal for special applications; GPUs, on the other hand, are inexpensive and generally useful for AI workloads.

What AI accelerators are there?

The question is which hardware to use so that operation is as fast and efficient as possible. Two major application areas play a role in AI: on the one hand, the particularly computationally intensive training of neural networks, and on the other hand inferencing, i.e. drawing conclusions from incoming inputs, the actual AI performance.

Through training, the machine learning system learns from a large amount of processed sample data, and the inference quality of AI models can continue to improve over time. After the learning phase is completed, the AI system is able to assess previously unknown data. The deep-learning framework TensorFlow, for example, is used for the machine learning process; in the end, the AI application can classify production parts into good parts and rejects.
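To make the workflow concrete, the sketch below trains a very small TensorFlow/Keras model on synthetic feature data and then runs inference; the "good part"/"reject" labels are simulated, not real production data.

    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 8)).astype("float32")   # e.g. sensor measurements
    y = (X[:, 0] + X[:, 1] > 0).astype("float32")     # 1 = good part, 0 = reject

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

    # Training: the computationally intensive phase that AI accelerators speed up.
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)

    # Inferencing: assess previously unseen parts.
    print(model.predict(X[:3], verbose=0).round(2))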

Important AI accelerators include graphics cards from NVIDIA. GPUs specially optimised for AI can carry out hundreds of calculations in parallel and deliver a computing power of over 100 teraFLOPS. AI users can choose between standard servers, GPUs and dedicated AI chips, depending on their needs. NVIDIA's inference cards are very fast and need only 75 watts in operation, so inferencing runs at low power consumption. For training machine learning models, an NVIDIA GPU with Volta cores, such as a Tesla V100 in a Fujitsu system, is recommended. Such cards are large, occupy two slots, consume a lot of power and come at a higher price. For demanding requirements there is the DLU for deep learning.

Artificial neuron

What is a biological neuron?

A biological, natural neuron is a nerve cell that can process information. The human brain has a particularly large number of such neurons. These cells are specialised in the conduction and transmission of excitation, and together they form the nervous system. A nerve cell has a cell body and cell processes, the dendrites and the neurite with its axon. Dendrites can receive excitations from other cells. Voltage changes are produced by brief ionic currents through special channels in the cell membrane. The axon terminals end in synapses, which communicate chemically by means of messenger substances. The human brain consists of nearly 90 billion nerve cells.

What is an artificial neuron?

Artificial neurons, on the other hand, form the basis of the model of artificial neural networks. This model from neuroinformatics, inspired by biological networks, enables intelligent behaviour. An artificial neuron can process multiple inputs and react in a targeted manner via its activation: the weighted inputs are passed to an output function, which computes the neuron's activation. The behaviour is generally acquired by training with a learning procedure.
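A minimal sketch of a single artificial neuron in Python follows; numpy and the sigmoid activation are illustrative choices.

    import numpy as np

    def neuron(inputs, weights, bias):
        # Weighted sum of the inputs (the net input).
        net = np.dot(inputs, weights) + bias
        # The activation function turns the net input into the neuron's activation.
        return 1.0 / (1.0 + np.exp(-net))

    x = np.array([0.5, -1.2, 3.0])     # incoming signals
    w = np.array([0.8, 0.1, -0.4])     # connection weights (normally learned)
    print(neuron(x, w, bias=0.2))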

The modelling of artificial neural networks began with Warren McCulloch and Walter Pitts in 1943, who showed that logical and arithmetic functions can be computed with a simplified model of such networks. In 1949, Donald Hebb described Hebb's learning rule: during learning, repeatedly active connections between neurons are strengthened. Generalisations of this rule are still used in today's learning procedures.

The 1958 perceptron convergence theorem was also important: Frank Rosenblatt proved that, with the given learning procedure, all solutions that can be represented with the model can indeed be learned. In 1985, John Hopfield showed that Hopfield networks are capable of solving optimisation problems, so even the travelling salesman problem could be tackled. The development of the backpropagation procedure gave research a further boost. Today, neural networks are used in many research areas.

What is the goal of artificial neural networks?

Based on the understanding of biological neurons, artificial neural networks have been modelled since the 1950s. In such a network, the input signals of an artificial neuron are summed linearly and a value is output by a corresponding activation function.

Even individual biological neurons have very complex structures and fulfil different local input-output functions. One goal of artificial intelligence research is to replicate these natural neurons of the brain: the electrical processes are to be simulated, with the aim of learning the meaning of language and developing artificial neural networks for object recognition. The big vision is to mimic the functionality, capability and diversity of the brain in all respects.