Blockchain

What is a blockchain?

Blockchain is a decentralised and unalterable recording system that allows transactions and assets to be tracked across a network. It is often referred to as an account book, transaction book or ledger. This system can track and trade both tangible assets (e.g. houses, cars, money, land) and intangible assets (e.g. intellectual property, patents, copyrights, trademarks).
Blockchain is, for example, the technology that underlies cryptocurrencies such as Bitcoin and also finds application in the NFT market.

Blockchain is considered a new and innovative technology and is widely recognised as a secure and transparent method for storing and transmitting data. However, like any technology, it is not perfect, and there are some concerns about its scalability and potential use cases.

There are numerous online resources for learning about blockchain, such as tutorials, courses and forums. Many universities and colleges now also offer courses on blockchain and cryptocurrency. In addition, there are many blockchain development communities on the internet where you can learn a lot about blockchain development, ask questions and exchange ideas with other developers.

How does a blockchain work?

Each transaction is recorded as a block of data that represents the movement of an asset, which can be tangible or intangible. This block can include information such as who, what, when, where, how much and, if required, even conditions such as the temperature of a food delivery. These blocks are linked together and form a chain as the asset moves from one location to another or ownership changes.
Each block confirms the exact time and sequence of the transactions, and the blocks are securely linked so that no block can be changed or inserted between two existing blocks. This chain of blocks is the blockchain, which is traceable and unchangeable at any time and becomes more secure with each additional block. The immutability of the blockchain rules out manipulation and creates a digital account book that can be trusted.
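
To make this linking of blocks more concrete, here is a minimal, illustrative sketch in Python (not taken from any real blockchain implementation): each block stores the hash of its predecessor, so tampering with an earlier block breaks every later link.

```python
import hashlib
import json
import time

def make_block(data, previous_hash):
    """Create a block that records some transaction data and
    cryptographically links it to the previous block."""
    block = {
        "timestamp": time.time(),
        "data": data,                    # who, what, when, where, how much ...
        "previous_hash": previous_hash,  # link to the preceding block
    }
    # The block's own hash covers its content *and* the previous hash,
    # so changing any earlier block invalidates all later blocks.
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()
    ).hexdigest()
    return block

# Build a tiny chain of two transactions.
genesis = make_block({"asset": "car", "from": "A", "to": "B"}, previous_hash="0" * 64)
second = make_block({"asset": "car", "from": "B", "to": "C"}, previous_hash=genesis["hash"])

# Verify the link: the second block really points at the first one.
assert second["previous_hash"] == genesis["hash"]
```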

The benefits of blockchains

The benefits of blockchains also stem from the way they work: they are extremely secure and transparent for data storage and transmission. This is achieved by using cryptography to link each block of data to the previous block, creating a chain of blocks. Because the data is stored across the entire network, it is difficult to hack or manipulate it without changing the consensus of the network. In addition, each block is linked to the previous one, making it easy to track and verify transactions.

This makes them useful for a variety of applications, including financial transactions, supply chain management and online identity verification.

Examples of blockchain applications

Online casinos

One example of how blockchain is being used in the real world is online casinos. Blockchain casinos use smart contracts to automate the process of betting and paying out winnings. This not only makes the process more efficient, but also ensures that the games are fair and that players can trust the casino to pay out their winnings.

Finance

Another example is the use of blockchain technology in finance. Blockchain platforms such as Ethereum are being used to create digital assets and new financial instruments such as stablecoins that are linked to the value of traditional currencies. This makes it easier for people to invest in digital assets and creates new financial markets.

Supply chain management

Finally, blockchain technology has enormous potential for use in supply chain management. By providing a secure and transparent way to track products as they move through the supply chain, blockchain can help increase efficiency, reduce costs and improve transparency. This makes it easier for companies to trace the origin of products, ensure they are ethically sourced and identify and resolve issues more quickly.

Bootstrapping

What is bootstrapping?

Bootstrapping is a method in statistics for estimating distribution functions in which the so-called resampling procedure is used. Resampling describes a procedure of repeatedly drawing sub-samples from an initial sample in order to draw conclusions about quantities of the original distribution function, such as the mean or standard deviation.

Bootstrapping is used where the distribution function of a statistic is not known and must therefore be estimated. A prerequisite for the method to work is that the initial sample is sufficiently large.

The name bootstrapping is derived from the English word "bootstrap" (the strap of a boot) and has its origin in the story of Baron von Münchhausen. Since, according to the story, Baron von Münchhausen pulled himself out of the swamp by his own bootstraps, this metaphor is taken up for the method: in the bootstrap method, the sub-samples are drawn from one's own sample.

What statistical methods are used in bootstrapping?

Since bootstrapping merely describes a general procedure in statistics, different concrete bootstrapping procedures have developed for different areas of application. In the i.i.d. bootstrap (independent and identically distributed), for example, sub-samples of a certain size are repeatedly drawn from the initial sample with replacement. After drawing the defined number of sample repetitions, the approximated distribution can be used to construct a confidence interval.
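
As an illustration, the following sketch (using NumPy; the sample and parameters are made up purely for demonstration) applies the i.i.d. bootstrap to estimate a 95% percentile confidence interval for the mean:

```python
import numpy as np

rng = np.random.default_rng(42)
sample = rng.normal(loc=5.0, scale=2.0, size=100)   # the initial sample

n_resamples = 10_000
boot_means = np.empty(n_resamples)
for i in range(n_resamples):
    # draw a sub-sample of the same size *with replacement*
    resample = rng.choice(sample, size=sample.size, replace=True)
    boot_means[i] = resample.mean()

# the empirical distribution of the bootstrapped means approximates the
# sampling distribution; its percentiles give a 95% confidence interval
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(f"mean estimate: {sample.mean():.3f}, 95% CI: [{ci_low:.3f}, {ci_high:.3f}]")
```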

Since the i.i.d. bootstrap cannot represent a temporal correlation of the data, the block bootstrap is applied in such cases. In this method, the data are divided into contiguous blocks in a preparatory step. The basis for carrying out the method is created by dividing the time series into a trend and a residual component; the residual part corresponds to the difference between the respective measurement and the estimated trend value. Residual blocks are then repeatedly drawn with replacement until the length of the original signal or the initial sample is reached, and they are added back onto the trend series. By repeating this resampling, a temporally correlated series can finally be represented with this bootstrap procedure.
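
A rough sketch of this block bootstrap idea (with a synthetic series and an arbitrarily chosen block length, purely for illustration) could look as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic time series: linear trend plus smoothed (autocorrelated) noise
t = np.arange(200)
trend = 0.05 * t
noise = np.convolve(rng.normal(size=220), np.ones(20) / 20, mode="valid")[:200]
series = trend + noise

# 1) split into trend and residual component (here: a simple linear fit)
coeffs = np.polyfit(t, series, deg=1)
trend_est = np.polyval(coeffs, t)
residuals = series - trend_est

# 2) draw contiguous residual blocks with replacement until the original
#    length is reached, then add them back onto the estimated trend
block_len = 20
def block_bootstrap_replicate():
    pieces = []
    while sum(len(p) for p in pieces) < len(series):
        start = rng.integers(0, len(series) - block_len + 1)
        pieces.append(residuals[start:start + block_len])
    resampled_residuals = np.concatenate(pieces)[:len(series)]
    return trend_est + resampled_residuals

# repeat the resampling to obtain many bootstrap replicates of the series
replicates = np.array([block_bootstrap_replicate() for _ in range(1000)])
print(replicates.shape)   # (1000, 200)
```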

In contrast to the previously described non-parametric bootstrapping methods, parametric bootstrapping assumes a certain distribution of the initial sample. Non-parametric bootstrapping takes advantage of the fact that, as in non-parametric statistics, no assumptions about the distribution are required, since the distribution only emerges from the sample under consideration. Parametric bootstrapping, on the other hand, concentrates on estimating the parameters of the assumed distribution.
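
The difference can be sketched in a few lines (assuming, purely for illustration, a normal distribution for the initial sample):

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(loc=5.0, scale=2.0, size=50)

# parametric bootstrap: assume a normal distribution, estimate its parameters ...
mu_hat, sigma_hat = sample.mean(), sample.std(ddof=1)

# ... and resample from the *fitted* distribution instead of from the data itself
boot_means = np.array([
    rng.normal(mu_hat, sigma_hat, size=sample.size).mean()
    for _ in range(10_000)
])
ci = np.percentile(boot_means, [2.5, 97.5])
print(f"parametric 95% CI for the mean: [{ci[0]:.3f}, {ci[1]:.3f}]")
```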

Where is the process used in machine learning?

In the area of machine learning, the procedure is used within the framework of so-called bagging (short for "bootstrap aggregating"). Bagging is used in particular for regression and classification trees, where bootstrapping reduces variance and thus improves prediction values. In bagging, bootstrapping is used to draw samples (bootstrap), each of which is used to train the prediction model, and the individual predictions are finally aggregated into one prediction value (aggregating). The procedure is also used in temporal difference learning in the reinforcement learning environment, where the objective function is optimised iteratively with reduced variance.

The programming language R offers an implementation of non-parametric bootstrapping. By specifying the parameters, bootstrap statistics for individual variables or vectors can be calculated. In a next step, the associated confidence intervals can also be determined.

Statistical software such as SPSS from IBM or Stata also offers largely automated bootstrapping procedures. SPSS even provides its own module with many functionalities. After entering the individual parameters, the sampling distribution is estimated using the method described.

Bagging

What is bagging?

Bagging is an abbreviation of the term "bootstrap aggregating" and represents a procedure for variance reduction when using different classification and regression trees in the context of machine learning.

Besides increasing the accuracy of classification and regression models, bagging is also used to address the well-known problem of overfitting. The results of the algorithm are particularly good when the individual learners of the classification and regression trees are unstable and have a high variance.

As the word components suggest, this method of bootstrap aggregating involves two process steps. Bootstrapping basically describes a procedure in statistics in which random samples are repeatedly drawn from a defined data set in order to identify an unknown distribution function of the data set. This bootstrapping procedure can thus be classified as resampling, since sub-samples are repeatedly drawn on the basis of a sample (the data set). These individual samples are then used to train the prediction model or weak classifiers, whose outputs are then aggregated into a predicted value.

This is where the name bootstrap aggregating comes from: data is first drawn through repeated sampling (using the bootstrapping procedure), and the prediction models are then unified (aggregated). This methodology can thus lead to an information fusion and thereby increase the classification or regression performance.

How does the ensemble method work?

An ensemble method, or ensemble learning, basically means that several (weak) learners or classifiers are combined and run through, thus creating a so-called ensemble. In this respect, ensemble methods are also referred to as a meta-approach to machine learning, since several models are combined to form one prediction value.

As described at the beginning, in bagging (bootstrap aggregating) multiple samples are taken from a data set and the same algorithm is then trained and tested in parallel on the sample data. This usually involves drawing random samples from the data set, but it would also be possible to split up the entire data set and generate the distribution of the data from that. When the data is selected by random sampling, this corresponds to drawing with replacement. This means that certain data points can be included in a sample several times (via multiple random selection), while others may not be included at all.

After generating the samples, the learning algorithm is applied to each ensemble member. This is done in parallel. Finally, the individual prediction models are aggregated, resulting in a final ensemble classifier. The individual models or algorithms can either flow into the classifier with equal weights or be weighted differently.
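
A simplified from-scratch sketch of these two steps (using scikit-learn decision trees as the unstable base learners; the data set and ensemble size are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

n_members = 25
ensemble = []
for _ in range(n_members):
    # bootstrap step: draw a sample of the training data with replacement
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeClassifier()      # an unstable, high-variance base learner
    tree.fit(X[idx], y[idx])
    ensemble.append(tree)

# aggregating step: majority vote over all ensemble members (equal weights)
votes = np.stack([tree.predict(X) for tree in ensemble])
prediction = (votes.mean(axis=0) >= 0.5).astype(int)
print("training accuracy of the ensemble:", (prediction == y).mean())
```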

What is the difference between bagging and boosting?

In addition to bagging, so-called boosting also represents an ensemble method in machine learning.

In contrast to bagging, the (weak) classifiers are not run through in parallel but sequentially. In both methods, a basic sample is drawn at the beginning. Due to the iterative and sequential approach of this ensemble method, findings from previous steps can be applied to subsequent steps. This is achieved by weighting incorrectly classified cases differently from correctly classified ones.

The aim of boosting is to create a strong classifier from a large number of weak classifiers. While weights can also be used in principle in bagging, they differ in boosting in that their size depends on the previous sequential progress, whereas the weights in bagging are already defined in advance, as the process runs in parallel.

Another difference between the two methods is the objective. The aim of bagging is to reduce the variance of the individual classifiers by combining them, while boosting aims to reduce the systematic error or bias of the distribution. In this sense, bagging can help solve the overfitting problem, whereas boosting does not.

Both methods can be implemented in Python; the scikit-learn library provides implementations of these ensemble methods, so they can be applied relatively easily.
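
A brief sketch of how both could be tried out with scikit-learn (the data set and hyperparameters here are arbitrary example choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# bagging: many trees trained in parallel on bootstrap samples, then averaged
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0)

# boosting: weak learners trained sequentially, misclassified samples are re-weighted
boosting = AdaBoostClassifier(n_estimators=100, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```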

BERT

What is BERT?

BERT stands for "Bidirectional Encoder Representations from Transformers" and describes an algorithm that Google uses for search queries. In its so-called core updates, Google continues to develop this search algorithm in order to deliver ever better results for users' search queries.

BERT was introduced at the end of 2019 and has the purpose of better understanding the context of the search query. Special attention was paid to prepositions and filler words in the search query, which Google often ignored in search queries in the past. In addition to the use of the algorithm, BERT also introduced so-called "featured snippets". These are highlighted search results that are intended to provide the user with a brief answer to the search query.

Since BERT is based on language and text understanding (natural language understanding) as well as their processing, the algorithm belongs to natural language processing (NLP) in the area of neural networks. The aim of NLP is to make natural human language processable by computers so that they understand the meaning of the language.

BERT uses a special field of machine learning known as transfer learning. In principle, machine learning concepts are based on the assumption that training and test data originate from the same feature space and the same distribution. This has the limitation that if the distribution changes, the original training data cannot be used any further. In transfer learning, however, training data from a "non-subject" (out-of-domain) data set can be drawn upon and used to find solutions. This reduces the amount of training data required and, if necessary, also the training time. While transfer learning has its origins in image recognition, BERT uses this methodology for text processing, since search queries are very individual and specific training data is not always available.
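
To make the transfer-learning idea tangible, here is a small sketch using the publicly released BERT weights and the Hugging Face transformers library (an assumption of this example, not something described in the text, and not Google's internal search pipeline): a model pre-trained on generic text is loaded and can then be fine-tuned on a small task-specific data set.

```python
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")   # pre-trained on generic text

# The pre-trained encoder can now be fine-tuned on a small, task-specific
# data set instead of training a language model from scratch.
inputs = tokenizer("how to change a flat bicycle tire", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)   # one contextual vector per token
```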

How is the language model structured and what functions does it include?

The BERT language model is based on calculation models, so-called transformers, which place a word in relation to all other words in a sentence and thus try to better capture its meaning. The transformers work in such a way that input signals are converted via so-called encoders into a processable form of vectors with which mathematical operations can be carried out. In the so-called self-attention layer, each word of the input is weighted according to a value scale that evaluates each word in relation to the other words of the input. The values are then normalised and weighted using the so-called softmax function in such a way that the sum of all values adds up to 1, and they are then passed on to the next layer.
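
A minimal numerical sketch of this self-attention weighting (with made-up random matrices instead of learned ones, purely to show the softmax normalisation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# toy example: 4 tokens, each already encoded as an 8-dimensional vector
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))

# in a real transformer, Q, K and V are learned linear projections of X;
# here they are random matrices purely for illustration
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# each token is scored against every other token ...
scores = Q @ K.T / np.sqrt(K.shape[-1])
# ... and the softmax turns the scores into weights that sum to 1 per token
weights = softmax(scores)
print(weights.sum(axis=-1))          # -> [1. 1. 1. 1.]

# the output mixes the value vectors according to these attention weights
attended = weights @ V
```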

Both the encoders and the decoders are constructed as feed-forward neural networks. This means that, within the neural networks, there is no feedback to previous layers, as is the case with recurrent networks. In the decoder, a self-attention layer is applied, the values are normalised, and the processed input data are merged in the so-called encoder-decoder attention layer. Afterwards, a feed-forward neural network is applied, and the values are linearised and passed through the softmax function in order to finally output the most probable solution.

Like most algorithms, BERT also works on the basis of probabilities, which are used as the basis for finding a solution.

Black Box

What is a black box?

A black box designates any system of deployed artificial intelligence whose inputs and operations are not visible to the user. In general, a black box is an impenetrable system.

In deep learning, development is usually black-box: the algorithm takes millions of data points, processes that input and correlates certain data features so that it can produce an output. In data mining, on the other hand, a black box is an algorithm or even a technology that cannot give any explanation of how it works.

A black-box model for developing software with artificial intelligence is an adequate development model for testing software components. This distinguishes it from white-box processes such as search algorithms, decision trees and knowledge-based systems, which have been developed by AI experts, are transparent and offer comprehensible solution paths.

In machine learning, a black box is a model of a purely statistical nature. White-box models, on the other hand, denote analytical and physical descriptions for which modelling is often very elaborate. Finally, grey-box models combine both approaches and can unite the respective advantages.

What are typical methods?

Black-box testing is always used when there is no knowledge of the inner workings and implementation of the software. Only the outwardly visible behaviour is included in the test.

A successful test is not a sufficient indication of a successful and error-free system. Unrequested functionality or a massive security gap may remain undetected. Therefore, one test procedure alone is usually not sufficient, since structural tests cannot detect missing functionality and functional tests only insufficiently consider the existing implementation. The best approach is a combined procedure: functional testing with boundary-value analysis or random testing, structural testing of the sections that were not covered, and regression testing after error correction.

Functional tests can only insufficiently consider the implementation at hand. Test methods include functional tests (black-box tests) with test cases selected on the basis of a specification. Thus, equivalence class tests are carried out, boundary values are checked, and the tests are narrowed down via special values. State-based tests can also be implemented on this specification basis.
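
To illustrate how such specification-based test cases might look, here is a sketch using pytest against a purely hypothetical function shipping_cost (the module name, function and price rules are invented for this example):

```python
# Black-box tests against a hypothetical function `shipping_cost(weight_kg)`
# whose *specification* (not its code) says:
#   0 < weight <= 5 -> 4.99,   5 < weight <= 20 -> 9.99,   otherwise rejected.
import pytest
from shipping import shipping_cost   # hypothetical module under test

# equivalence classes: one representative value per specified class
@pytest.mark.parametrize("weight, expected", [(2.0, 4.99), (10.0, 9.99)])
def test_equivalence_classes(weight, expected):
    assert shipping_cost(weight) == expected

# boundary values: test exactly at and just beyond the specified limits
@pytest.mark.parametrize("weight, expected", [(5.0, 4.99), (5.01, 9.99), (20.0, 9.99)])
def test_boundary_values(weight, expected):
    assert shipping_cost(weight) == expected

# special values: inputs outside the valid range must be rejected
@pytest.mark.parametrize("weight", [0.0, -1.0, 25.0])
def test_special_values_are_rejected(weight):
    with pytest.raises(ValueError):
        shipping_cost(weight)
```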