NumPy

What is NumPy?

NumPy (short for Numerical Python) is a powerful library for performing mathematical operations on large arrays of data. It is written for Python, which makes it easy to integrate with other Python libraries and tools. NumPy is widely used in the scientific and data science community and is considered a fundamental tool for many data-intensive applications.

One of the main features of NumPy is the ability to work with arrays of data. An array is a data structure that stores a collection of elements of the same type in a contiguous block of memory. NumPy arrays are similar to Python lists, but they are much more efficient for certain types of operations, such as mathematical calculations.

NumPy is often used to perform mathematical operations on large arrays of data. It is also commonly used for other tasks such as reshaping, flattening and appending arrays.
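
The three tasks named above can be sketched in a few lines. This is only a minimal illustration; the variable names are chosen for this example.

```python
import numpy as np

a = np.arange(6)                  # 1D array with the values 0..5
grid = a.reshape(2, 3)            # reshaping: 2 rows, 3 columns
flat = grid.flatten()             # flattening: back to a 1D copy
longer = np.append(flat, [6, 7])  # appending: returns a new array

print(grid.shape)
print(flat)
print(longer)
```

Note that np.append() does not modify its input; it builds and returns a new array.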

Examples for the use of NumPy

Creating a NumPy array

A NumPy array can be created from a Python list using the numpy.array() function. Arrays can also be generated directly: for example, the numpy.linspace() function creates an array of 10 evenly spaced values between 0 and 1 (both endpoints included).

import numpy as np
a = np.linspace(0, 1, 10)
print(a)
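
For comparison, a minimal sketch of the numpy.array() route mentioned above, starting from an ordinary Python list (the values are arbitrary):

```python
import numpy as np

values = [1.5, 2.0, 3.5]   # an ordinary Python list
arr = np.array(values)     # convert it to a NumPy array

print(arr)
print(arr.dtype)           # all elements share one data type
print(arr.shape)           # one dimension with three elements
```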

Transforming an array

The shape of a NumPy array can be changed with the reshape() function. For example, a 1D array with 10 elements can be converted into a 2D array with 5 rows and 2 columns.

import numpy as np
a = np.arange(10)
print(a)
b = a.reshape(5, 2)
print(b)

Perform mathematical operations with arrays

Mathematical operations can be performed with arrays, e.g. addition, subtraction, multiplication and division, by using the standard mathematical operators. For example, two arrays can be added and the sum of all elements in an array can be calculated.

import numpy as np
a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 4, 3, 2, 1])
c = a + b
print(c)
print(c.sum())

Random choices

In NumPy, the numpy.random.choice() function can be used to randomly select elements from an array or a given 1-D array-like object.

Here is an example to randomly select three elements from an array:

import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.random.choice(a, size=3, replace=False)
print(b)

The above code randomly selects 3 elements from the array "a" without replacement (no element is picked twice) and assigns them to the variable "b".

The probability can also be specified for the selection of each element with the parameter p. For example, if three elements are to be randomly selected from the array "a", but the element with the value 5 is to have a higher probability of being selected, you can use the following code:

import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
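p = np.full(10, 0.7 / 9)  # the other nine elements share the remaining 70%
p[4] = 0.3                # index 4 holds the value 5
b = np.random.choice(a, size=3, p=p)
print(b)

In this example, the element with the value 5 has a probability of 30% of being selected in each draw, while the remaining 70% is split evenly among the other nine elements (about 7.8% each). The probabilities must sum to 1; otherwise numpy.random.choice() raises a ValueError. Because replace defaults to True, the same element can be selected more than once.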

Another useful function for random selection is numpy.random.shuffle, which shuffles the array along the first axis of a multidimensional array. This function changes the input in place and returns None.

import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
np.random.shuffle(a)
print(a)

NumPy vs. Python

NumPy is faster than Python lists because it uses a more efficient array storage layout that allows faster access to elements. In addition, NumPy offers a wide range of built-in mathematical functions which are optimised for use with arrays.
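
The difference in style can be sketched as follows. The example squares 100,000 numbers once with a Python list and once with a NumPy array; the size and the names are arbitrary choices for this illustration, and actual timings depend on the machine.

```python
import numpy as np

numbers = list(range(100_000))
arr = np.arange(100_000, dtype=np.int64)

# Pure Python: one interpreter-level multiplication per element
squares_list = [n * n for n in numbers]

# NumPy: a single vectorised operation over the whole array
squares_arr = arr * arr

print(squares_list[:5])
print(squares_arr[:5])
```

Both variants produce the same values; the speed difference can be measured with the standard timeit module if desired.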

NumPy vs. Pandas

Compared to Pandas, NumPy is a lower-level library that focuses on providing efficient array operations. Pandas builds on NumPy and offers additional functions for working with tabular data, such as data frames and series. While NumPy is useful for performing mathematical operations on large arrays of data, Pandas is suitable for data manipulation and analysis tasks.

NumPy is not as easy to learn as Pandas, as Pandas provides a high-level interface for working with tabular data that is more user-friendly than NumPy's array operations. However, NumPy is a fundamental tool for many data-intensive applications and is widely used in the scientific and data science community.

In terms of performance, NumPy is faster than Pandas when it comes to performing array operations. Pandas, on the other hand, offers additional features for working with tabular data, and this extra functionality can make it slower for certain types of operations.

Named Graph

What is a Named Graph?

The Named Graph is one of the basic ideas of the semantic web. The semantic web in turn stands for the idea that the web can be understood as a "giant global graph".

A Named Graph is addressed via a Uniform Resource Identifier (URI for short), which serves as the name of the graph, and it can contain additional metadata besides the actual content. The Named Graph extends the RDF model, which in turn consists of three parts: a resource, a property of this resource and the value of this property. This structure can be aptly compared to the grammatical parts of a sentence, namely subject, predicate and object. The data systems that work with such so-called triples are more complex than relational data systems, but these databases are very performant, which is why they are often used in the field of artificial intelligence, where it is important to be able to access enormous amounts of data quickly. In addition to the World Wide Web, Named Graphs are therefore also used to model knowledge representations. Knowledge representation is a sub-area of artificial intelligence; a digital lexicon is one example.
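
The relationship between triples and Named Graphs can be sketched in plain Python. This is only an illustration of the data model, not a real RDF library; all URIs under example.org and the helper quads() are made up for this example.

```python
# A triple is (subject, predicate, object); the URIs are hypothetical.
EX = "http://example.org/"

# Each Named Graph is a set of triples stored under the URI that names it.
named_graphs = {
    EX + "graphs/people": {
        (EX + "alice", EX + "name", "Alice"),
        (EX + "alice", EX + "knows", EX + "bob"),
    },
    EX + "graphs/places": {
        (EX + "berlin", EX + "locatedIn", EX + "germany"),
    },
}

def quads(graphs):
    """Yield (graph_name, subject, predicate, object) for every triple."""
    for graph_name, triples in graphs.items():
        for s, p, o in sorted(triples):
            yield (graph_name, s, p, o)

for quad in quads(named_graphs):
    print(quad)
```

Adding the graph name as a fourth element turns each triple into a "quad", which is how the name of the graph travels with its content.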

What does the Named Graph have to do with HTTP?

A graph consists of nodes and edges; if you transfer this model to the web, these are web pages and their connections. Tim Berners-Lee, the inventor of the World Wide Web, described his invention as a "Giant Global Graph". As already described, the subgraphs contained in it (i.e. the web pages) can be reached via URIs. In reality, these subgraphs are HTML documents that can be accessed via HTTP requests; this explains why the Hypertext Transfer Protocol and hyperlinks form the edges in the graph model.

In order to be able to use the Resource Description Framework Schema (abbreviated RDFS) efficiently, there are RDF documents. The language RDF/XML, which was specially developed for this purpose, allows web pages to be described with triples. If, for example, a music album is presented on a page, the RDF document provides information such as title, band or year of production. Of course, the corresponding hyperlink is also included as the naming element of the Named Graph. The RSS feed, which informs recipients about changes on websites, uses this technique for itself, for example.

How can RDF and ontologies help us?

The terms ontology and RDF schema are often used synonymously. In practice, the data is held in a database whose data structure consists of triples, the so-called triplestore. Just as the database language SQL exists for relational databases, a query language is also available for an RDF store (as a triplestore can also be called): data objects are accessed with SPARQL. Many social networks are based on this kind of data management. Thanks to the FOAF ontology, queries via SPARQL in such database systems provide not only information about the queried person (such as name or origin) but also information about the connection with other members of the network.
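
The kind of question such a query answers can be sketched without a real triplestore. The example below keeps a handful of triples in a Python list and mimics, in plain Python, what a SPARQL query over FOAF data would return. The people, the example.org namespace and the helper friends_of() are assumptions made for this illustration; foaf:name and foaf:knows are properties of the FOAF vocabulary.

```python
FOAF = "http://xmlns.com/foaf/0.1/"   # FOAF namespace
EX = "http://example.org/"            # hypothetical namespace for the people

triples = [
    (EX + "alice", FOAF + "name", "Alice"),
    (EX + "bob", FOAF + "name", "Bob"),
    (EX + "carol", FOAF + "name", "Carol"),
    (EX + "alice", FOAF + "knows", EX + "bob"),
    (EX + "alice", FOAF + "knows", EX + "carol"),
]

# Roughly what this SPARQL query would ask a triplestore:
#   SELECT ?friendName WHERE {
#     ?person foaf:name "Alice" .
#     ?person foaf:knows ?friend .
#     ?friend foaf:name ?friendName .
#   }
def friends_of(name, data):
    """Return the names of everyone the named person knows."""
    people = {s for s, p, o in data if p == FOAF + "name" and o == name}
    friends = {o for s, p, o in data if p == FOAF + "knows" and s in people}
    return sorted(o for s, p, o in data if p == FOAF + "name" and s in friends)

print(friends_of("Alice", triples))
```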

Natural Language Programming

What is Natural Language Programming?

When Natural Language Programming (also abbreviated as NLP) is used for the development of software, the computer programme consists of mostly English-language texts (structured by sections).

The sentences that make up these texts are closely based on human language. In order for them to be processed by the computer, the so-called NLP documents are translated into lower-level programming languages such as Python.

Natural Language Programming has a number of parallels to Natural Language Processing (the processing of natural language); both fields deal with controlling computers through human language. However, Natural Language Programming is mainly relevant to experts, because the syntax of the respective programming language must first be learned. Natural Language Processing, by contrast, is also of interest to end users, since it requires only the ability to speak.

What is the syntax of NLP?

With a few exceptions, the programming languages developed for Natural Language Programming follow the syntax of the English language. They often differ from each other only in nuances. As a result, programmes can be read aloud to any person and executed by a computer at the same time.

Apart from the syntax, it is important for a developer to be aware of the ontology that underlies each of these languages. The resulting generic system makes it possible to write computer programs based on NLP. Sentences within an ontology are characterised by the fact that they always either establish a context, perform an action or answer a question.
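
The three kinds of sentences can be illustrated with a toy interpreter. The miniature "language" below (Let ... be ..., Add ... to ..., What is ...?) is invented for this sketch and does not correspond to any of the real languages in this field.

```python
import re

def run(program):
    """Interpret a tiny, made-up English-like language sentence by sentence."""
    variables = {}   # the context built up by earlier sentences
    answers = []
    for sentence in program.strip().splitlines():
        sentence = sentence.strip()
        if m := re.fullmatch(r"Let (\w+) be (\d+)\.", sentence):
            # This sentence establishes a context.
            variables[m.group(1)] = int(m.group(2))
        elif m := re.fullmatch(r"Add (\d+) to (\w+)\.", sentence):
            # This sentence performs an action.
            variables[m.group(2)] += int(m.group(1))
        elif m := re.fullmatch(r"What is (\w+)\?", sentence):
            # This sentence answers a question.
            answers.append(variables[m.group(1)])
        else:
            raise ValueError(f"Cannot understand: {sentence!r}")
    return answers

program = """
Let total be 10.
Add 5 to total.
What is total?
"""
print(run(program))
```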

What are examples of Natural Language Programming?

  • Transcript is one of the best-known programming languages in Natural Language Programming. LiveCode, the integrated development environment based on it, is used for learning purposes in a third of Scottish schools.
  • Inform 7 is an object-oriented language intended for the development of text adventures. In this computer game genre, the player navigates through the game using text input. Thanks to different language versions, it is also possible to "programme" in German.
  • AppleScript, as part of Apple's macOS operating system, shows how widespread NLP was and still is. It can be used, for example, to automate repetitive tasks, but has not been updated for some time.
  • HyperTalk was also developed by Apple and offers the constructs of procedural programming (i.e. for/while/until, if/then/else, as well as function handlers are available).
  • SenseTalk is another representative of the high level scripting languages and can be written both object-oriented and procedural.
  • Various frameworks allow well-known languages such as Python to be used for Natural Language Programming. Quepy, for example, is a Python framework that turns natural-language questions into database queries.

Neuroscience

What is neuroscience?

The term "neuroscience" refers to the scientific study of the nervous system. It is composed of the words "neuron" (nerve) and "science".

Neuroscience deals with all scientific aspects of the nervous system, including the molecular, cellular, functional and structural elements as well as the evolutionary, medical and computational aspects.

The nervous system is a collection of interconnected neurons that communicate with each other and with other cells through specialised synaptic connections. The neurons project long filaments called axons that can reach distant parts of the body and transmit signals that influence neuronal and muscular activity at their endpoints.

Neuroscientists thus study all elements of the nervous system in order to understand how it is structured, how it works, how it is formed and how it can be changed.

Some examples of relevant areas are:

  1. Neuronal signal transmission and axonal networking patterns
  2. Neuronal development and biological function
  3. Formation of neuronal circuits and functional role in reflexes, perception, memory, learning and emotional response
  4. Cognitive neuroscience, which deals with psychological functions related to neuronal circuits
  5. Imaging of the brain in the diagnosis of diseases

What are neuroscience methods?

fMRI:
Functional magnetic resonance imaging measures neuronal activity by detecting changes in blood flow to the brain. Differences in the magnetic properties of haemoglobin are detected under a strong magnetic field in a scanner.

PET:
In positron emission tomography, radiotracers injected into the bloodstream are absorbed by the body and the gamma rays they emit are detected by the system.

TMS:
In transcranial magnetic stimulation, an electromagnetic coil held against the skull generates electrical currents in the underlying brain region and modulates neuronal activity in order to study the region's functioning and connectivity.

Optogenetics:
Optogenetics is a combination of concepts from optics and genetics and is a technique that uses light to precisely modulate neurons that have been genetically engineered to express a light-sensitive molecule in animal models.

Electrophysiology:
Patch-clamp recording and its variants use a microelectrode to study the ion channel properties of a "patched" cell membrane and record the characteristics of individual neurons in the brain. The recent development of multi-electrode arrays has enabled researchers to record the activity of many neurons simultaneously.

Which disciplines belong to the neurosciences?

Neuroscience can be divided into the following overlapping areas for understanding the underlying mechanisms of the human brain:

  • Behavioural / cognitive neuroscience
  • Cellular and molecular neuroscience
  • Systems neuroscience
  • Translational and clinical sciences
  • Neuroinformatics

Neuroscientists are essentially basic researchers who usually have a PhD in neuroscience or a related field. They can then work in post-doctoral research or continue their training as a medical doctor and later specialise in neuroscience.

Neuroscientists often help to understand the genetic basis of many neurological diseases, such as Alzheimer's disease, and to identify strategies for cure and treatment. They can also be involved in research into mental disorders such as schizophrenia or behavioural disorders.

Naive Bayes

What is Naive Bayes?

Naive Bayes is a tried and tested tool in artificial intelligence (AI) with which classifications can be made. The Bayes classifier is thus a machine learning technique. Objects such as text documents can be divided into two or more classes. The classifier learns by analysing training data for which the correct classes are given. The naive Bayes classifier is used when the probabilities of classes are to be estimated from a set of specific observations.

The model is based on the assumption that the variables are conditionally independent given the class. To define the Bayes classifier, one needs a cost measure that assigns a cost to every conceivable classification. A Bayes classifier is the classifier that minimises the expected cost arising from classifications. The cost measure is also called a risk function.

The Bayes classifier minimises the risk of a wrong decision and is defined via the minimum-risk criterion. If a primitive cost measure is used that incurs costs practically exclusively in the case of wrong decisions, then a Bayes classifier minimises the probability of wrong decisions. Then the classifier is said to be defined via the maximum-a-posteriori criterion.
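
The two criteria and the naive independence assumption can be written compactly. The notation is an assumption of this sketch and does not appear in the text above: x = (x_1, ..., x_n) is the observation, c the true class, \hat{c} the decided class and \lambda(\hat{c}, c) the cost of deciding \hat{c} when c is correct.

```latex
% Minimum-risk criterion: choose the class with the lowest expected cost
\hat{c}_{\text{min-risk}}(x) = \arg\min_{\hat{c}} \sum_{c} \lambda(\hat{c}, c)\, P(c \mid x)

% With a 0-1 cost (cost 1 only for wrong decisions) the expected cost
% of deciding \hat{c} is 1 - P(\hat{c} \mid x), which yields the MAP criterion
\lambda(\hat{c}, c) =
\begin{cases}
  0 & \text{if } \hat{c} = c \\
  1 & \text{if } \hat{c} \neq c
\end{cases}
\quad\Longrightarrow\quad
\hat{c}_{\text{MAP}}(x) = \arg\max_{c} P(c \mid x)

% Naive Bayes: conditional independence of the variables given the class
P(c \mid x_1, \dots, x_n) \propto P(c) \prod_{i=1}^{n} P(x_i \mid c)
```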

What are the applications for Naive Bayes?

Naive Bayes is often used for spam classification; many spam filters rely on the naive Bayes classifier. The class variable indicates whether a message is spam or wanted. Each word in the message corresponds to a variable, so the number of variables in the model is determined by the length of the message.
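
A minimal sketch of such a spam filter, written from scratch in plain Python, might look as follows. The four training messages are made up, and the add-one (Laplace) smoothing is a common choice that is assumed here rather than taken from the text.

```python
import math
from collections import Counter

# Tiny made-up training set: (message, class)
training = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting at noon", "wanted"),
    ("lunch meeting tomorrow", "wanted"),
]

class_counts = Counter(label for _, label in training)
word_counts = {label: Counter() for label in class_counts}
for text, label in training:
    word_counts[label].update(text.split())
vocabulary = {word for counts in word_counts.values() for word in counts}

def classify(message):
    """Return the class with the highest log posterior score."""
    scores = {}
    for label in class_counts:
        # Start with the log prior of the class
        score = math.log(class_counts[label] / len(training))
        total = sum(word_counts[label].values())
        for word in message.split():
            # Add-one smoothing avoids zero probabilities for unseen words
            count = word_counts[label][word] + 1
            score += math.log(count / (total + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("cheap money"))
print(classify("meeting tomorrow"))
```

Working with logarithms turns the product of word probabilities into a sum, which avoids numerical underflow for long messages.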

Which variants are available?

The following variants are commonly distinguished:

  • Gaussian Naive Bayes
  • Multinomial Naive Bayes
  • Bernoulli Naive Bayes
  • Complement Naive Bayes
  • Categorical Naive Bayes

How does Naive Bayes work?

The technique uses all the given attributes and makes two assumptions about them. First, all attributes are assumed to be equally important. Second, the attributes are assumed to be statistically independent, which means that knowing the value of one attribute says nothing about the value of another. This independence assumption rarely holds exactly for real data. Nevertheless, the method works well in practice. Moreover, it copes well with missing values.

An example is a training data set of weather and the possibility of playing a sports game in nice weather. The first step is to convert the data into a frequency table. A probability table is then generated in the second step by searching for probabilities such as overcast weather (0.29) and the probability of playing (0.64). In the third step, the Naive Bayes equation is used to calculate the posterior probability for each class. The class with the highest posterior probability is the result of the prediction.
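
The three steps can be sketched in Python. The data set itself is not given above; the 14 days below are an assumption, chosen so that the probabilities match the quoted values of 0.29 for overcast weather and 0.64 for playing.

```python
from collections import Counter

# Assumed data: (outlook, play) for 14 days
days = (
    [("sunny", "yes")] * 2 + [("sunny", "no")] * 3
    + [("overcast", "yes")] * 4
    + [("rainy", "yes")] * 3 + [("rainy", "no")] * 2
)
n = len(days)

# Step 1: frequency tables
freq = Counter(days)
outlook_counts = Counter(outlook for outlook, _ in days)
play_counts = Counter(play for _, play in days)

# Step 2: probabilities
p_overcast = outlook_counts["overcast"] / n   # 4/14
p_yes = play_counts["yes"] / n                # 9/14

# Step 3: posterior probability via Bayes' theorem
def posterior(play, outlook):
    likelihood = freq[(outlook, play)] / play_counts[play]   # P(outlook | play)
    prior = play_counts[play] / n                            # P(play)
    evidence = outlook_counts[outlook] / n                   # P(outlook)
    return likelihood * prior / evidence

print(round(p_overcast, 2), round(p_yes, 2))
for outlook in ("sunny", "overcast", "rainy"):
    scores = {play: posterior(play, outlook) for play in ("yes", "no")}
    print(outlook, "->", max(scores, key=scores.get))
```

With only one attribute the naive independence assumption plays no role yet; with several attributes, the likelihood becomes the product of the per-attribute likelihoods.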