What is Naive Bayes?

Naive Bayes is a tried and tested tool in artificial intelligence (AI) with which classifications can be made. The Bayes classifier is thus a machine learning technique: objects such as text documents can be assigned to two or more classes. The classifier learns by analysing training data in which the correct classes are given. A naive Bayes classifier estimates the probability of each class based on a set of specific observations.

The model is based on the assumption that the variables are conditionally independent given the class. To define a Bayes classifier, one needs a cost measure that assigns a cost to every conceivable classification. A Bayes classifier is the classifier that minimises the expected cost arising from its classifications. The cost measure is also called a risk function.

The Bayes classifier therefore minimises the risk of a wrong decision and is said to be defined via the minimum-risk criterion. If a simple zero-one cost measure is used, which incurs costs only for wrong decisions, the Bayes classifier minimises the probability of a wrong decision. The classifier is then said to be defined via the maximum-a-posteriori (MAP) criterion.
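The MAP decision rule can be sketched in a few lines: the posterior of each class is proportional to its prior times the likelihood of the observation, and the class with the largest posterior wins. The priors and likelihoods below are made-up numbers for illustration.

```python
# A minimal sketch of the maximum-a-posteriori (MAP) decision rule.
# The priors and likelihoods are illustrative, not from real data.

priors = {"A": 0.6, "B": 0.4}         # P(class)
likelihoods = {"A": 0.05, "B": 0.20}  # P(observation | class), assumed

# Unnormalised posterior: P(class | obs) ∝ P(obs | class) * P(class)
scores = {c: priors[c] * likelihoods[c] for c in priors}
evidence = sum(scores.values())
posteriors = {c: s / evidence for c, s in scores.items()}

prediction = max(posteriors, key=posteriors.get)
print(prediction)  # → B
```

Note that class B wins here despite its smaller prior, because the observation is four times as likely under B.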

What are the applications for Naive Bayes?

Naive Bayes is often used for spam detection: spam filters frequently rely on a naive Bayes classifier. The class variable indicates whether a message is spam or legitimate. The words of the message correspond to the feature variables, so the number of variables in the model is determined by the length of the message.
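A toy word-level spam filter can illustrate this setup. The tiny training corpus below is invented for the example; each word of a message is treated as one feature, and add-one (Laplace) smoothing keeps unseen words from zeroing out a class.

```python
from collections import Counter
import math

# Made-up training messages with their classes.
train = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting today", "ham"),
]

# Per-class word counts and class counts (for the priors).
word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def classify(text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    best, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

print(classify("free money"))     # → spam
print(classify("meeting today"))  # → ham
```

Working in log space avoids numerical underflow when a message contains many words, since the raw product of word probabilities shrinks very quickly.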

Which variants are available?

The common variants are:

  • Gaussian Naive Bayes
  • Multinomial Naive Bayes
  • Bernoulli Naive Bayes
  • Complement Naive Bayes
  • Categorical Naive Bayes
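The variants differ mainly in how they model the likelihood of a feature given the class. As a sketch of the Gaussian variant, each continuous feature is modelled with a per-class normal distribution; the one-dimensional training data below are made up for illustration.

```python
import math

# Made-up one-dimensional training data for two classes.
data = {"small": [1.0, 1.2, 0.8], "large": [4.0, 4.5, 3.5]}

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Fit: estimate a per-class mean and variance from the data.
stats = {}
for label, xs in data.items():
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    stats[label] = (mean, var)

def classify(x):
    # With equal priors, the decision reduces to the larger likelihood.
    return max(stats, key=lambda c: gaussian_pdf(x, *stats[c]))

print(classify(1.1))  # → small
print(classify(3.9))  # → large
```

Multinomial and Bernoulli naive Bayes replace the normal density with count-based and binary-occurrence likelihoods respectively, which is why they are the usual choices for text.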

How does Naive Bayes work?

The technique uses all the given attributes and makes two assumptions about them. First, all attributes are assumed to be equally important. Second, the attributes are assumed to be statistically independent, meaning that knowing the value of one attribute says nothing about the value of another. This independence assumption almost never holds exactly, yet the method works well in practice. Moreover, it can cope well with missing values.

An example is a training data set of weather observations and whether a sports game was played. In the first step, the data are converted into a frequency table. In the second step, a probability table is generated by computing probabilities such as the probability of overcast weather (0.29) and the probability of playing (0.64). In the third step, the naive Bayes equation is used to calculate the posterior probability for each class. The class with the highest posterior probability is the prediction.
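The three steps above can be sketched on the classic 14-day weather/play toy data set (outlook only). The counts below are chosen so they match the probabilities quoted in the text: P(overcast) = 4/14 ≈ 0.29 and P(play = yes) = 9/14 ≈ 0.64.

```python
from collections import Counter

# Toy data: (outlook, play) pairs for 14 days.
days = (
    [("sunny", "no")] * 3 + [("sunny", "yes")] * 2 +
    [("overcast", "yes")] * 4 +
    [("rainy", "yes")] * 3 + [("rainy", "no")] * 2
)

# Step 1: frequency table.
freq = Counter(days)                        # (outlook, play) -> count
play = Counter(label for _, label in days)  # play -> count
n = len(days)

# Step 2: probability table.
p_overcast = sum(c for (o, _), c in freq.items() if o == "overcast") / n
p_yes = play["yes"] / n
print(round(p_overcast, 2), round(p_yes, 2))  # → 0.29 0.64

# Step 3: posterior via Bayes' theorem, e.g. P(play=yes | sunny).
def posterior(label, outlook):
    likelihood = freq[(outlook, label)] / play[label]  # P(outlook | play)
    prior = play[label] / n                            # P(play)
    evidence = sum(c for (o, _), c in freq.items() if o == outlook) / n
    return likelihood * prior / evidence

print(round(posterior("yes", "sunny"), 2))  # → 0.4
```

Here P(yes | sunny) = (2/9 × 9/14) / (5/14) = 0.4, so on a sunny day "no" (0.6) would be the predicted class.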