Vector Space

The vector space model is an algebraic model in which text documents, and objects in general, are represented as vectors of identifiers. As in mathematics, these vectors are manipulated with various operations, here carried out on a computer, to process data and return the desired information. Because vector operations are quick to compute, the model is used in indexing, information retrieval, relevancy ranking, and information filtering.
Key Concepts Used in the Method

  • Vector: A list of numbers, which can be written as a one-column matrix. More generally, a vector is any element of a vector space.
  • Space: As in mathematics, a space is a set of elements together with axioms, or rules, that define how the elements relate to one another within the set.
  • Elements: The objects the method works with, which can be things, numbers, or even complex numbers. Grouped elements form a set, usually written inside curly brackets. An element cannot be repeated within a set.
  • Documents: A document is a whole text or a fragment of one, such as a sentence, a group of sentences, or a full paragraph. In practice, it usually refers to an article.
  • Feature items: The contents of a text are represented by basic linguistic units such as words and phrases, called feature items. A single text can contain many feature items. For example, a document can be written as a vector of its feature items:
    D = D(t1, t2, …, tn)
    where ti stands for the i-th feature item, 1 ≤ i ≤ n. Together, these items form a vector space in which each item represents one dimension.
  • Item weight: A weight W is assigned to each feature item in a text to indicate how important that item is to the text. Weights are generally calculated from frequency characteristics, such as how often the item occurs.
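
The concepts above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes feature items are lowercase words split on whitespace and that the weight of an item is simply its raw count in the document. The function names build_vocabulary and to_vector are hypothetical and chosen only for this example.

```python
from collections import Counter


def build_vocabulary(documents):
    """Collect the distinct feature items (here: words) across all documents.

    Each vocabulary entry corresponds to one dimension of the vector space.
    Assumption: items are lowercase, whitespace-separated words.
    """
    return sorted({term for doc in documents for term in doc.lower().split()})


def to_vector(document, vocabulary):
    """Represent a document as a list of weights, one per feature item.

    Assumption: the weight W of an item is its raw frequency in the document.
    """
    counts = Counter(document.lower().split())
    return [counts[term] for term in vocabulary]


docs = ["the cat sat on the mat", "the dog sat"]
vocab = build_vocabulary(docs)
vectors = [to_vector(doc, vocab) for doc in docs]

print(vocab)       # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors[0])  # [1, 0, 1, 1, 1, 2]
print(vectors[1])  # [0, 1, 0, 0, 1, 1]
```

Here each document D becomes a vector with one component per feature item ti, and every component holds that item's weight. Real systems often replace the raw count with a more refined frequency-based weight, but the representation stays the same.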
