This is data that is too large or too complex to be stored or processed using conventional technologies. These challenges typically arise along the three V's with which Big Data is associated: 

  • Volume: The amount of data that is collected, stored and processed. 
    • Big Data = a large amount of data. 
  • Velocity: The speed at which data must be collected, stored and processed for a particular use case, often in (near) real time.
    • Big Data = data requires rapid processing. 
  • Variety: The type and structure of the data.
    • Big Data = data in any form, including unstructured data (in the past, data was almost exclusively structured); the short sketch after this list makes the contrast concrete. 
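
The difference between structured and unstructured data can be illustrated with a small Python sketch. The example records, field names and the regular expression below are purely illustrative:

```python
# Illustrating the Variety dimension: the same fact represented as
# structured (tabular) and unstructured (free-text) data.
import csv
import io
import re

structured = "customer_id,amount\n42,19.99\n"                 # fixed schema
unstructured = "Customer 42 paid 19.99 euros this morning."   # no schema

# Structured data: a generic parser is enough to get at the values.
rows = list(csv.DictReader(io.StringIO(structured)))
print(rows[0]["amount"])  # -> 19.99

# Unstructured data: structure must first be extracted, here with a
# crude regular expression standing in for real text analytics.
match = re.search(r"Customer (\d+) paid ([\d.]+)", unstructured)
print(match.group(2))     # -> 19.99
```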

Regardless of how one ultimately defines Big Data, the explosion in the generation, storage and processing of data has had profound consequences in various areas. On the one hand, the availability of so much data has opened up completely new possibilities for using it, e.g. applying machine learning methods to build AI systems; it has thus enabled a whole new range of data use cases. On the other hand, it has put considerable pressure on data storage and processing technologies: because traditional technologies could not handle Big Data, the resulting pressure to innovate has produced a whole new generation of technologies. 

An alternative definition of Big Data: 

Data storage and processing use cases that require the use of parallel computing on computer clusters to store, process or analyse data. 

Working with computer clusters instead of single-node servers or computers usually makes data analysis and processing much more complex.
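
To illustrate what cluster-based parallel processing looks like in practice, here is a minimal sketch using Apache Spark (PySpark). The file name events.csv and the column event_type are hypothetical; the point is that Spark partitions the data and processes the partitions in parallel:

```python
# A minimal PySpark sketch, assuming pyspark is installed.
# File and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# "local[*]" runs on all local cores; in a real cluster deployment,
# master would instead point at a cluster manager such as YARN.
spark = (SparkSession.builder
         .appName("big-data-sketch")
         .master("local[*]")
         .getOrCreate())

# Spark splits the file into partitions and aggregates them in parallel.
df = spark.read.csv("events.csv", header=True, inferSchema=True)
counts = df.groupBy("event_type").agg(F.count("*").alias("n"))
counts.show()

spark.stop()
```

The same program runs unchanged on a single laptop or on a large cluster; hiding that complexity behind a uniform API is precisely what frameworks of this kind are designed to do.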