Visual Data Exploration in Data Science Projects

by | 24 July 2018 | Basics

Projects in the field of data science are becoming increasingly specific, the amount of data to be analysed is growing, and the tools and toolkits for data processing and analysis are becoming more and more complex. Data visualisation is becoming more diversified. Technological development and the emerging needs of projects pose a growing challenge for data scientists in handling data. Applying the methods, algorithms and tools requires fast and effective execution of the individual data exploration steps. One approach to this is Visual Data Exploration.

Characteristics of Visual Data Exploration

The data exploration process aims to find and analyse the implicit, valuable information in data. In the Visual Data Exploration process, visual perception plays a prominent role. Since human visual perception is 10 times faster than other perception channels, humans are gifted and agile receivers of visual information. The concept of Visual Data Exploration therefore aims to integrate humans into the data exploration process and to use their perceptual abilities in the analysis of large amounts of data.

The basic idea of Visual Data Exploration is to present the data in visual form. This lets the data scientist gain insight into the data and draw conclusions very quickly by interacting with the data directly. In this context, the boundaries between Visual Data Exploration and Visual Analytics are blurred. The visual analytics process involves the user interacting with the data, visualisations and models in order to discover the knowledge hidden within them.

The Visual Data Exploration Approach

The classic approach to visual data exploration is based on Shneiderman's well-known three-stage paradigm:

  1. Overview
  2. Zoom and filter
  3. Details-on-demand

This three-part scheme is also known as the "Information Seeking Mantra". The first step is to get an overview of the data. The data scientist then focuses on anomalies and interesting patterns, and finally analyses these patterns in more detail.
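As a rough illustration, the three stages can be sketched in a few lines of Python on a small in-memory dataset. The records, field names and helper functions below are hypothetical and invented for this example; in practice these steps would usually be carried out interactively in a visualisation tool rather than in code.

```python
from statistics import mean

# Hypothetical sales records, invented purely for illustration.
records = [
    {"id": 1, "region": "North", "revenue": 120.0},
    {"id": 2, "region": "North", "revenue": 135.0},
    {"id": 3, "region": "South", "revenue": 90.0},
    {"id": 4, "region": "South", "revenue": 410.0},
    {"id": 5, "region": "West", "revenue": 101.0},
]


def overview(rows):
    """Stage 1 (overview): summarise the whole dataset per region."""
    groups = {}
    for row in rows:
        groups.setdefault(row["region"], []).append(row["revenue"])
    return {
        region: {"count": len(values), "mean": mean(values), "max": max(values)}
        for region, values in groups.items()
    }


def zoom_and_filter(rows, region, min_revenue=0.0):
    """Stage 2 (zoom and filter): narrow the view to one region and a threshold."""
    return [r for r in rows if r["region"] == region and r["revenue"] >= min_revenue]


def details_on_demand(rows, record_id):
    """Stage 3 (details-on-demand): fetch the full record behind one data point."""
    return next((r for r in rows if r["id"] == record_id), None)


summary = overview(records)                                      # whole dataset
south_high = zoom_and_filter(records, "South", min_revenue=200.0)  # suspicious subset
detail = details_on_demand(records, south_high[0]["id"])         # one record in full
```

The overview reveals that the maximum in the "South" group is far above the others, the filter isolates the records responsible, and the last call retrieves the complete record for closer inspection.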

At the same time, the Visual Data Exploration process gives the data scientist the opportunity to combine automated analysis methods with interactive visual representations. The Information Seeking Mantra can therefore be extended as follows:

"Analyze first - show the important - zoom, filter and analyze further - details on demand".

With larger amounts of data, often referred to as Big Data, this becomes harder: the challenge of getting an overview of the data to be visualised without losing anything interesting grows. It is therefore necessary to define the question up front and to analyse the data according to its value of interest, so that the most important aspects of the data are shown first. At the same time, the data scientist should be able to carry out further analyses by quickly accessing any additional data needed.
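The "analyze first, show the important" part of the extended mantra can be sketched as follows. This is a minimal example under stated assumptions: the "value of interest" is approximated here by an absolute z-score, which is only one possible choice and is not prescribed by the mantra, and the sensor readings and function names are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical sensor readings, invented for illustration.
readings = {"s1": 20.1, "s2": 19.8, "s3": 20.4, "s4": 35.0, "s5": 20.0, "s6": 19.7}


def interest_scores(values):
    """Analyse first: score each item by its absolute z-score (an assumed measure)."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)
    if sigma == 0:
        # All values identical: nothing stands out.
        return {key: 0.0 for key in values}
    return {key: abs(v - mu) / sigma for key, v in values.items()}


def show_the_important(scores, top_k=2):
    """Show the important: keep only the top_k highest-scoring items."""
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


scores = interest_scores(readings)
highlights = show_the_important(scores, top_k=1)
```

Only the items in `highlights` would be shown initially; the remaining data stays available for the later "zoom, filter and analyze further" and "details on demand" steps.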

The role of tools in the visual data exploration process

Equally important in the context of visual data exploration are the visualisation tools. They help the data scientist put the data into an appropriate representation in order to understand it better. Software functions that support the steps mentioned above, such as "Filtering", "Zooming" or "Drill down", allow the data to be analysed more quickly.

With visual data exploration, you dive deep into the data. You move seamlessly from interesting insights to closer examination, or to removing data that is irrelevant. You filter the data to see it from different perspectives and arrive at new insights.

The state of mastery experienced in this process can be described as "Analytical Flow". The term "flow" was coined by the psychologist and happiness researcher Mihaly Csikszentmihalyi and describes the state of complete immersion and absorption in an activity. Well-designed visual analysis software should therefore be easy to use, should not fragment the user's attention and should not discourage complete immersion in the data.

Challenges in Visual Data Exploration

Since the Analytical Flow experience depends mainly on the software used, the major challenge in tool selection is usability. Using a range of different exploration tools that are not compatible with each other can also be an obstacle in the exploration process. The amount of data plays a significant role here as well.

Larger data volumes (Big Data) need special treatment. Difficulties already arise when navigating through Big Data. In addition, suitably adapted visualisation techniques are needed so that users can find their way around the data. The complexity of the data is a further challenge: when complex data is packed into a single visualisation, the result can be confusing and can distort visual perception.
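One common way to adapt a visualisation to large data volumes is to aggregate the points into grid cells before drawing, so that a heatmap of counts replaces an overplotted scatter plot. The sketch below illustrates this binning idea only; it is an assumed example technique that the article does not name, and the function name, cell size and randomly generated points are hypothetical.

```python
import random
from collections import Counter


def bin_points(points, cell_size):
    """Aggregate (x, y) points into square grid cells and count points per cell."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical data: 100,000 random points in a 100 x 100 area.
rng = random.Random(42)
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(100_000)]

# At most 10 x 10 = 100 cells need to be drawn instead of 100,000 points.
grid = bin_points(points, cell_size=10)
```

The cell size controls the trade-off: larger cells give a clearer overview, while smaller cells preserve more detail for later zooming.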

Tools for Visual Data Exploration

In summary, these problems come together today primarily as Big Data challenges. Visual data exploration is valuable here, as it makes it much easier to gain insight into the data and to derive knowledge from it.

Tools are the enablers of the methods described and often support visual data exploration through compatible software. Business intelligence tools such as QlikView, Tableau and others enable analytical flow. They are characterised by high usability, although they have certain weaknesses when dealing with Big Data.

In contrast, tools such as the open source Kibana or Datameer handle larger amounts of data very well. However, these tools are not very user-friendly and do not offer the data scientist an analytical flow.

At the moment, there is no optimal standard solution. In other words, this is an extremely exciting field that gives hope for further developments in the future.

<a href="https://www.alexanderthamm.com/en/blog/author/at-redaktion/" target="_self">[at] EDITORIAL</a>

Our AT editorial team consists of various employees who prepare the corresponding blog articles with the greatest care and to the best of their knowledge and belief. Our experts from the respective fields regularly provide you with current contributions from the data science and AI sector. We hope you enjoy reading.
