Why data science projects will benefit from the GDPR in the long term

6 March 2018 | Basics

Which company or organisation does not collect, collate and store data today? The new General Data Protection Regulation (GDPR) is therefore causing uncertainty among many companies. Does the GDPR really affect all companies, or only some of them?

While small and medium-sized enterprises in particular are afraid of the announced consequences, we at Alexander Thamm GmbH consider the entry into force of the GDPR to be good news in principle, because it requires every company to examine consciously how it handles data. First and foremost, this is an opportunity to make data handling more secure and to open up new applications.

In this blog post, we address the three questions that are currently central: What exactly is the GDPR? What are the consequences for data science projects? And to what extent are machine data or anonymised data, for example, also affected by the new regulation?

The new EU General Data Protection Regulation

Strictly speaking, the new EU General Data Protection Regulation (GDPR) came into force almost two years ago. The two-year transition period ends on 25 May 2018, and from that date the GDPR will actually be applied. The regulation is primarily concerned with the fundamental rights and freedoms of natural persons, in particular their right to the protection of personal data (Art. 1 GDPR).

As a result, companies and public administrations that are active within the EU and process data of Europeans will have to make adjustments in the future. The most important, concrete consequences that follow from the GDPR are:

  • Personal data must be protected against damage or loss.
  • Data must be monitored, and data leaks must be reported within 72 hours.
  • Every EU citizen has the right to a copy of the data that is stored about him or her. It must therefore be possible to find data quickly and pass it on (data portability).
  • Data minimisation: only data relevant to the pre-determined purpose may be kept - all other data must be deleted on a regular basis.
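The right to a copy of one's data implies that records can be found per person and handed over in a common format. The following is a minimal sketch of that idea in Python. The in-memory record list, the `subject_id` field and the choice of JSON as export format are illustrative assumptions, not requirements taken from the regulation.

```python
import json

# Hypothetical records; in practice they would be spread over several systems.
RECORDS = [
    {"subject_id": "u1", "source": "crm", "email": "anna@example.com"},
    {"subject_id": "u2", "source": "crm", "email": "ben@example.com"},
    {"subject_id": "u1", "source": "shop", "last_order": "2018-02-14"},
]


def export_subject_data(records, subject_id):
    """Collect every record stored about one person and return it as JSON."""
    matches = [r for r in records if r["subject_id"] == subject_id]
    return json.dumps({"subject_id": subject_id, "records": matches}, indent=2)
```

Calling `export_subject_data(RECORDS, "u1")` returns both the CRM and the shop record for that person. A real export would first have to locate the data across all systems that hold it.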

The GDPR provides for drastic fines, which is why many companies fear it. In any case, these fines should be "effective, proportionate and dissuasive" (cf. Art. 83 GDPR). Depending on how a violation of the GDPR is classified, fines of up to 20 million euros (or 4 % of global turnover) are possible for companies.
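For orientation, the upper limit of the higher fine tier can be worked out with simple arithmetic. The sketch below assumes the rule in Art. 83(5) GDPR that, for companies, the higher of the two amounts applies. The function name is hypothetical, and the result is only a ceiling, not a prediction of an actual fine.

```python
def max_fine_upper_tier(annual_global_turnover_eur: float) -> float:
    """Upper limit of a fine in the higher tier (assumption: Art. 83(5) GDPR)."""
    fixed_cap = 20_000_000  # 20 million euros
    turnover_cap = annual_global_turnover_eur * 4 / 100  # 4 % of turnover
    # Assumption: for companies, whichever amount is higher is the ceiling.
    return max(fixed_cap, turnover_cap)
```

With a global turnover of 1 billion euros, 4 % amounts to 40 million euros and exceeds the fixed amount. With a turnover of 100 million euros, 4 % is only 4 million euros, so the 20 million euro limit is the relevant ceiling.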

The GDPR and data science projects with personal data

For data science projects that work with personal data, the GDPR means that algorithms will in future have to be partially disclosed, at least to the extent needed to fulfil citizens' rights to information. Explicitly, it is not the source code of an algorithm that must be disclosed, but the individual factors that are taken into account, for example when granting a bank loan, and that influence the result. These factors must also remain comprehensible at a later date. This applies to all areas of application that involve profiling, scoring and screening. Data lakes set up for such projects must also meet the new requirements.
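One way to keep the factors behind a decision comprehensible is to store each factor's contribution together with the result. The following Python sketch shows this for a deliberately simple linear score. The weights, factor names and threshold are invented for illustration, and real scoring models are usually more complex.

```python
from datetime import datetime, timezone

# Hypothetical weights of a simple linear credit score (illustrative only).
WEIGHTS = {"income_keur": 0.5, "years_employed": 2.0, "open_loans": -5.0}
THRESHOLD = 30.0


def score_with_explanation(applicant: dict) -> dict:
    """Return the decision together with each factor's contribution."""
    contributions = {
        name: weight * applicant[name] for name, weight in WEIGHTS.items()
    }
    total = sum(contributions.values())
    return {
        "decided_at": datetime.now(timezone.utc).isoformat(),
        "score": total,
        "approved": total >= THRESHOLD,
        "contributions": contributions,  # which factor moved the result, and how
    }
```

Storing the returned dictionary alongside the decision means the influencing factors can still be reported later without revealing the source code itself.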

A special feature is the rule on merging data from different sources. From the perspective of data protection law, this process is treated as a new data collection. Combining data can produce new information or significantly change the information content of a data set about a person. This is why, in future, this process will be a category of personal data processing that requires authorisation whenever the data is to be used for another purpose. If data once collected is to be used for a different analysis purpose, a company must explicitly obtain the permission of the persons concerned. For the same reason, collecting or retaining personal data for no specific purpose is generally not permitted.
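In a data pipeline, this purpose limitation can be enforced by checking, before any reuse, whether each person has agreed to the new purpose. The sketch below assumes a hypothetical consent registry that maps subject ids to permitted purposes. The names and structure are illustrative, and the legal assessment of a concrete case is not something code can replace.

```python
# Hypothetical consent registry: subject id -> purposes the person agreed to.
CONSENTS = {
    "u1": {"billing", "churn_analysis"},
    "u2": {"billing"},
}


def records_usable_for(records, purpose, consents=CONSENTS):
    """Keep only records whose subject has agreed to the given purpose."""
    return [
        r for r in records
        if purpose in consents.get(r["subject_id"], set())
    ]
```

Records of people without an entry in the registry are dropped by default, so a missing permission never silently counts as agreement.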

The consequences of the GDPR for data science projects with machine data

Anonymised data is not affected by the GDPR in the first instance, as the regulation explicitly covers only personal data. At the same time, a grey area emerges here, because the regulation does not use terms such as "Big Data", "Data Intelligence", "Advanced Analytics", "Data mining" or "Text mining". In the definitions of the regulation, procedures for analysis and forecasting are grouped under the somewhat vague term "profiling". The respective practice in collecting and processing data must therefore be considered and evaluated case by case.

Machine data collected, for example, in the context of Industry 4.0 or the Internet of Things is not affected by the GDPR in the first instance. However, there are borderline cases, such as the recording of location data. Such data can be traced back to natural persons by linking it to identification numbers that identify the machine operators. Machine data that can be related to a person in this way falls under the provisions of the GDPR.
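A first technical screening for such borderline cases could flag machine records that carry a field linking them to a person. The Python sketch below assumes a hypothetical set of field names. It only checks for non-empty values in those fields and cannot decide on its own whether a record is personal data in the legal sense.

```python
# Assumed field names that tie a machine record to a person (hypothetical).
PERSON_LINKING_FIELDS = {"operator_id", "badge_number", "driver_id"}


def has_personal_reference(record: dict) -> bool:
    """True if the record has a non-empty field that identifies a person."""
    return any(record.get(field) for field in PERSON_LINKING_FIELDS)


def split_by_personal_reference(records):
    """Separate records that need GDPR treatment from purely technical ones."""
    personal = [r for r in records if has_personal_reference(r)]
    technical = [r for r in records if not has_personal_reference(r)]
    return personal, technical
```

A location record that includes an `operator_id` would land in the first group, while a pure temperature reading would land in the second.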

There are other borderline cases with machine data: it can also be used, for example, to prove how personal data has been handled. What data was used or deleted, how and when? In this respect, machine data is also an important tool for meeting the requirements of the GDPR.

In addition, the insights gained from machine data can help to detect security breaches quickly, investigate them and assess their scope. Last but not least, evaluating machine data helps companies to meet their obligation to report a data leak in good time.
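The 72-hour reporting period can be tracked with a small helper. The sketch below assumes that the clock starts at the moment the breach is detected. The function names are hypothetical, and timestamps are expected to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

REPORTING_WINDOW = timedelta(hours=72)


def reporting_deadline(detected_at: datetime) -> datetime:
    """Latest point in time at which the data leak must be reported."""
    return detected_at + REPORTING_WINDOW


def hours_remaining(detected_at: datetime, now: datetime) -> float:
    """Hours left until the deadline; negative if it has already passed."""
    return (reporting_deadline(detected_at) - now).total_seconds() / 3600
```

For a breach detected on 28 May 2018 at 09:00 UTC, `reporting_deadline` yields 31 May 2018 at 09:00 UTC.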

GDPR as a compliance risk and data governance task

The new requirements of the GDPR bring increased compliance risks. Meeting them requires some preparation and adjustment. The goal must be to be able to process data quickly and use it efficiently. If damage occurs, companies are obliged to prove that they are not responsible in any way for the cause.

This can be achieved, for example, by documenting all security-relevant actions. Here, too, machine data provides the necessary information. It can serve as proof to authorities that appropriate security precautions were taken and that they were used to minimise the risk.
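Such documentation can be as simple as an append-only log in which every action on a data set is recorded with a timestamp. The following Python sketch illustrates the idea. The entry fields and the in-memory list are assumptions for illustration, and a production system would write to tamper-resistant storage instead.

```python
import json
from datetime import datetime, timezone


def log_action(log: list, actor: str, action: str, dataset: str) -> dict:
    """Append one security-relevant action to the log and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,    # e.g. "read", "export", "delete"
        "dataset": dataset,
    }
    # Serialise immediately so later changes to `entry` cannot alter the log.
    log.append(json.dumps(entry, sort_keys=True))
    return entry
```

Each call adds one line that answers who did what to which data and when, which are the questions an authority is likely to ask.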

All companies that collect and process personal data in one form or another will inevitably have to deal intensively in the future with the topics of data governance and data custodianship, that is, with how responsibilities for data are distributed within the company.

Acute need for action, but no reason to panic

Due to the new GDPR, almost all companies and organisations are forced to examine what customer-related data they hold and to ask themselves what they use it for. In future, the handling of personal data must be documented accordingly.

According to a recent Bitkom study, there is still an enormous need for action here. Only 13 percent of companies have started to address the new requirements of the GDPR or state that they have already completed this process.

The Bitkom study also states that every third company uses personal data to improve its own processes. For 42 percent of the companies, even the entire business model is based on personal data.

The goal of these companies must be to meet the increased requirements of the GDPR as quickly and sustainably as possible. This is precisely why the GDPR represents a great opportunity for companies and administrations: they have to deal intensively with their data. This is an ideal opportunity to gain confidence in dealing with data and to start their data journey.


Michaela Tiedemann

Michaela Tiedemann has been part of the Alexander Thamm GmbH team since the early start-up days. She has actively shaped the development from a fast-moving, spontaneous start-up to a successful company. When she started her own family, a whole new chapter began for Michaela Tiedemann. Giving up her job, however, was out of the question for the new mother. Instead, she developed a strategy to reconcile her job as Chief Marketing Officer with her role as a mother.