5 reasons why data science projects hardly generate any added value from data

by | 19 March 2019 | Basics

Data permeates almost every business process today. In recent years, companies have increasingly piloted data products in data science projects. Nevertheless, in many cases it is not possible to bring them into productive use and thus to use data profitably in the long term. What exactly is the difficulty when it comes to generating added value from data?

The reasons are manifold, and in many cases there is more than one cause of failure. Problems often start with the underlying data, extend to the skills and know-how of the staff and finally lead to technological difficulties. We have taken a closer look at the 5 most common reasons why data science projects fail to generate added value from data.

1) The data itself.

When it comes to initiating data projects in very concrete terms, ensuring high data quality is one of the central keys to project success. A rule of thumb among data experts is that 60-80 per cent of the time in a data science project must be spent on preparing the raw data. Before any data analysis can take place, the data from the source systems must first be cleaned, enriched and pre-processed.
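The three preparation steps named above (cleaning, enriching, pre-processing) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records in `RAW_ROWS`, the field names and the three helper functions are hypothetical and not taken from a real project.

```python
from datetime import date

# Hypothetical raw records, as they might arrive in an export from a source system.
RAW_ROWS = [
    {"customer_id": "17", "signup": "2018-03-01", "revenue": "120.50"},
    {"customer_id": "17", "signup": "2018-03-01", "revenue": "120.50"},  # exact duplicate
    {"customer_id": "23", "signup": "", "revenue": "n/a"},               # unusable values
    {"customer_id": "42", "signup": "2017-11-15", "revenue": "80"},
]


def clean(rows):
    """Drop exact duplicates and rows whose values cannot be parsed."""
    seen, result = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        try:
            result.append({
                "customer_id": int(row["customer_id"]),
                "signup": date.fromisoformat(row["signup"]),
                "revenue": float(row["revenue"]),
            })
        except ValueError:
            # Assumption: rejected rows are simply skipped here;
            # a real pipeline would log and count them.
            continue
    return result


def enrich(rows):
    """Add a derived attribute (here: the signup year)."""
    return [{**row, "signup_year": row["signup"].year} for row in rows]


def preprocess(rows):
    """Scale revenue to the range [0, 1] (min-max scaling)."""
    if not rows:
        return []
    revenues = [row["revenue"] for row in rows]
    low, high = min(revenues), max(revenues)
    span = (high - low) or 1.0  # avoid division by zero if all values are equal
    return [{**row, "revenue_scaled": (row["revenue"] - low) / span} for row in rows]


prepared = preprocess(enrich(clean(RAW_ROWS)))
```

Of the four hypothetical input rows, only two survive the cleaning step. Even in this toy example, most of the code deals with preparation rather than analysis.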

Within the framework of a prototype project, these manual processing steps can often conceal deeper-lying data quality issues. In productive operation, resolving those issues is often associated with disproportionately high effort. Even if data is available in high quality, there may be difficulties with access rights, ownership rights or data sovereignty. Not infrequently, this involves lengthy and sometimes highly formalised approval processes. Sometimes this only means a delay in the course of a data science project. In some cases, however, it can also lead to the failure of a project.

Link tip: In our blog article about data quality, we have compiled the 5 most important measures.

A lack of data quality often indicates that, in traditional enterprises, data science did not play a role at the time when the data generation processes themselves were designed and set up. Often, data was originally collected for purposes other than creating value through data analytics.

2) Lack of experience with data-driven products and processes

The second reason that we encounter again and again is the omnipresent uncertainty about data protection and data security, which has arisen, among other things, but not exclusively, in the wake of the GDPR. The catch with the new regulations is that most of them can be ignored at first, or temporarily disregarded, until the proof of concept. For example, data access difficulties are circumvented by working with extracts from source systems. This strategy is perfectly fine if a plan for automated access exists. However, the approach can lead to serious problems during go-live if this plan does not exist.

Link tip: In our whitepaper on the new General Data Protection Regulation, we have dealt with the most important aspects of the GDPR.

3) The human factor

As in most other areas of business life, data science projects are not primarily about data, algorithms or technologies, but about people. We have seen promising proofs of concept that stagnated only because they did not receive the necessary attention and support from the management level. Part of the problem is that many organisations do not have a clear strategic direction for their data science activities (data strategy).

This makes it difficult for data science teams to align their projects with the overall strategy of the company and its key value drivers. Furthermore, the understanding of data science topics is still in its infancy in many traditional companies. In this respect, it is hardly surprising that, in the midst of the current hype about artificial intelligence and machine learning, it is sometimes difficult to keep an eye on the essentials.

Besides a basic understanding of and an appreciation for data science, experts with complementary skills are required above all. Multidisciplinary teams are an important prerequisite for successfully carrying out data science projects. A particularly important role for the development of successful data products is that of the data engineer. So far, far too little attention has been paid to this role's combination of data and software development know-how.

4) Organisational hurdles

The digital transformation has profound implications for the way traditional organisations operate today. In particular, the role of IT departments is changing fundamentally. Many companies are only just discovering what this transformation process means for their organisational structure, business processes and cooperation. Driven by the popular narrative about the need for "two-speed IT", the last five years have seen a proliferation of innovation and data science labs across all industries. The underlying idea was to create a separate infrastructure for fast, agile digital projects and innovations.

However, many companies found that this structure was not conducive to success: the separation between "traditional" IT and the innovation lab caused enormous friction. As a result, in many cases the use of data science could not be anchored deeply enough in the business processes because of the organisational gap. Companies must therefore build bridges between business departments, data science and IT in order to be able to create added value from data in the long term.

5) Technological limits

First of all, let's be clear: Technology itself is never the root cause of a problem. The rapid technological development around Big Data and machine learning, however, can be quite challenging. With the plethora of tools available - ranging from those for data infrastructure to open-source machine learning software and purpose-specific enterprise applications - any data task can be solved.

[Figure: Big Data Landscape 2018] The Big Data Landscape for 2018 shows how extensive the range of solutions has become in the meantime. (Source: Matt Turck)

However, the speed of development leads to two key challenges:

  1. Traditionally, many IT departments have worked with a proven set of standard technologies. Since the pace of innovation in the open source sector has now increased rapidly, IT has to adapt faster than ever before.
  2. There is also often a discrepancy between the technologies used in the development phase of a data science project and those used in the permanent go-live phase. Data scientists appreciate scripting languages like Python or R. But when it comes to speed, performance, large data volumes and stability, compiled programming languages such as C++ or Java are preferred. While the technology landscape naturally also offers answers to this challenge, standard procedures for the technical transfer of data science prototypes into operation still need to be established.
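One conceivable way to narrow the gap between prototype and go-live is to hand over the prototype's results in a language-neutral form. The following is a minimal sketch under stated assumptions, not an established standard procedure: the model in `PROTOTYPE_MODEL`, its feature names and the functions `export_model` and `score` are hypothetical.

```python
import json

# Hypothetical result of a prototype: parameters of a small linear model.
PROTOTYPE_MODEL = {
    "version": "0.1",
    "features": ["age", "visits"],
    "weights": [0.5, 2.0],
    "intercept": 1.0,
}


def export_model(model):
    """Serialise the fitted parameters to a language-neutral JSON string."""
    return json.dumps(model, sort_keys=True)


def score(model_json, record):
    """Reference scoring that a production service would have to reproduce."""
    model = json.loads(model_json)
    missing = [name for name in model["features"] if name not in record]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return model["intercept"] + sum(
        weight * record[name]
        for name, weight in zip(model["features"], model["weights"])
    )


artifact = export_model(PROTOTYPE_MODEL)
reference_score = score(artifact, {"age": 30, "visits": 4})
```

A production service written in Java or C++ could read the same JSON artifact, and the reference score from the prototype could serve as a test case to check that both implementations agree.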

Creating added value from data

These are the 5 main reasons why it is difficult to generate added value from data in many data science projects. However, this also gives companies and organisations 5 starting points to create the conditions for the success of data projects:

  1. Data governance: Processes and responsibilities to ensure data quality, clarity about data access and legal certainty about its use facilitate both prototype development and commissioning
  2. Data Skills: The combination of fundamental data understanding throughout the organisation and interdisciplinary teams of experts allows data science to be deeply embedded in business processes
  3. Data Roles: Data strategists and data product managers, in addition to data scientists and engineers, make decisive contributions to the success of data science projects
  4. Data-driven Company: The removal of organisational separation and a common data strategy facilitate cooperation between specialist departments, data science teams and the IT department
  5. Data Lake: A central, universally accessible point of access for data and a modern IT architecture can be the difference between a promising proof of concept and a productive data product

Through targeted measures along these 5 dimensions, companies not only create the foundations for individual data science projects. They also prepare themselves structurally for the dawning data age and ensure that they will be able to generate added value from data in the future.

Would you like to learn more about the successful implementation of data products? Download our free white paper "7 Best Practices for Deploying your Data Products".

<a href="https://www.alexanderthamm.com/en/blog/author/at-redaktion/" target="_self">[at] EDITORIAL</a>

Our AT editorial team consists of various employees who prepare the corresponding blog articles with the greatest care and to the best of their knowledge and belief. Our experts from the respective fields regularly provide you with current contributions from the data science and AI sector. We hope you enjoy reading.
