The data catalogue - the basis for data-driven use cases

14 August 2018 | Basics

When it comes to the concrete implementation of data-driven projects, the question of the existence, completeness and availability of all relevant data inevitably arises. To conceptualise data-driven use cases in a targeted manner, all employees involved in a project need an overview of the data: from all existing data sources, through their respective origins, to the responsible contact persons. For a productive implementation of data-driven use cases, your employees must be able to retrieve information about content-related and descriptive data quality, utilisation history and access rights at any time. The data catalogue (also written "data catalog") is the solution concept for these challenges.

What is a data catalogue?

A data catalogue (also "data catalog" or "data dictionary") is a centralised information register for data holdings of different sizes. This is why it is sometimes colloquially referred to as a "data directory". This directory lists all relevant information on the existing data and data sources, which makes a data catalogue one of the most important tools for managing, checking and locating data for further processing.

Although data catalogues differ considerably in their details, the overall goal is always the same: employees should be informed about the existence, physical location, access rights and utilisation history of the data sources, as well as their quality and content.
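To make this more concrete, the following is a minimal sketch of what a single catalogue record could look like in Python. It is not taken from any particular catalogue product: the class `CatalogueEntry`, the helper `find_entries`, all field names and the example values are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CatalogueEntry:
    """One record in a hypothetical data catalogue; every field name is illustrative."""

    name: str
    location: str      # physical location, e.g. a table or file path
    origin: str        # source system the data comes from
    contact: str       # responsible contact person
    description: str   # what the data contains
    quality_notes: str = ""  # known quality issues
    access_roles: set[str] = field(default_factory=set)  # roles allowed to read
    usage_history: list[tuple[date, str]] = field(default_factory=list)

    def log_usage(self, when: date, project: str) -> None:
        """Record that a project used this data source on a given day."""
        self.usage_history.append((when, project))

    def can_access(self, role: str) -> bool:
        """Return True if the given role is allowed to read the data source."""
        return role in self.access_roles


def find_entries(catalogue: list[CatalogueEntry], keyword: str) -> list[CatalogueEntry]:
    """Return all entries whose name or description mentions the keyword."""
    needle = keyword.lower()
    return [
        entry for entry in catalogue
        if needle in entry.name.lower() or needle in entry.description.lower()
    ]


# Invented example data, for illustration only.
catalogue = [
    CatalogueEntry(
        name="crm_customers",
        location="warehouse.sales.customers",
        origin="CRM system",
        contact="Jane Doe",
        description="Master data for all active customers",
        quality_notes="Postcode missing in some rows",
        access_roles={"analyst", "sales"},
    ),
]
catalogue[0].log_usage(date(2018, 8, 14), "churn-analysis")
```

With a structure like this, `find_entries(catalogue, "customer")` locates the data source, while `can_access` and `usage_history` answer the questions about access rights and utilisation history. A real catalogue would keep these records in a shared system rather than in an in-memory list.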

Link Tip: A data catalogue is related to a data lake. Read our blog article on everything companies need to know about the data lake.

The advantages of a data catalogue

  • Save time
  • Knowledge is made available
  • Responsibilities are clarified

A data catalogue acts as a productivity catalyst. It fulfils an important function in both the conception and the implementation of data-driven use cases. Descriptive metadata saves staff valuable time in understanding and organising data sources. Conversely, unstructured data and incomplete, incorrect or ambiguous data sets make the work before and during an analysis project much more difficult.

Visualisation of a data set with different variants for the attribute "unknown".
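The problem shown in the figure, one attribute value spelled in many different ways, can be sketched in a few lines of Python. The set `UNKNOWN_VARIANTS` and the functions `normalise_unknown` and `unknown_spellings` are hypothetical names, and the list of spellings is an assumption; in practice it would be derived from the data source itself and documented in the catalogue.

```python
from collections import Counter

# Assumed spellings of "unknown"; a real list would come from profiling the data.
UNKNOWN_VARIANTS = {"", "unknown", "unk", "n/a", "na", "none", "null", "?", "-"}


def normalise_unknown(value, placeholder=None):
    """Map every known spelling of 'unknown' to a single placeholder."""
    if value is None:
        return placeholder
    text = str(value).strip().lower()
    return placeholder if text in UNKNOWN_VARIANTS else value


def unknown_spellings(values):
    """Count each raw spelling that is treated as 'unknown'."""
    return Counter(v for v in values if normalise_unknown(v) is None)


# Invented example column, for illustration only.
raw = ["Munich", "unknown", "N/A", "", None, "Berlin", "n/a", "?"]
cleaned = [normalise_unknown(v) for v in raw]
```

Here `cleaned` keeps "Munich" and "Berlin" and replaces the six other values with `None`. The counts returned by `unknown_spellings(raw)` are the kind of descriptive metadata that could be recorded in a catalogue entry so that analysts do not have to rediscover these variants in every project.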

Creating a data catalogue also adds value for other reasons. On the one hand, all data is reliably recorded, including data that was previously inaccessible, so knowledge is made available. On the other hand, all data becomes more easily accessible, which opens up numerous further use cases for the existing data sources. Last but not least, a data catalogue drastically reduces the time often spent searching for data sources, which is an important economic argument in its favour.

Link Tip: Metadata is only one component of a strategy that leads to optimal data quality. That's why we've summarised all the relevant components in this blog post.

The data catalogue and sustainable knowledge management

A data catalogue can be designed as a collaborative information platform. This means that all employees are given the opportunity to enter the knowledge they have gained into the system. In this way, the knowledge is preserved for the organisation in the long term, and the resulting productivity gain further motivates your employees. For this to succeed, the responsibilities for handling data must be clearly defined. The basis for a successful transition to a data-driven organisation is data governance.

Filling the data catalogue in 3 steps

At Alexander Thamm GmbH, we treat setting up a data catalogue as a standard process. We have successfully carried out this process in our more than 500 data-driven use cases. Filling a data catalogue always takes place in three essential steps:

  1. Advice on the selection of suitable software environments or the provision of a template for operating a data catalogue.
  2. The initial filling of your data catalogue in cooperation with your employees.
  3. Training your staff in the administration and use of the data catalogue.

Only through the interaction of all three steps does the data catalogue become a sustainable instrument that lays the foundations for data-driven use cases.


Michaela Tiedemann

Michaela Tiedemann has been part of the Alexander Thamm GmbH team since the early start-up days. She has actively shaped its development from a fast-moving, spontaneous start-up into a successful company. When she started a family of her own, a whole new chapter began for Michaela Tiedemann. Giving up her job, however, was out of the question for the new mother. Instead, she developed a strategy to reconcile her job as Chief Marketing Officer with her role as a mother.
