Project: Metadata Management in Complex Enterprise Data Landscapes

Rebecca Eichler, M.Sc.
Project status: finished

Problem Statement

Companies today generate enormous amounts of data, for instance through Internet of Things (IoT) and Industry 4.0 initiatives. This data holds potential value that may lead to new insights, new business models, or expansion into new markets. That value can only be exploited, however, if the company’s employees can find, access, and use the data for their respective use cases. Yet it has been reported that up to two thirds of enterprise data remains unused. Data democratization initiatives, which aim to empower and motivate employees to find, understand, access, use, and share data across the company, are therefore gaining importance.

Enterprise data marketplaces have been proposed to drive democratization aspects such as data sharing across the company. In general, data marketplaces are metadata-driven self-service platforms for trading data and data-related services. The enterprise data marketplace is specifically designed to facilitate the exchange of data and data-related services among employees within a company. Most research, however, focuses on marketplaces between institutions, i.e., external marketplaces, rather than on the enterprise (internal) data marketplace. This research project therefore aims to identify the characteristics and challenges of the enterprise data marketplace and to propose marketplace concepts that address these challenges and leverage existing infrastructure, such as existing metadata management tools that already provide functionality and metadata required in the marketplace.

Solutions

To this end, we examined how data consumers currently find, understand, and access data in companies without a marketplace, and which challenges arise in these processes. We then reviewed how a marketplace addresses these data consumer challenges. Similarly, we examined the data providers’ processes for sharing data, the challenges these processes pose, and the role the marketplace could play in them. In addition, characteristics and requirements for the internal data marketplace were identified. Further results include a functionality framework, a platform architecture, and an integration architecture that shows how the marketplace can be embedded in the existing enterprise system landscape. The focus was on metadata management within the marketplace, i.e., which metadata already exists in the company and can be leveraged in the marketplace, and which tools provide which metadata. This also includes how metadata can be exchanged between tools and how the marketplace can provide an integrated view of the existing metadata to give data consumers a comprehensive understanding of the data.
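The idea of an integrated view over metadata held in several tools can be illustrated with a minimal sketch. It is not the architecture from the publications; the tool names, record fields, and the "first tool wins" conflict rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetView:
    """Integrated view of one dataset, assembled from several tools."""
    dataset_id: str
    attributes: dict = field(default_factory=dict)  # merged metadata
    sources: dict = field(default_factory=dict)     # attribute -> providing tool


def integrate(records_by_tool):
    """Merge per-tool metadata records into one view per dataset.

    records_by_tool maps a tool name to a list of dicts, each with a
    'dataset_id' key plus arbitrary metadata attributes. If two tools
    supply the same attribute, the first tool seen wins (an assumed rule).
    """
    views = {}
    for tool, records in records_by_tool.items():
        for record in records:
            dataset_id = record["dataset_id"]
            view = views.setdefault(dataset_id, DatasetView(dataset_id))
            for key, value in record.items():
                if key == "dataset_id" or key in view.attributes:
                    continue
                view.attributes[key] = value
                view.sources[key] = tool  # remember which tool provided it
    return views


# Hypothetical tools and records, for illustration only.
catalog = [{"dataset_id": "sales_2021", "description": "Monthly sales", "owner": "team-a"}]
quality = [{"dataset_id": "sales_2021", "completeness": 0.97, "owner": "team-b"}]
views = integrate({"data_catalog": catalog, "quality_tool": quality})
```

Recording the providing tool per attribute keeps the merged view traceable back to the system that owns each piece of metadata.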

Before the data marketplace became the focus of this research project, research was conducted on metadata management in the context of data lakes, as an extension of the project “Data Lake Architecture”. More specifically, it addressed how metadata can be modeled flexibly to support a variety of use cases in the data lake.

Key Publications

Context Data Marketplace:

  • Eichler, R., Gröger, C., Hoos, E., Stach, C., Schwarz, H.: “Establishing the Enterprise Data Marketplace: Characteristics, Architecture, and Challenges.” Presented at the Workshop on Data Science for Data Marketplaces in conjunction with the 48th International Conference on Very Large Data Bases (DSDM@VLDB 2022).
  • Eichler, R., Gröger, C., Hoos, E., Schwarz, H., Mitschang, B.: “From Data Asset to Data Product – The Role of the Data Provider in the Enterprise Data Marketplace.” In: Proceedings of the 16th Symposium and Summer School on Service-Oriented Computing (SummerSOC 2022). https://doi.org/10.1007/978-3-031-18304-1_7
  • Eichler, R., Gröger, C., Hoos, E., Schwarz, H., Mitschang, B.: “Data Shopping — How an Enterprise Data Marketplace Supports Data Democratization in Companies.” In: Proceedings of the 34th International Conference on Advanced Information Systems Engineering (CAiSE 2022). https://doi.org/10.1007/978-3-031-07481-3_3
  • Eichler, R., Giebler, C., Gröger, C., Hoos, E., Schwarz, H., Mitschang, B.: “Enterprise-Wide Metadata Management – An Industry Case on the Current State and Challenges.” In: Proceedings of the 24th International Conference on Business Information Systems (BIS 2021). https://doi.org/10.52825/bis.v1i.47

Context Metadata Management in Data Lakes:

  • Eichler, R., Giebler, C., Gröger, C., Schwarz, H., Mitschang, B.: “Modeling metadata in data lakes—A generic model.” Data & Knowledge Engineering (2021). https://doi.org/10.1016/j.datak.2021.101931
  • Eichler, R., Giebler, C., Gröger, C., Schwarz, H., Mitschang, B.: “HANDLE – A Generic Metadata Model for Data Lakes.” In: Proceedings of the 22nd International Conference on Big Data Analytics and Knowledge Discovery (DaWaK 2020). https://doi.org/10.1007/978-3-030-59065-9_7