Project: Solution Method for Distributed Data Quality Management

Enterprises are grappling with ever larger volumes of data, which fuel transformative technologies such as machine learning and data-driven products. Amid this surge, ensuring data quality has become paramount: data arrives from myriad sources, in varying structures, and at high speed. Maintaining data quality is essential for unlocking the potential of data-driven technologies and data-intensive business models, particularly in distributed environments, where data suppliers and consumers operate independently from the data provider.

Project: Efficient Data Generation for Simulations

Numerical simulations in engineering and science are often costly and time-consuming. Machine learning-based surrogate models offer an alternative, but they typically require large training datasets. This project aims to develop adaptive models that can be repurposed for similar tasks, such as new PDE problems, using minimal data. By leveraging sparse data and techniques such as transfer learning, meta-learning, and few-shot learning, the project aims to improve the efficiency and accuracy of these models, reducing the need for large datasets and enabling faster assessments.

Project: Metadata Management in Complex Enterprise Data Landscapes

While there are many concepts, techniques, and tools for metadata management, most focus on individual aspects, e.g., metadata management with semantic technologies. There is no common understanding of what comprehensive metadata management in an enterprise entails or how it can be implemented. The goal of this project is to design concepts and techniques for comprehensive metadata management across the entire enterprise data landscape.

Project: Data Lake Architecture

As enterprises shift their business to be data-driven and incorporate initiatives such as Industry 4.0, data lakes become increasingly popular as data management platforms for heterogeneous data. However, at the time of this project, data lakes were a new and thus immature concept, with various conflicting definitions and only high-level considerations regarding their realization. The guiding question of this research project is therefore: How can a data lake be set up and realized to support the needs of an enterprise?
