Submission accepted for SIGMOD 2023

The demo paper "Demonstrating MATE and COCOA for Data Discovery" by Jannis Becktepe, Mahdi Esmailoghli, Maximilian Koch, and Ziawasch Abedjan has been accepted for publication at the SIGMOD 2023 conference.

Abstract: One of the common use cases for data discovery is to enrich a given table with additional columns from related tables inside a data lake. We have recently introduced MATE and COCOA, two systems for joinability discovery and correlation calculation, respectively. By leveraging two novel index structures, a hash-based Super Key Index, and an Order Index, our system is capable of efficiently identifying tables that join on multiple columns and contain relevant features. We show how the data exploration and enrichment process benefits from our index structures by demonstrating MACO, a unified system on top of open web and large table corpora.

Links to the demonstration video and Google Colab:

Video

Google Colab

Written by Ziawasch Abedjan