Writing a Thesis in the Database and Information Systems Group

Theses supervised in the DBS group offer students challenging research problems and enable them to collaborate with national and international partners on real-world topics of scientific and economic relevance. Owing to the group's international orientation, theses are often written in English, but they may also be written in German.

Typical prerequisites are:

  • Excellent programming skills in a programming language
  • Advanced knowledge of database systems such as PostgreSQL, IBM DB2, or Oracle, or of big data analytics systems such as Hadoop, Flink, and Spark


If you are interested in doing your thesis with us, read the following instructions carefully: Thesis Process


Open Theses


No theses are currently available.


Ongoing Theses

  • Large Language Model-Driven Data Preparation Master (M.Sc.)
    • Haifa Lakhdher
  • Evaluation of alternative error weightings based on various established error detection methods (B.Sc.)
    • Pascal Luzina
  • LLM-Driven Data Enrichment (M.Sc.)
    • Christoph Schnell
  • Fuzzy Evaluation Metrics for Error Correction (B.Sc.)
    • Jenny Wu

Finished Theses


  • Self-supervised Data Cleaning in Data Lakes (M.Sc.)
    • Sebastian Eggers
  • Indexing Large Vectors (B.Sc.)
  • Ordering datasets for efficient or effective error detection (B.Sc.)
    • Xinyue Gong
  • Error Detection in Data Lakes using Raha and Table Clustering (B.Sc.)
    • Youssef Kanoun
  • Parallelizing Raha and Baran using Dask (B.Sc.)
    • Yusuf Mandirali
  • Distributed Discovery of Functional Dependencies (B.Sc.)
    • Torben Eims
  • Quality-Driven Union Table Search (M.Sc.)
    • Mehdi Alijani
  • Will pre-indexing improve deep code search? (B.Sc.)
  • Memory Usage Optimization in Baran (B.Sc.)
  • Bias analysis in large ML training datasets (B.Sc.)
    • Manish Bhatta Kapadi
  • Large Scale ML Data Analysis (B.Sc.)
    • Ghareeb Jawish
  • Maintenance of large inverted indexes (B.Sc.)
    • Ede Becker
  • A unified data representation for few-shot learning (M.Sc.)
    • Ilyes Farhat
  • Near-Duplicate Table Detection (B.Sc.)
  • Will Index-Based Pre-selection Enhance Deep Code Search Efficiency while Preserving Effectiveness? (B.Sc.)
    • André Warnecke
  • Maintenance of large inverted indexes (B.Sc.)
    • Nils Martel
  • Maintenance of large inverted indexes (B.Sc.)
  • Comparison of Coherent Grouping Algorithms for Writing Style Suggestion (B.Sc.)
  • Investigate improving deep code search using information retrieval techniques (B.Sc.)



  • Column Splitter with Record-Matching (B.Sc.)
  • Improving Baran using Embeddings (B.Sc.)
  • Extracting unbiased text from large text corpora (M.Sc.)
    • Christoph Becker
  • Effectively Sampling Validation Sets (B.Sc.)
    • Mohamed Mahdi Kanoun
  • ML Validation Set Mining (B.Sc.)
  • Validation Set Selection in Machine Learning (B.Sc.)
  • Detecting Duplicate Tables using Xash (B.Sc.)
  • Improving Primary Key Detection with Machine Learning (B.Sc.)
  • Investigating and improving variable names in data science projects with data mining (M.Sc.)
  • Deklarative sukzessive Halbierung (M.Sc.)
  • Performance Benchmarking of Database Management Systems (B.Sc.)
  • Improving Label Propagation in the Data Cleaning System Raha (B.Sc.)
  • Declarative successive halving (M.Sc.)
  • Embedding Data Transformations in AutoML (B.Sc.)
  • Opinion mining in social media data based on neural networks to predict bitcoin prices (B.Sc.)
    • Marc Speckmann
  • Analysing the Influence of social media Influencers on the Bitcoin Price (B.Sc.)
  • Analyzing the relation between social media and Bitcoin’s price variation (B.Sc.)
    • Ahmed Malek Ghanmi
  • From mining Naming Conventions in Data Science Projects to suggesting Variable Names (M.Sc.)
  • Scalable Error Detection (B.Sc.)
    • Faical Aridal
  • Feature Analysis for Agglomerative Clustering (B.Sc.)
    • Malte Fabian Kuhlmann
  • Detecting table headers in heterogeneous tables (B.Sc.)


  • Instrumentierung von Datenreinigung mit AutoML (B.Sc.)
  • Mining social media to discover the factors that affect bitcoin price (B.Sc.)
  • Multi-attribute join search with map-reduce (B.Sc.)
  • Efficient join discovery from large data lakes (B.Sc.)
    • Akram Chorfi
  • Multi-attribute join search with map-reduce (B.Sc.)
  • Interleaving Data Cleaning and AutoML (B.Sc.)