Big data analysis and biomedical research meet in our lab: We develop novel data mining algorithms for detecting patterns and statistical dependencies in large datasets from the life sciences.

Our work aims to contribute to two major scientific goals of the 21st century: enabling the automatic generation of new knowledge from big data through machine learning, and understanding the relationship between diseases and the molecular properties of patients, thereby enabling precision medicine.

Below you can find further information for some of our projects:


Machine Learning: Comparing Structured Data

We develop methods for comparing and classifying high-dimensional objects. One prominent example is graph kernels, i.e. efficiently computable similarity functions between graphs.
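To make the idea of a graph kernel concrete, here is a minimal sketch of one well-known family, a Weisfeiler-Lehman style subtree kernel. It is an illustration only and not necessarily the method used in our projects; the graph representation (an adjacency dict plus a label dict) and the function names `wl_features` and `wl_kernel` are assumptions made for this example.

```python
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Count node labels over successive Weisfeiler-Lehman relabeling rounds."""
    current = dict(labels)
    features = Counter(current.values())
    for _ in range(iterations):
        # A node's new label combines its own label with the sorted
        # labels of its neighbors from the previous round.
        current = {
            node: (current[node], tuple(sorted(current[n] for n in adjacency[node])))
            for node in adjacency
        }
        features.update(current.values())
    return features


def wl_kernel(graph_a, graph_b, iterations=2):
    """Kernel value: dot product of the two label-count vectors."""
    features_a = wl_features(*graph_a, iterations)
    features_b = wl_features(*graph_b, iterations)
    return sum(count * features_b[label] for label, count in features_a.items())


# Two hypothetical three-node molecules-as-graphs: (adjacency, node labels).
path = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "O", 2: "C"})
chain = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "C", 2: "O"})

similarity = wl_kernel(path, chain, iterations=1)
```

A larger kernel value means the two graphs share more labeled neighborhood structures. In practice the values are often normalized before being passed to a classifier such as a support vector machine.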


Machine Learning: High-​Dimensional Correlations

We develop methods for measuring statistical dependence between high-dimensional variables, two-sample tests that determine whether two samples were drawn from the same distribution, outlier detection algorithms that find "unusual" observations in a given dataset, and approaches that detect non-linear dependence between variables.
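As a simple illustration of what a two-sample test does, the sketch below runs a permutation test on the difference of sample means. It is a generic textbook procedure, not one of our own methods, and it only detects differences in the mean; the function name and its parameters are assumptions made for this example.

```python
import random
from statistics import fmean


def permutation_two_sample_test(x, y, n_permutations=1000, seed=0):
    """Estimate a p-value for the null hypothesis that x and y share a distribution."""
    rng = random.Random(seed)
    observed = abs(fmean(x) - fmean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group assignment is arbitrary,
        # so we reshuffle it and recompute the statistic.
        rng.shuffle(pooled)
        stat = abs(fmean(pooled[:len(x)]) - fmean(pooled[len(x):]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical measurements from two clearly separated groups.
p_value = permutation_two_sample_test(range(20), range(100, 120))
```

A small p-value suggests the two samples come from different distributions. Replacing the mean difference with a kernel-based statistic is one way to detect differences beyond the mean.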


Machine Learning: Significant Pattern Mining

We develop methods that discover statistically significant patterns in high-dimensional datasets while remaining runtime-efficient and statistically sound. Our algorithms can be applied to graphs or collections of sequences, and they make it possible to account for dependencies between objects, to control the family-wise error rate (FWER), and to correct for categorical covariates.
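The sketch below illustrates what controlling the family-wise error rate means when many patterns are tested at once. It tests each pattern for enrichment among cases with a one-sided Fisher's exact test and applies a plain Bonferroni correction. This is the simplest textbook correction, shown only as a baseline and not as our algorithm; the function names and the example counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: cases with the pattern, b: cases without it,
    c: controls with the pattern, d: controls without it.
    Returns the probability of seeing at least `a` cases with the pattern.
    """
    cases, with_pattern, total = a + b, a + c, a + b + c + d
    upper = min(cases, with_pattern)
    tail = sum(
        comb(with_pattern, k) * comb(total - with_pattern, cases - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, cases)


def bonferroni_significant(p_values, alpha=0.05):
    """Keep patterns whose p-value passes the Bonferroni-corrected threshold."""
    threshold = alpha / len(p_values)
    return {name: p for name, p in p_values.items() if p <= threshold}


# Hypothetical counts for two candidate patterns in 5 cases and 5 controls.
p_values = {
    "pattern_1": fisher_enrichment_p(5, 0, 0, 5),
    "pattern_2": fisher_enrichment_p(3, 2, 2, 3),
}
significant = bonferroni_significant(p_values)
```

Dividing the significance level by the number of tests becomes very conservative when the number of candidate patterns is huge, which is the main difficulty that dedicated significant pattern mining methods address.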


Computational Biology: Genome-​Wide Association Studies

We develop efficient multivariate approaches for the genome-​wide discovery of genetic loci that are associated with a phenotype, thereby trying to elucidate the multicausal basis of complex traits.


Computational Biology: Genome Annotation

We have developed methods for detecting genomic insertions and deletions using next-​generation sequencing, and thoroughly assessed the difficulty of comparing the performance of variant pathogenicity prediction tools.


Computational Biology: Molecular Graph Classification via Graph Kernels

We developed new, fast and scalable similarity measures on graphs, so-​called graph kernels. Their prime purpose is to compare molecular graphs or protein structures and to classify them into functional categories.


Personalized Medicine

We have coordinated several national and international networks on personalized medicine.
