Examples of integrative low-rank factorization layouts and their corresponding graph structures
Photo: Felix Held
Methods for finding structure in large-scale data

Felix Held’s PhD thesis develops methods for analysing data, primarily genetic data. The research area is unsupervised integrative methods, and the work places particular emphasis on scalability and flexibility.

Unsupervised methods are typically used to search for structure in data. The goal is to find groupings, reduce data dimensions or extract interpretable knowledge; clustering is a typical example. Unlike in regression analysis, for example, there is no target variable to guide the search. Instead, assumptions about the data are encoded in a model, whose parameters are then estimated from the data and used to gain insights.
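
To make the idea concrete, here is a minimal sketch of clustering, assuming k-means (Lloyd's algorithm) as a generic stand-in. It is not a method from the thesis, and the toy points and the function name `kmeans` are invented for illustration. The algorithm only ever sees the coordinates of the points, never a target variable.

```python
import random


def kmeans(points, k, n_iter=20, seed=0):
    """Group points into k clusters using only their coordinates (Lloyd's algorithm)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centre.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each centre moves to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical toy data: two visibly separated groups, no labels supplied.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels, centers = kmeans(points, k=2)
print(labels)
```

The model assumption here is simply that the data form compact groups around centres; the centres are the parameters that get adapted to the data.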

Felix Held
Photo: Setta Aspström

One part of the thesis extends existing models for integrative data analysis, that is, the joint analysis of several data sources at the same time. For example, it can be of interest to integrate gene expression data with data from drug perturbation experiments (the effect of medical drugs on gene expression) or with methylation data (where methylation controls which genes are turned on and off).
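
As a rough illustration of what joint analysis can mean, the sketch below fits one shared factor to two toy data matrices measured on the same genes. It is a textbook rank-one factorization computed by power iteration, not one of the models developed in the thesis; the function name `joint_rank1` and all numbers are hypothetical.

```python
import math


def joint_rank1(x1, x2, n_iter=100):
    """Fit x1 ~ u v1^T and x2 ~ u v2^T with a single shared row factor u.

    Uses power iteration on the column-wise concatenation [x1 | x2].
    Rows are shared units (e.g. genes); columns belong to each data source.
    """
    stacked = [r1 + r2 for r1, r2 in zip(x1, x2)]
    n_rows, n_cols = len(stacked), len(stacked[0])
    u = [1.0] * n_rows
    for _ in range(n_iter):
        # v = X^T u: loadings of every column on the current shared factor.
        v = [sum(stacked[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        # u = X v, rescaled to unit length.
        u = [sum(stacked[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm = math.sqrt(sum(a * a for a in u))
        u = [a / norm for a in u]
    v = [sum(stacked[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
    p1 = len(x1[0])
    return u, v[:p1], v[p1:]  # shared factor, loadings for source 1 and source 2


# Hypothetical toy data: both sources are driven by the same gene-level pattern.
gene_pattern = [1.0, 2.0, -1.0]
x1 = [[g * w for w in (1.0, 0.5)] for g in gene_pattern]        # e.g. expression
x2 = [[g * w for w in (2.0, -1.0, 0.0)] for g in gene_pattern]  # e.g. methylation
u, v1, v2 = joint_rank1(x1, x2)
print(u)
```

Because `u` is estimated from both matrices at once, structure that is weak in one source can still be picked up if it is supported by the other, which is the basic motivation for integrating data sources.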

Software packages now available on GitHub

The other part is driven by a biological question and involves finding co-regulated groups of genes in an unsupervised fashion. The method groups genes into clusters and connects these with regulators (for example transcription factors, proteins involved in the process by which genetic information in the cell’s DNA is transcribed into RNA) that have similar stimulating or inhibiting effects on gene expression within a cluster. Together, the methods in the thesis contribute to the development of efficient, flexible and scalable unsupervised methods for integrative data analysis. The methods have been implemented as software packages that are publicly available on GitHub, in the hope of stimulating their use in future research.

Felix obtained his bachelor’s degree in mathematics with physics in Würzburg, Germany, and had started the master’s programme there when he was offered an exchange year at Chalmers University of Technology. He liked it so much that he completed the entire master’s programme in Gothenburg, and then continued with a licentiate degree through the Fraunhofer-Chalmers Centre for Industrial Mathematics, where he worked on pharmacokinetic/pharmacodynamic modelling.

– Gothenburg is nice, it has a relaxed vibe yet is big enough to offer plenty of activities, and then of course it is close to the sea. The master’s programme was good due to its focus on many applications of mathematics.

Started recurring workshop for young researchers

After the licentiate thesis, Felix wanted to learn more about statistics and went on to do a doctorate at the department. He enjoyed teaching and taught the master’s course “Statistical learning for big data” for three years, a real challenge for a PhD student. In the second and third years the course was given online due to the pandemic. Felix has also been involved in the PhD student council, for a time as its chair, and together with three other PhD students he started the national Workshop on Modelling in Biology and Medicine, which is still held every second year for young researchers working at the intersection of mathematics, biology and medicine.

– There are lots of good opportunities for PhD students to participate in at the Department of Mathematical Sciences. It was an insightful experience to take part in starting a recurrent workshop that attracted 70 people in its first year. Now, I am looking for new challenges in industry. Doing a PhD in mathematics can be quite a lonely journey. Due to the diversity of research fields, it is often difficult to engage in each other’s research fields, and I long for more teamwork.

Read Felix Held’s thesis

Felix Held will defend his PhD thesis “Unsupervised methods for integrative data analysis” on June 7 at 13.15 in lecture hall Pascal, Hörsalsvägen 1. The supervisor is Rebecka Jörnsten and the assistant supervisor is Johan Jonasson.