Title

Predicting a gene mutation from DNA methylation profiles

Description

Atypical teratoid/rhabdoid tumors (AT/RT) are highly malignant brain tumors that occur primarily in infants and young children. The causes of these tumors are not yet fully understood. However, almost all AT/RT are known to carry an alteration in the SMARCB1 gene on chromosome 22. In most cases, the altered SMARCB1 gene is detectable only in the tumor cells themselves, where it results from a spontaneous mutation in a somatic cell. In up to 30% of patients, however, the germ cells and thus all cells of the body are also affected (germline mutation). Rapid detection of a germline mutation is highly relevant for the further treatment of patients and for the genetic counseling of their relatives. In neuropathological diagnostics, genome-wide DNA methylation profiles are used to classify tumors into three subgroups. The aim of this project is to predict the presence of a germline mutation from the DNA methylation profile.

Specifically, the task is a classification problem. Clustering methods, which are unsupervised learning methods, have already been tested and did not give promising results. The present dataset, however, is annotated with the corresponding labels, which makes it possible to use supervised learning methods.

Therefore, the goal of this thesis is to apply and evaluate different standard methods for learning a classifier. For a BA-level project, these methods are support vector machines, regression/decision trees, naive Bayes classifiers, and simple neural networks. For an MA-level project, the focus is on more elaborate methods, such as those that learn more complex network structures. A further challenge is that the dataset is relatively small but very high-dimensional, so some form of dimensionality reduction is probably necessary.
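
To make the "small but high-dimensional" setting concrete, the following is a minimal sketch of one possible baseline, not the method used in the project. It keeps the features with the highest variance (a very simple stand-in for dimensionality reduction), trains a Gaussian naive Bayes classifier, and estimates accuracy with leave-one-out cross-validation. The data are synthetic Gaussian values, not methylation profiles, and all names and sizes (top_variance_features, n_samples, n_features, k, and so on) are assumptions chosen for illustration. Only the Python standard library is used so the sketch runs anywhere; in practice, an established ML library would likely be the better choice.

```python
import math
import random
from statistics import fmean, pvariance


def top_variance_features(X, k):
    """Return the indices of the k features with the highest variance in X."""
    n_features = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: variances[j], reverse=True)[:k]


def select(row, keep):
    """Project one sample onto the retained feature indices."""
    return [row[j] for j in keep]


def fit_gaussian_nb(X, y, eps=1e-6):
    """Estimate the prior, feature means, and feature variances per class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cols = list(zip(*rows))
        model[label] = (
            len(rows) / len(X),
            [fmean(c) for c in cols],
            [pvariance(c) + eps for c in cols],  # eps avoids zero variance
        )
    return model


def predict_gaussian_nb(model, x):
    """Return the label with the highest log posterior for sample x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def leave_one_out_accuracy(X, y, k):
    """Leave-one-out accuracy; feature selection is redone in every fold."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        keep = top_variance_features(X_train, k)
        model = fit_gaussian_nb([select(r, keep) for r in X_train], y_train)
        correct += predict_gaussian_nb(model, select(X[i], keep)) == y[i]
    return correct / len(X)


# Synthetic stand-in data (an assumption, not real methylation profiles):
# few samples, many features, only the first few features carry signal.
rng = random.Random(0)
n_samples, n_features, n_informative = 30, 100, 5
y = [i % 2 for i in range(n_samples)]
X = [
    [rng.gauss(3.0 * label if j < n_informative else 0.0, 1.0)
     for j in range(n_features)]
    for label in y
]

accuracy = leave_one_out_accuracy(X, y, k=10)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

Feature selection is repeated inside every fold so that the held-out sample never influences which features are kept. With so few samples, any preprocessing step fitted on the full dataset can make the estimated accuracy look better than it is.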

The work is performed at the Institute of Computer Science in collaboration with the Institute of Neuropathology at Universitätsklinikum Münster.

Requirements

Basic knowledge of common, simple ML approaches as well as programming skills in Python are helpful.

People working on it

  • Michael Bucks: simple ML models (BA)
  • Thomas Tenberge: more complex NN architectures (MA)

Contact

Tanya Braun, tanya.braun@uni-muenster.de
Christian Thomas, christian.thomas@ukmuenster.de
Martin Hasselblatt, martin.hasselblatt@ukmuenster.de