5,000 genome project

Each person is different, and not everyone responds the same way to a given therapy. Our vision is to understand the genetic basis of the various leukemias and lymphomas even better and to grasp the complexity behind each illness. Only then will we be able to offer each patient the best possible diagnostics in the future. We hope to refine diagnostics further in order to categorize the wide range of leukemia types more precisely, thereby also uncovering new therapy options. Our curiosity drives our continued research, and we take responsibility for translating the knowledge gained through research directly into benefits for our patients.

Our motivation

Each year, around 14,000 people in Germany are diagnosed with leukemia. The success of a therapy depends greatly on the timing and accuracy of the diagnosis. While the diagnosis of leukemias and lymphomas according to the WHO (World Health Organization) guidelines was still highly morphology-dependent a few years ago, an increasing number of genetic and molecular genetic markers are now being used to identify forms of leukemia. For example, the current edition of the WHO classification includes various genetic changes (mutations, gene fusions, and overexpression) that call for a specific diagnosis, or at least describe the clonality of the illness (Swerdlow et al., Blood, 2016; Arber et al., Blood, 2016).

Leukemia occurs in various forms. While certain types of leukemia are highly uniform in their manifestation and molecular profile, other sub-entities show a significantly wider spectrum. It is precisely this diversity that makes it difficult not only to diagnose the illness but also to choose the best form of therapy. At the same time, knowledge of the illness-causing changes in leukemia cells increasingly allows those changes to be targeted directly by therapies. One prominent example is the class of tyrosine kinase inhibitors (TKI), which are used highly successfully to treat chronic myeloid leukemia (CML), where they specifically attack the illness-causing target BCR-ABL1. This type of targeted therapy is a prime example of customized medicine (personalized/precision medicine). Developing such therapies requires as much knowledge as possible about the underlying molecular processes.

The project

This is why the 5,000 genome project was launched at MLL. In order to gain as much knowledge as possible, we have begun to examine a diverse range of leukemia and lymphoma sub-groups in our project. Our Biobank also allows us to include rare forms of leukemia and lymphoma, so that a very wide spectrum of entities can be covered. We take advantage of the options offered by high-throughput sequencing and examine both the genome (WGS, Whole Genome Sequencing) and the transcriptome (RNA-Seq) of a patient in order to obtain as much genetic information as possible. By combining WGS and RNA-Seq, we not only validate the variants found at both levels but also pursue the question of whether the mutations found are transcribed and expressed and/or whether the translocations found also lead to a fusion transcript. Furthermore, we attempt to correlate the genotype with the expression profile in order to find out more about genetic changes and their impact on the cell. For example, which changes in the transcriptome do patients with mutations in one of the splicing genes exhibit?
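The question of whether a mutation found in the genome is also visible in the transcriptome can be illustrated with a minimal sketch. It is not MLL's actual pipeline: the Variant structure, the rna_counts layout, the gene names, and the thresholds min_depth and min_alt_reads are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A DNA-level variant call (hypothetical, simplified structure)."""
    gene: str
    chrom: str
    pos: int
    alt: str


def classify_expression(variants, rna_counts, min_depth=10, min_alt_reads=3):
    """Label each DNA variant by its support in RNA-Seq reads.

    rna_counts maps (chrom, pos) to a dict of base -> read count.
    The thresholds are illustrative assumptions, not validated cutoffs.
    """
    result = {}
    for v in variants:
        counts = rna_counts.get((v.chrom, v.pos), {})
        depth = sum(counts.values())
        alt_reads = counts.get(v.alt, 0)
        if depth < min_depth:
            # Too few RNA reads at this position to judge expression.
            result[v] = "not_assessable"
        elif alt_reads >= min_alt_reads:
            result[v] = "expressed"
        else:
            result[v] = "not_expressed"
    return result


# Toy data: three variants found by WGS and the RNA-Seq base counts
# observed at the same positions (all values invented).
variants = [
    Variant("GENE_A", "chr1", 100, "T"),
    Variant("GENE_B", "chr2", 200, "G"),
    Variant("GENE_C", "chr3", 300, "A"),
]
rna_counts = {
    ("chr1", 100): {"C": 20, "T": 15},
    ("chr2", 200): {"A": 40},
    ("chr3", 300): {"A": 2},
}

labels = classify_expression(variants, rna_counts)
for v, label in labels.items():
    print(v.gene, label)
```

The third category matters in practice: a variant in a gene that is barely transcribed cannot be confirmed or refuted from RNA-Seq alone, so it is kept separate from variants that are covered but not expressed.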

Classifiers that predict a diagnosis can be trained on both DNA and RNA profiles. Combining the two profiles may improve these predictions even further. The analysis of expression profiles allows altered cellular pathways to be identified. Based on this information, networks can be created that allow conclusions to be drawn about the function of the cell. In addition, the potential effects and success of possible therapeutic interventions can be modeled in silico with these networks.
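As a rough illustration of combining both profiles in one classifier, the following sketch concatenates a mutation indicator vector with an expression vector and applies a nearest-centroid rule. The technique, the feature layout, the labels entity_A and entity_B, and all values are assumptions for illustration only; the text does not state which classifiers are actually used.

```python
import math


def combine(dna_features, rna_features):
    """Concatenate a mutation indicator vector and an expression vector."""
    return list(dna_features) + list(rna_features)


def train_centroids(samples, labels):
    """Return the mean feature vector (centroid) for each diagnosis label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Predict the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Toy training set: two mutation flags plus two expression values per case.
train = [
    (combine([1, 0], [2.0, 0.1]), "entity_A"),
    (combine([1, 0], [1.8, 0.3]), "entity_A"),
    (combine([0, 1], [0.2, 1.9]), "entity_B"),
    (combine([0, 1], [0.1, 2.2]), "entity_B"),
]
centroids = train_centroids([x for x, _ in train], [y for _, y in train])

new_case = combine([1, 0], [1.7, 0.2])
prediction = predict(centroids, new_case)
print(prediction)
```

Real profiles have thousands of features on very different scales, so a practical classifier would need feature scaling, feature selection, and validation on held-out cases, none of which this sketch attempts.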

Infographic 5,000 genome project

Combining existing knowledge

In addition to the newly obtained genetic information, we at MLL can characterize patients thoroughly using data from routine diagnostics: morphology, immunophenotyping, chromosome analysis, and mutation analyses are examined in relation to each other (orthogonal comparison). Furthermore, we have follow-up information and clinical data, which give us insight into the progression of the illness in each individual patient. By combining all of these data, significantly more comprehensive risk stratifications and prognoses can be calculated.

We process all of the genetic information obtained, both the genome data and the transcriptome data, with the latest analysis methods. Only artificial intelligence and sufficiently large data processing capacities (cloud computing) have made it possible for us to perform these comprehensive analyses today.

Collaboration – Making data available in a controlled fashion

Because we are unable to cover all aspects of the various forms of leukemia equally well with our own research projects, and because the data generated contain a wide range of different types of information, we make it possible for research groups worldwide to work collaboratively on the data from the 5,000 genome project together with us here at MLL. These research groups send individual scientists to discuss the potential of various research ideas and the utility of our data for these projects, and then to work on the projects together with us. In order to ensure the security of our data in accordance with the EU General Data Protection Regulation (GDPR), the data do not leave our secure storage location. The scientists work on the data only at MLL (Data management/storage) for the purpose of examining and answering their research questions. Thanks to this controlled accessibility, we are able to make the 5,000 sequenced cases available to the hematological research community and learn as much as possible about leukemia and its mechanisms.
