Large-scale machine learning for the single-cell revolution
- Reference number
- BD15-0043
- Start and end dates
- 170101-230331
- Amount granted
- 27 160 000 SEK
- Administering organization
- KTH - Royal Institute of Technology
- Research area
- Computational science and applied mathematics
Summary
Several large-scale biomedical technologies are currently undergoing revolutionary changes, on such a scale that we can speak of a “disruptive” impact on a number of medical fields. In particular, it is now possible to perform large-scale whole-genome studies of individual cells and, for example, to examine variation across cells in tissue samples such as tumors, for use in diagnosis, prognosis, and therapeutic design. We are also seeing the emergence of techniques for mapping the spatial distribution of cells. The ability to exploit these techniques, however, depends on solving the difficult computational challenges ahead of us, which arise mainly from the volume, uncertainty, and complexity of this type of data. In this application we tackle these challenges by harnessing the potential of machine learning for “big data” problems. We will adopt several machine learning methods, including deep learning and model-based strategies. Probabilistic generative models allow us to perform analyses that involve temporal, physical, and developmental aspects of probabilistic inference, which makes it possible to integrate and explore modern omics datasets. Such analysis has been successful in bioinformatics, but it has also often been limited by the computational complexity that results from the relatively large amount of data. We will apply new computational methods that have the potential to handle “big data” and take advantage of modern high-performance computers.
Popular science description
Most people are aware that their genes control traits such as their appearance and their susceptibility to disease. The same, or at least a very similar, set of twenty thousand genes resides in the DNA of all the tens of trillions of cells in your body. Still, the individual cells are very different, both in their appearance and in the functions they fulfill. The difference between the cells lies instead in how their genes are expressed, that is, how many times each gene's DNA is read and used as a template for the production of proteins, which are the actors in the cell that perform most of its functions. For instance, the genes that control your eye color are present in all your cells, but are possibly only expressed in your eyes. Recent technological developments have made methods available for determining which genes individual cells are expressing, so-called single-cell techniques. As the methods measure a very large and complex system, these technologies are very data intensive by nature. The resulting datasets are so large and complex that traditional data processing methods are inadequate. Meanwhile, a branch of computer science known as machine learning has made spectacular progress in the last couple of years. Machine learning offers methods for computers to learn from the data presented to them rather than from explicit instructions from a programmer. In many different areas of modern life, machine learning algorithms are used to make predictions and to help their users interpret their surroundings. Such methods are already a prerequisite for the interpretation of molecular biological data. In this application, we propose machine learning methods specifically for the interpretation of single-cell data, an area where we will undoubtedly see a deluge of data in the coming couple of years.
Such techniques will be particularly useful in cancer research, where the technology will help us understand the interaction between the different cells of a tumor and how to distinguish different types and stages of cancer, which will be a great aid in clinical decisions. In regenerative medicine, such technology will help us understand how to alter cells so that they can perform new tasks, potentially replacing damaged cells. For instance, one could produce neurons for neurologically impaired patients or insulin-producing beta cells for diabetics.