Large Scale Machine learning for the Single Cell Revolution
- Reference number
- BD15-0043
- Start and end dates
- 170101-230331
- Amount granted
- 27 160 000 SEK
- Administrative organization
- KTH - Royal Institute of Technology
- Research area
- Computational Sciences and Applied Mathematics
Summary
Several high-throughput biomedical techniques are currently being revolutionized to an extent that can be expected to have a disruptive impact across a number of medical areas. In particular, it is becoming increasingly feasible to perform large-scale genome-wide studies of individual cells and, e.g., to use variation across cells of diseased tissue, such as tumours, in diagnosis, prognosis, and therapeutic design. Techniques for mapping out the spatial distribution of cells are also becoming available. Exploiting these opportunities, however, depends on successfully solving severe computational challenges, mostly due to the volume, uncertainty, and complexity of the associated data. This application aims to tackle these challenges by leveraging the potential of machine learning methodologies in the context of big data paradigms. We will adopt several machine learning methodologies, including deep learning and model-based approaches. Probabilistic generative models will allow us to cast analyses involving temporal, spatial, and developmental aspects as likelihood-based inference, in principle allowing integration and investigation of modern genomic datasets. This principled approach has been hugely successful in computational biology, but has also often been constrained by the resulting computational complexity in combination with data volume. We will apply new computational methodologies with the potential to address 'big data' and take advantage of modern high-performance computers.
Popular science description
Most people are aware that their genes control traits such as their appearance and their susceptibility to disease. The same, or at least a very similar, set of about twenty thousand genes resides in the DNA of each of the tens of trillions of cells of your body. Still, the individual cells are very different, both in their appearance and in the functions they fulfil. The difference between cells instead lies in how their genes are expressed, that is, how many times each gene's DNA is read and used as a template for the production of proteins, which are the actors in the cell that perform most of its functions. For instance, the genes controlling your eye color are present in all your cells, but are possibly expressed only in your eyes. Recent technological developments have made methods available for determining which genes individual cells are expressing, so-called single cell techniques. Because the methods measure a very large and complex system, these technologies are data-intensive by nature. The resulting datasets are so large and complex that traditional data processing methods are inadequate.
Meanwhile, a branch of computer science known as machine learning has made spectacular progress over the last few years. Machine learning offers methods by which computers learn from the data presented to them rather than from explicit instructions written by a programmer. In many areas of modern life, machine learning algorithms are used to make predictions and to help their users interpret their surroundings. Such methods are already a prerequisite for interpreting molecular biological data. In this application, we propose machine learning methods specifically for the interpretation of single cell data, an area where we will undoubtedly see a deluge of data in the coming few years.
Such techniques will be particularly useful in cancer research, where the technology will help us understand the interaction between the different cells of a tumor and how to distinguish different types and stages of cancer, which will be a great aid in clinical decisions. In regenerative medicine, such technology will help us understand how to alter cells so that they can perform new tasks, potentially replacing damaged cells. For instance, one could produce neurons for neurologically impaired patients or insulin-producing beta cells for diabetics.