Last Saturday (21st of April) Anuradha and I got the chance to attend a workshop on the latest advances in DNA sequencing, bioinformatics and precision medicine, which was organized by the Sustainable Education Foundation SL and hosted by WSO2 Lanka (Pvt) Ltd. The guest speaker was Dr. Vladimir Kovacevic from Seven Bridges Genomics.
Dr. Kovacevic started his talk by telling the audience how he ended up being a bioinformatics engineer, despite his background in video and signal processing. Next, to get everyone on the same page, he dived into a bit of biology, where he explained DNA, chromosomes, the central dogma of molecular biology, RNA, amino acids and proteins.
As people became more curious about genomics, scientists wanted to obtain the entire genome sequence of a human. The Human Genome Project began in 1990 and took 13 years (until 2003) to obtain the first genome sequence, at a cost of around 3 billion dollars. The technique used was Sanger sequencing, which was very costly and time-consuming because of its low throughput: only a small number of fragments could be read at a time. At present, with Next-Generation Sequencing (NGS) and Third Generation Sequencing techniques, a genome can be sequenced for around 1000 dollars in about a day. During the last year, successful attempts were made to perform genome sequencing in space, as sequencing machines have become small and portable enough to be carried on space missions.
From 2000 to 2010, the amount of genome data grew as more genomes were sequenced, while the cost of sequencing fell.
Unfortunately, we cannot obtain the entire genome at once from sequencing machines. What we obtain instead are short fragments called reads. We have to reconstruct the entire genome by determining the position of each read. There are two ways to reconstruct the genome.
- Alignment – Using a reference genome to map the position of the reads
- Assembly – Reconstructing the genome by finding the overlaps that link the reads together
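To make the two approaches concrete, here is a minimal Python sketch on a toy sequence. It is only an illustration and not what real tools do: the reference and reads are made up, `align_reads` assumes error-free reads that match the reference exactly, and `greedy_assemble` is a naive greedy overlap merge. All function names are hypothetical.

```python
def align_reads(reference, reads):
    """Alignment: map each read to its first exact match in the reference."""
    # str.find returns -1 when the read does not occur in the reference.
    return {read: reference.find(read) for read in reads}


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Assembly: repeatedly merge the two reads with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining overlaps, so several contigs are left
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Toy data (made up for illustration).
reference = "ATGCGTACGTTAGC"
reads = ["ATGCGTA", "CGTACGTT", "CGTTAGC"]

positions = align_reads(reference, reads)
contigs = greedy_assemble(reads)
print(positions)
print(contigs)
```

Alignment needs a reference but only has to place each read, whereas assembly needs no reference but has to compare reads against each other, which is why it is usually the more expensive of the two.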
To perform a bioinformatics analysis, we have to use several different tools. We feed in the input data and send it through various processing steps to get the desired output. Chaining these steps together results in a workflow. Defining a workflow explicitly makes it easier to build and maintain complex analyses that involve a large number of tools.
The Common Workflow Language (CWL) is used to define various aspects of a workflow, including its inputs, its outputs, the tools it connects and their requirements. Using CWL provides reproducibility, portability, revision management and versioning. Running CWL in the cloud allows us to use virtualization in the form of Docker containers. All the tools in the analysis are executed inside these Docker environments so that they give the same results every time. These analyses demand huge amounts of storage and computational power, which can usually be met only by cloud services. Hence, many individuals and companies have moved to the cloud.
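As a rough illustration of the idea, a workflow can be thought of as a set of steps plus the dependencies between them. The sketch below runs such a workflow in dependency order using Python's standard `graphlib` module (Python 3.9+). The step names and the toy steps themselves are hypothetical and far simpler than real bioinformatics tools.

```python
from collections import Counter
from graphlib import TopologicalSorter


def trim(reads):
    """Toy quality step: drop reads shorter than 4 bases."""
    return [r for r in reads if len(r) >= 4]


def count_bases(reads):
    """Count how often each base occurs across all reads."""
    return Counter("".join(reads))


def gc_fraction(counts):
    """Fraction of bases that are G or C."""
    return (counts["G"] + counts["C"]) / sum(counts.values())


def run_workflow(steps, dependencies, inputs):
    """Run every step after the steps (or inputs) it depends on."""
    results = dict(inputs)
    for name in TopologicalSorter(dependencies).static_order():
        if name in results:  # workflow inputs are already available
            continue
        args = [results[dep] for dep in dependencies[name]]
        results[name] = steps[name](*args)
    return results


# Each output name maps to the function that produces it.
steps = {"trimmed": trim, "base_counts": count_bases, "gc": gc_fraction}
# Each output name maps to the names it needs as input.
dependencies = {
    "trimmed": ["raw_reads"],
    "base_counts": ["trimmed"],
    "gc": ["base_counts"],
}

results = run_workflow(steps, dependencies, {"raw_reads": ["ATGC", "GG", "CCGA"]})
print(results["gc"])
```

Keeping the step definitions separate from the wiring between them is what makes it possible to swap a tool or add a step without rewriting the rest of the analysis.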
Precision medicine is a medical model that proposes the customization of healthcare, with medical decisions, treatments, practices, or products being tailored to the individual patient.
Not every medication suits everyone. Hence we need to tailor medication to suit individuals. Precision medicine designs drugs and medical treatments that are targeted to an individual patient. One of the ingredients in creating this medication is the information present in our DNA sequences.
Cancer is a disease that is unique to each patient. Every cancer is different. Hence, cancer treatment has become one of the major concerns of precision medicine. Mutations occurring during DNA replication can cause cancer by leading to uncontrolled cell division or to an accumulation of cells that should have been destroyed. The probability of mutations is increased by factors such as EM radiation, chemical agents, free radicals, genetic factors and infections.
Our body develops thousands of cancer cells every day, yet we do not get cancer on a daily basis. This is because our immune system includes T-cells, which check for and identify infected or abnormal cells. However, cancer cells can trick our immune system.
Major Histocompatibility Complex (MHC) is a set of cell surface proteins essential for the acquired immune system to recognize foreign molecules. MHC molecules bind to protein fragments from inside the cell and present them on the cell surface, where they can be inspected by T-cells.
We can compare the DNA from cancerous tissue and normal tissue to find the mutations present only in the tumour (somatic mutations), and then predict what will be expressed and which proteins will be presented on the outside of the cells. Cancer cells carry mutated proteins called neoantigens. With the predicted information, we can train T-cells to recognize the cells carrying these neoantigens and destroy them.
However, such treatments are very expensive and can have negative effects, such as the possibility of developing an autoimmune disease, where some of the trained T-cells attack our own tissues in some parts of the body. Hence, such treatments have to be designed very carefully and accurately. In a couple of years, or maybe decades, more and more companies will get involved in designing gene therapy drugs, and treatments will be available to a wider population at an affordable cost.
Note: We thank Dr. Vladimir Kovacevic for giving us the slides of the session to refer to and explore more details on what he talked about.
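The tumour-versus-normal comparison can be sketched in a few lines of Python. This is a heavily simplified illustration: the two sequences are made up, and the code assumes they are already aligned, have the same length and differ only by single-base substitutions. Real pipelines work from aligned reads and also have to deal with sequencing errors, insertions and deletions. The function names are hypothetical.

```python
def somatic_substitutions(normal, tumour):
    """List (position, normal_base, tumour_base) where the sequences differ."""
    if len(normal) != len(tumour):
        raise ValueError("this sketch assumes aligned, equal-length sequences")
    return [
        (pos, n, t)
        for pos, (n, t) in enumerate(zip(normal, tumour))
        if n != t
    ]


def affected_codon(sequence, position):
    """Return the codon (three-base group) containing the given position."""
    # Assumes the reading frame starts at position 0.
    start = (position // 3) * 3
    return sequence[start:start + 3]


# Toy coding sequences (made up for illustration).
normal = "ATGGCCAAGTGA"
tumour = "ATGGCCGAGTGA"

variants = somatic_substitutions(normal, tumour)
for pos, n, t in variants:
    before = affected_codon(normal, pos)
    after = affected_codon(tumour, pos)
    print(f"position {pos}: {n}>{t}, codon {before} -> {after}")
```

A changed codon may code for a different amino acid, and it is these altered protein fragments that are candidates for neoantigens.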