This module aims to provide the understanding, specialist knowledge and practical skills required to analyse large genomic datasets to answer questions about the genetic contribution to human health and disease. The module will particularly focus on processing, annotating and interpreting short-read data, and applying a number of different analytical approaches to both rare and common genetic variation in the context of different diseases and traits.
Module Learning Outcomes:
By the end of this module, students should be able to:
- Understand how high-throughput sequencing (HTS) data are generated, and the related quality control processes
- Analyse and interpret HTS data using command-line programming and bioinformatic approaches
- Use appropriate publicly available resources to annotate genetic variation and interpret it in the context of the patient’s clinical presentation
- Understand how HTS data are used to provide genetic diagnoses for patients in a clinical diagnostic setting, as well as to answer research questions and generate new knowledge
- Understand and appraise the purpose, current benefits, and future opportunities of large genomic datasets generated by genome sequencing and deep phenotyping of both patient and population-based cohorts
- Appropriately apply different analytical approaches to common and rare genetic variation
- Develop basic practical skills in the analysis of genomic data in a cloud computing environment
Understanding of: DNA structure, transcription and translation; different types of genetic variation (e.g. single nucleotide substitutions, whole gene deletions, chromosomal translocations) and their functional effects; common vs rare variants; inheritance patterns (e.g. dominant, recessive, X-linked); linkage disequilibrium and haplotypes. Ideally, students will have completed the Genetic Epidemiology module.
The practical aspect of this module will involve Linux-based command-line software, and students are expected to have completed Linux training and to be comfortable writing commands. Students are encouraged to refresh their Linux skills, for example by working through some of the many online tutorials (e.g. https://tutorials.ubuntu.com/tutorial/command-line-for-beginners#0; http://www.ee.surrey.ac.uk/Teaching/Unix/index.html). Note that all necessary commands will be provided in the practical sessions.
Students are required to do 2-3 hours of preparatory work for each session (self-directed learning). They will be told in advance of any software or applications that they should download in preparation for each session.
The sessions will be a mixture of interactive lectures, workshops and group discussions, used to consolidate and expand the knowledge gained during the preparatory work, and hands-on genomic data analysis practicals. Each session will end with a 5-10 minute primer for the following session and the associated preparatory material. The module will use only publicly available resources, databases and software.
Group presentation, 40%: Groups of 3 or 4 students work on a project and make a presentation on a selected research topic during the last session. The presentation consists of each student introducing their own contribution, and the group drawing these together to explain how their work has addressed the overall task. Group marks are given for the content of the overall presentation, answering the research question and responding to questions during a 10-minute Q&A session following the presentation. For the overall presentation mark, 50% is based on the average of the marks given by other students on the module, and 50% is awarded by the module leader or a similarly qualified assessor appointed by them.
Individual technical report, 60%: Each student submits a 1000-word technical report on their group project a week after the end of the module. The report will include a brief introduction to the context and methods used, as well as detailed results and their interpretation.
Module Length: 4 days