Module Aims:
This core module introduces students to the R language and environment for statistical computing and graphics, accessed through RStudio. Introductory sessions explain the fundamental concepts that underpin R and the basics of coding in R. Subsequent sessions teach the principles of data carpentry, data summarisation and data visualisation. The importance of replicability, and how to achieve it, is emphasised throughout. The module also revises and applies some epidemiological concepts covered earlier in the course.
Module Learning Outcomes:
By the end of the module, students should be able to:
1. Understand the rationale for replicable data analysis
2. Understand the concept of object-oriented programming
3. Understand the basics of vectors and matrices and how to manipulate them in R
4. Create a simple function
5. Import and clean a data file
6. Manipulate data using the key ‘tidyverse’ functions filter(), mutate(), select(), arrange() and group_by()
7. Understand the principles of data summarisation and visualisation
8. Summarise data and results effectively in a table
9. Summarise data effectively in a graph using ggplot()
Pre-requisites:
R and RStudio installed on the student's computer.
Basic skills in mathematics.
An understanding of the principles of direct standardisation of incidence rates.
Teaching Strategy:
Lectures with embedded practical exercises, which develop basic data handling and analysis skills using the ‘tidyverse’ suite of R packages in RStudio.
Assessment:
Take-home assignment involving the import and manipulation of multiple data files to generate summary figures and tables.
Module Length: 5 days over 4 weeks