Section 1 Introduction

These notes are written for students on MAS61004 The Statistician’s Toolkit and MAS6024 Statistical Data Science in R. Topics covered include:

  • working with R and RStudio;
  • importing data, and getting data into a suitable format for analyses with R;
  • making plots with ggplot2;
  • writing reports with R Markdown;
  • making web apps with shiny.

We do not cover R programming (e.g. writing your own functions); this is included in MAS61006 Bayesian Statistics and Computational Methods.

1.1 About these notes

These notes will get you started on various topics, but are not intended to cover everything you might need to know. There are lots of excellent free, online resources for learning R, and links to further reading will be given where appropriate. After studying these notes, you should be able to find things out quickly for yourself, if necessary.

You don’t need to know everything straight away! Try to get a basic understanding of how things work and what sorts of things are possible, and then look up the details when you start working on a particular project.

1.2 Books

Although it can be easy to search for and find help online, you have to know what you are looking for. I strongly recommend that you browse some of the following books (the graphics/data visualisation books in particular), to get a broader understanding of what you could do with your data.

All the following books can be read online for free. (I have hard copies of most of these, and I prefer to study from them. I do not recommend buying books just for this module, but if a particular book looks useful to you more widely, I would recommend buying a hard copy.)

A good general reference book for MAS61004 and MAS6024 is

For R Markdown, I recommend

and if you plan to use R Markdown for your dissertation

For graphics/data visualisation, I recommend

Healy (2019) is a beautiful-looking book (thereby proving his point!) and covers ggplot2. Wilke (2019) purposefully doesn’t include any code, but the discussion and advice are excellent. (He has made all the R code used to produce the book available here.)

1.3 Acknowledgements

The bookdown package by Yihui Xie has been invaluable for producing these notes. The content of this course depends on the work of the R Core team, RStudio and numerous package developers, who have all made their work available for free; if I didn’t appreciate and enjoy using all these tools, I would not be teaching this course! I will cite authors as I go along, and a reference list is given at the end. Many thanks also to Allison Horst for generously sharing her artwork.