10  Final reports

These are the course notes for the 2023 version of Fundamentals of Data Science (MA7419 / MA3419).

10.1 Overview

In Week 10 you will be submitting your final reports.

10.2 Citing R and packages

R itself is clearly a massive piece of work, and you should give its authors credit by citing it in any work you do using R.

For example, you might write:

Statistical analysis was done using R 4.1.1 (R Core Team 2022).

R makes it easy to generate the right reference: the built-in function citation() prints it, with the year matching the version of R you are running.

citation()

To cite R in publications use:

  R Core Team (2022). R: A language and environment for statistical
  computing. R Foundation for Statistical Computing, Vienna, Austria.
  URL https://www.R-project.org/.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2022},
    url = {https://www.R-project.org/},
  }

We have invested a lot of time and effort in creating R, please cite it
when using it for data analysis. See also 'citation("pkgname")' for
citing R packages.
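If you would rather build the sentence and the reference programmatically than copy them from the console, the following is a minimal sketch using only functions that ship with R. The variable names (r_cite, in_text, bibtex_lines) are made up for illustration, and the wording of the sentence is just the example from above.

```r
# The citation object for R itself (class "citation", which extends "bibentry")
r_cite <- citation()

# Build the in-text sentence from the running R version and the citation year,
# so the version number and the year cannot get out of step
in_text <- paste0(
  "Statistical analysis was done using R ", getRversion(),
  " (R Core Team ", r_cite$year, ")."
)
print(in_text)

# The same reference as plain text, ready to paste into a reference list
cat(format(r_cite, style = "text"), sep = "\n")

# The BibTeX entry as a character vector, one element per line
bibtex_lines <- toBibtex(r_cite)
print(bibtex_lines)
```

Taking the version and year from the running session means the sentence always describes the R installation that actually produced your results.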

You should also cite the main packages you’ve used. For example:

Data wrangling was carried out with dplyr (Wickham et al. 2023) and other packages from the tidyverse (Wickham et al. 2019); graphs were plotted using ggplot2 (Wickham 2016).

Again, you can use citation() to generate the correct references, this time passing it the name of the package. For example:

citation('dplyr')

To cite package 'dplyr' in publications use:

  Wickham H, François R, Henry L, Müller K, Vaughan D (2023). _dplyr: A
  Grammar of Data Manipulation_. R package version 1.1.3,
  <https://CRAN.R-project.org/package=dplyr>.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {dplyr: A Grammar of Data Manipulation},
    author = {Hadley Wickham and Romain François and Lionel Henry and Kirill Müller and Davis Vaughan},
    year = {2023},
    note = {R package version 1.1.3},
    url = {https://CRAN.R-project.org/package=dplyr},
  }

If you are wondering, BibTeX is a file format (and software) used to describe lists of references, often for use within LaTeX documents. The rmarkdown package (Xie, Allaire, and Grolemund 2018) provides methods to work with BibTeX references.
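As a rough sketch of how you might collect these entries into a single BibTeX file, the code below loops over a vector of package names and writes the output of toBibtex() to disk. The helper write_citations() and the file name packages.bib are hypothetical, invented for this example; packages that are not installed are skipped rather than causing an error.

```r
# Hypothetical helper: write BibTeX entries for a set of packages to one file
write_citations <- function(pkgs, file) {
  # Keep only the packages that are installed, so that citation() cannot fail
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  installed <- pkgs[is_installed]

  # One BibTeX entry (a character vector of lines) per package
  entries <- lapply(installed, function(p) toBibtex(citation(p)))

  writeLines(unlist(entries), file)
  invisible(installed)  # return the packages that were actually written
}

# Assumed location and file name; "base" gives the citation for R itself
bib_file <- file.path(tempdir(), "packages.bib")
cited <- write_citations(c("base", "dplyr", "ggplot2"), bib_file)
print(cited)
```

Note that the entries produced by citation() have an empty key (they start with @Manual{,), so you would need to add a key to each one before citing it from a document. If you have the knitr package installed, its write_bib() function does a similar job and fills in the keys for you.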

What to cite?

You should always cite R itself, but there is an element of judgement in deciding which individual packages to cite. For this module, you won’t lose marks as long as you have made a reasonable effort.

To make your work fully reproducible, it is best practice to list all the packages you’ve used, with their versions (e.g. in an appendix). You can generate this information with the function sessionInfo(), but it is not required for FDS coursework.
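For the curious, here is a short sketch of what sessionInfo() returns and how you might pull out the pieces you would list in an appendix. The variable name info is arbitrary; the list elements used are those documented for sessionInfo().

```r
# Capture details of the current R session
info <- sessionInfo()

# The R version string, suitable for quoting in a methods section
print(info$R.version$version.string)

# Base packages attached in this session
print(info$basePkgs)

# Packages attached with library(); NULL if none have been attached yet
print(names(info$otherPkgs))

# Printing the whole object gives the full listing, including package versions
print(info)
```

Run this at the end of your analysis script, after all your library() calls, so that every package you used appears in the output.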