USEFUL INFORMATION: Short introduction to Rmarkdown

UE NGS - ENS LYON

Author

NGS-team - 2023

1 Making your research reproducible

We have already made the case for reproducibility in this training. In this introduction, we will focus on one of the tools that enables you to perform your analyses reproducibly.

When you do lab work, you use lab notebooks to organize your methods, results, and conclusions for future retrieval and reproduction. The information in these notebooks is converted into a more concise experimental description for the Methods section when publishing the results. Computational analysis requires the same diligence! The equivalent of a lab notebook for computational work is a detailed log of the workflow used, the tools at each step, the parameters for those tools and last, but not least, the versions of the tools.

Image source: “Reproducible Research in Computational Science”, Peng 2011 https://doi.org/10.1126/science.1213847

2 RMarkdown for R analysis

Creating the “gold standard” code is not always easy depending on what programming language you are using. For analyses within R, RStudio helps facilitate reproducible research with the use of R scripts, which document all code used to perform a particular analysis. However, we often don’t save the version of the tools we use in a script, nor do we include or interpret the results of the analyses within the script.

In the first part of this session we will be learning about RMarkdown. In its most basic form, RMarkdown is a file format that can be converted into a shareable document, e.g. HTML, PDF and many others. It allows you to document not just your R (or Python and SQL) code, but also to include tables and figures along with descriptive text. The result is a final document that has the methods, the code and the interpretation of results all in a single place!

To elaborate, you write a file using the Markdown language and embed executable R code chunks within it. The code chunks are paired with knitr syntax, so that once your document is complete, you can easily convert it into one of several common formats (e.g. HTML, PDF, PowerPoint) for sharing or documentation.

Nothing is better than an example to convince you!

Exercises #1
  1. Open a new RMarkdown file (File > New File > R Markdown…). A dialog box will open: add your name, keep the default values for the other fields and click on OK.
  • A template file will be generated to give you a starting point to modify for your analyses.

  2. Save it as “default_template.Rmd” and “knit” the document to generate an HTML document.

  3. A web page will pop up (you may have to allow the pop-up).

  4. Change the title and knit again.

  5. Add a sentence below “## R Markdown” and knit once again.

2.1 RMarkdown basics

Markdown is a lightweight markup language with plain-text-formatting syntax. It is often used for formatting README files, writing messages in online discussion forums, and creating rich text documents using a plain text editor. The Markdown language has been adopted by many different coding groups, and some have added their own “flavours”. RStudio implements an “R-flavoured markdown”, or “RMarkdown”, which has really nice features for text and code formatting.

The RStudio cheatsheet for RMarkdown is quite daunting, but it includes more advanced RMarkdown options that may be helpful as you become familiar with report generation, including options for adding interactive plots with Shiny.

2.1.1 Components of a .Rmd file

Let’s take a closer look at the “raw” file and understand the components therein.

1. A file header in YAML format

---
title: "Super title"
author: "Toto"
date: "2023-03-24"
output: html_document
---

This section has information listed in YAML format, and is usually used to specify metadata (title, author) and basic configuration information (output format) associated with the file. You can find detailed information about specifications that can be made in this section on this webpage.

2. Descriptive text

## R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.

The syntax for formatting the text portion of the report is relatively easy. You can easily get text that is bolded, italicized, bolded & italicized. You can create “headers” and “sub-headers” to organize the information by placing an “#” or “##” and so on in front of a line of text, generate numbered and bulleted lists, add hyperlinks to words or phrases, and so on.

Let’s take a look at the syntax of how to do this in RMarkdown:
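A minimal sketch of this syntax is given below. The text content is invented for illustration; only the formatting marks matter.

```markdown
# Header
## Sub-header

**bold text**, *italic text*, ***bold and italic text***

1. First numbered item
2. Second numbered item

- A bulleted item
- Another bulleted item

[RMarkdown website](http://rmarkdown.rstudio.com)
```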

You can also get more information about Markdown formatting here and here.

3. Code chunks

The basic idea behind RMarkdown is that you can describe your analysis workflow and provide interpretation of results in plain text, and intersperse chunks of R code within that document to tell a complete story using a single document. Code chunks in RMarkdown are delimited with a special marker (```). Backticks (`) commonly indicate a chunk of code.

Each individual code chunk should be given a unique name: the string after r between the curly brackets ({r name}) at the beginning of the chunk. The name should be something meaningful, and we recommend using snake_case for the names whenever possible.
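As a small illustration, a named chunk could look like the sketch below. The chunk name load_counts and the values in counts are hypothetical, chosen only to show where the name goes.

````markdown
```{r load_counts}
# Hypothetical example data: three made-up read counts
counts <- c(12, 7, 30)
summary(counts)
```
````

When the document is knitted, both the code and the output of summary() appear in the report at the position of the chunk.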

There is a handy Insert button within RStudio that allows you to insert an empty R chunk in your document without having to type the backticks etc. yourself.

Alternatively, there are keyboard shortcuts available as well.

  • Ctrl + Alt + i for PC users
  • Command + option + i for Mac users

Finally, you can write inline R code enclosed by single backticks (`) containing a lowercase r. This allows for variable returns outside of code chunks, and is extremely useful for making report text more dynamic. For example, you can print the current date inline within the report with this syntax: `r Sys.Date()` (rendered as, for example, 2023-09-22). The same syntax can be used in the date field of the YAML header.
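The sketch below shows inline code both in the YAML header and in a sentence of the report. Sys.Date() and nrow() are base R functions; iris is a dataset shipped with R and is used here only as a stand-in for your own data.

```markdown
---
title: "Super title"
author: "Toto"
date: "`r Sys.Date()`"
output: html_document
---

This report was knitted on `r Sys.Date()` and the example dataset contains `r nrow(iris)` rows.
```

Each inline expression is evaluated at knit time and replaced by its value, so the date changes every time the report is regenerated.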

For the final code chunk in your analysis, it is recommended to run the sessionInfo() function. This function will output the R version and the versions of all libraries loaded in the R environment. Documenting the versions of the tools you used is important for reproduction of your analysis in the future.
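A minimal version of such a final chunk might look like this (the chunk name session_info is only a suggestion):

````markdown
```{r session_info}
# Print the R version and the versions of all loaded packages
sessionInfo()
```
````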

2.1.2 Generating the report

Once we have finished creating an RMarkdown file, we finally need to “knit” the report. You can knit the file from the R console by using the render() function of the rmarkdown package (which calls knitr’s knit() function internally), or by just clicking on “knit” in the panel above the script as we did in our first activity in this lesson.
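For those who prefer the console to the button, here is a hedged sketch of knitting from R. It assumes the rmarkdown package is installed and that the file from Exercises #1 is in the working directory; the variable names are illustrative.

```r
# Path to the file saved in Exercises #1; adjust if you saved it elsewhere
rmd_file <- "default_template.Rmd"

if (file.exists(rmd_file)) {
  # render() runs the code chunks and converts the result with pandoc;
  # it returns the path of the generated report
  out_file <- rmarkdown::render(rmd_file, output_format = "html_document")
  message("Report written to: ", out_file)
} else {
  message("File not found: ", rmd_file)
}
```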

Note that when creating your own reports, you will very likely find yourself knitting the report periodically as you work, rather than just once at the end. It is usually an iterative process: you may have to turn off warnings, make a figure larger or smaller, or update the descriptive text so that the document is informative (for others and for your future self).
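These adjustments are typically made with chunk options. The sketch below uses the knitr options warning, message, fig.width and fig.height; the chunk name and the values are arbitrary, and cars is a dataset shipped with R.

````markdown
```{r speed_plot, warning=FALSE, message=FALSE, fig.width=6, fig.height=4}
# Warnings and messages are hidden; the figure is 6 x 4 inches
plot(cars)
```
````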

When you click on the “knit” button, by default an HTML report will be generated. If you would prefer a different document format, this can be specified in the YAML header with the output: parameter as discussed above, or you can also click on the button in the panel above the script and click on “Knit” to get the various options as shown in the image under the 5th part of the exercise above.
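For instance, a header along these lines would produce an MS Word document instead of HTML. word_document and pdf_document are output formats provided by the rmarkdown package; the other fields are copied from the example header above.

```markdown
---
title: "Super title"
author: "Toto"
date: "2023-03-24"
output: word_document
---
```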

Note: PDF rendering is sometimes problematic, especially when running R remotely, like on the cluster. If you run into problems, it’s likely an issue related to the pandoc and LaTeX installation.

Only HTML files can be responsive.

Exercises #2
  1. Scroll down to the end of your Rmd template file. Add a new code chunk. Within the code chunk place the operation 1+1. Knit your document.

  2. Add a new section header (“New section”) above the newly created code chunk and a new sub-section header (“New sub section”).

  3. You can click on the Outline button in the top right corner to display the table of contents of your document and thus navigate it more easily.

  4. Add some bold text, some italic text and a list.

  5. Add an image to your Rmd. (First, save the image you want to display on your VM.)

![](img.png)
![](img.png){width=250px}
The {width=250px} attribute is optional and specifies the display width of the image.

Bonus: You can try to center it:

# Use chunk options to set the size and alignment: {r, out.width = "30%", fig.align = "center"}
# See https://bookdown.org/yihui/rmarkdown-cookbook/fig-align.html

knitr::include_graphics("img/R.png")

To finish

Now it’s time to get to work on your project and make beautiful reproducible R analyses!

Remember, if you’re looking for a command, have an incomprehensible bug or don’t know where to start:

  • ask other members of your group
  • the internet is also your friend (it’s not cheating!)
  • ask one of the trainers

These materials have been developed by members of the UE NGS teaching team of the ENS de Lyon. These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.