4 R Markdown

What is R Markdown (video created by RStudio): https://vimeo.com/178485416

R Markdown documents consist of three components.

  • Code Chunks
  • Text
  • YAML Metadata
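As a minimal sketch of how the three components sit together in one file, the R code below writes a tiny R Markdown document to a temporary file and prints it. The title, the sentence of text, and the file name are placeholders chosen for illustration. The chunk fence is built with strrep() only so that the backticks do not clash with this listing.

```r
# Build a chunk fence (three backticks) programmatically.
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                                                  # YAML metadata starts
  'title: "A Minimal Example"',                           # placeholder title
  "output: html_notebook",
  "---",                                                  # YAML metadata ends
  "",
  "This sentence is ordinary text written in Markdown.",  # text
  "",
  paste0(fence, "{r}"),                                   # code chunk starts
  "summary(cars)",
  fence                                                   # code chunk ends
)

# Write the document to a temporary location (hypothetical file name).
rmd_path <- file.path(tempdir(), "minimal-example.Rmd")
writeLines(rmd_lines, rmd_path)

# Show the file as it would appear in the source editor.
cat(readLines(rmd_path), sep = "\n")
```

Opening the resulting file in RStudio should show the YAML header at the top, followed by the text and one runnable chunk.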

4.1 What is R Markdown and R Notebook

R Markdown provides an authoring framework for data science. You can use a single R Markdown file to both

  • save and execute code
  • generate high-quality reports that can be shared with an audience

R Notebooks are an implementation of Literate Programming that allows for direct interaction with R while producing a reproducible document with publication-quality output.

An R Notebook is an R Markdown document with chunks that can be executed independently and interactively, with output visible immediately beneath the input.

(Reference: R Markdown: The Definitive Guide, 3.2 Notebook)

4.1.1 Two Goodies

  • Important: Implementation of Reproducible Research and Literate Programming

  • Useful to Render into Various Formats: R Notebook (HTML), R Markdown (HTML), PDF, MS Word, MS PowerPoint, ioslides Presentation (HTML), Slidy Presentation (HTML), Beamer Presentation (PDF), etc.

4.2 Reproducible Research and Literate Programming

4.2.1 Literate Programming by D. Knuth

Literate programming is an approach to programming introduced by Donald Knuth in which a program is given as an explanation of the program logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which a compilable source code can be generated.

4.2.2 D. Knuth

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

4.2.3 Reproducible Research - Quote from a Coursera Course

Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available.

4.2.4 R Markdown workflow, R for Data Science

R Markdown is also important because it so tightly integrates prose and code. This makes it a great analysis notebook because it lets you develop code and record your thoughts. It:

  • Records what you did and why you did it. Regardless of how great your memory is, if you don’t record what you do, there will come a time when you have forgotten important details. Write them down so you don’t forget!

  • Supports rigorous thinking. You are more likely to come up with a strong analysis if you record your thoughts as you go, and continue to reflect on them. This also saves you time when you eventually write up your analysis to share with others.

  • Helps others understand your work. It is rare to do data analysis by yourself, and you’ll often be working as part of a team. A lab notebook helps you share not only what you’ve done, but why you did it with your colleagues or lab mates.

4.2.5 Records of EDA and Communication

  1. Memo on a scratch paper: R Scripts
  2. Record on a notebook: R Notebook (an R Markdown format)
  3. Short paper or a digital communication: R Notebook
  4. Paper or a report: R Markdown (html, pdf, MS Word, etc.)
  5. Presentation (html, pdf, MS PowerPoint, etc.)
  6. Publication of a Book

4.3 Let’s Get Started

  1. Start RStudio (update it if your version is old)
  2. Create a Project
  3. Tools > Install Packages... > rmarkdown
    • Or in the Console: install.packages("rmarkdown")
  4. Tools > Install Packages... > tinytex (for PDF generation)
    • Or in the Console: install.packages("tinytex")
    • If TeX is not installed: tinytex::install_tinytex() # install TinyTeX
      • If you are not sure whether TeX is installed, check in the Terminal tab of the lower-left pane:
        • which latex, which mktexlsr - Mac or Linux
        • where mktexlsr - Windows
  5. Let’s try!
    1. File > New File > R Notebook
    2. Save with a file name, say, test-notebook
    3. Preview with the [Preview] button
    4. Run the code chunk plot(cars), then Preview again.
    5. Knit to PDF, Word (and HTML)
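The installation checks in steps 3 and 4 can also be done from the R Console. The following is a rough sketch using base R only. It assumes that pdflatex is the LaTeX program to look for, which is a choice made here for illustration, and it only reports what is missing without installing anything.

```r
# Check whether the packages used in this chapter are installed.
pkgs <- c("rmarkdown", "tinytex")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report packages that still need install.packages().
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
}

# Look for a LaTeX program on the PATH; an empty string means "not found".
latex_path <- Sys.which("pdflatex")
if (nzchar(latex_path)) {
  message("LaTeX found at: ", latex_path)
} else {
  message("No LaTeX found; tinytex::install_tinytex() would install TinyTeX.")
}
```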

4.4 Templates

4.4.1 RNotebook_Template

Template to submit your assignment of this course: RNotebook_Template.nb.html

---
title: "Title of R Notebook"
author: "ID and Your Name"
date: "2023-12-04"
output:
  html_notebook: null
---

4.4.1.1 YAML

  • Change the title.
  • Write your ID and your name.
  • The date is generated automatically and inserted. If you wish, you can replace “2023-12-04” with a date in your preferred style.
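If you want a different date style, base R's format() can produce it. The sketch below uses a fixed date so that the results are predictable, and the style names are labels invented for this example. One common way to insert such a value automatically is an inline R expression in the date field of the YAML header, but that is an assumption about how this template does it.

```r
# A fixed date keeps the output predictable;
# in practice Sys.Date() returns today's date.
d <- as.Date("2023-12-04")

styles <- c(
  iso    = format(d),              # "2023-12-04", as shown in the template
  slash  = format(d, "%Y/%m/%d"),  # "2023/12/04"
  dotted = format(d, "%d.%m.%Y")   # "04.12.2023"
)
print(styles)
```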

4.4.1.2 Code Chunk

  • When you run code within the notebook, the results appear beneath the code.
  • Try executing this chunk by clicking the Run button, a triangle pointing right, within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter (Win) or Cmd+Shift+Enter (Mac).
    • Ctrl + Shift + Enter (Windows) or Cmd + Shift + Enter (Mac): Runs the current code chunk.
    • Ctrl + Alt + R (Windows) or Cmd + Option + R (Mac): Runs all the code chunks in the document.
  • Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I (Win) or Cmd+Option+I (Mac).
  • When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K (Win) or Cmd+Shift+K (Mac) to preview the HTML file).
  • The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.
  • We will use the pipe operator %>% very often later in this class.
    • The RStudio shortcut for inserting the pipe operator (%>%) is:
      • On Windows: Ctrl + Shift + M
      • On Mac OS: Cmd + Shift + M
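To give a feel for what the pipe operator does, here is a small sketch that computes the same value with and without it. It assumes the magrittr package is installed. dplyr and the tidyverse also make %>% available.

```r
library(magrittr)  # provides the %>% operator

# Without the pipe: read from the inside out.
nested <- sum(sqrt(c(1, 4, 9)))

# With the pipe: read from left to right.
piped <- c(1, 4, 9) %>% sqrt() %>% sum()

print(c(nested = nested, piped = piped))  # both are 6
```

The pipe passes the value on its left as the first argument of the function on its right, so a chain of steps reads in the order in which it is carried out.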

4.4.2 Testing R Markdown Formats

Various Output Formats: test-rmarkdown.nb.html

---
title: "Testing R Markdown Formats"
author: "DS-SL"
date: "2023-12-04"
output:
  html_notebook:
    number_sections: yes
  pdf_document:
    number_sections: yes
  html_document:
    df_print: paged
    number_sections: yes
  word_document:
    number_sections: yes
  powerpoint_presentation: default
  ioslides_presentation:
    widescreen: yes
    smaller: yes
  slidy_presentation: default
  beamer_presentation: default
---
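The Knit button renders one format at a time. From the Console, rmarkdown::render() can render every format listed in the YAML header in one call. The sketch below writes a small placeholder document with two formats and renders both. It assumes the rmarkdown package and Pandoc are available (RStudio bundles Pandoc), and the file name and contents are invented for illustration.

```r
# Write a small R Markdown file that lists two output formats.
fence <- strrep("`", 3)  # three backticks, built to avoid clashing with this listing
rmd_path <- file.path(tempdir(), "format-test.Rmd")  # hypothetical file name
writeLines(c(
  "---",
  'title: "Format Test"',
  "output:",
  "  html_document:",
  "    number_sections: yes",
  "  word_document: default",
  "---",
  "",
  "# First Section",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_path)

outputs <- character(0)
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  # "all" renders every format listed under output: in the YAML header.
  outputs <- rmarkdown::render(rmd_path, output_format = "all", quiet = TRUE)
}
print(basename(outputs))
```

Passing a single format name instead, for example output_format = "word_document", renders only that format.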

4.4.3 Comments on Presentation Formats and Options

  • For slides, a new slide starts at ##, the second-level heading.
  • --- starts a new slide without a heading in presentation formats.
  • For Word and PowerPoint, you can supply your own style template. See the documents in References.
    • Use R Markdown to create a Word document [similar for PowerPoint]
    • Save the rendered Word file as: ref-doc-style.docx
    • Edit the styles of the file ref-doc-style.docx
    • Add ref-doc-style.docx as reference_docx (reference_doc for PowerPoint) in the YAML header, indented as below
  word_document: 
    number_sections: yes
    reference_docx: ref-doc-style.docx
  powerpoint_presentation: 
    reference_doc: ref-ppt-style.pptx
  • You can also set these options through Output Options, found in the gear-icon menu next to the Preview/Knit button.

4.5 Markdown Language – or use WYSIWYG editor

  • Headers: #, ##, ###, ####
  • Lists: 1. 2. , *
  • Links: [linked phrase](URL)
  • Images: ![alt text](figures/filename.jpg)
  • Block quotes: > (block)
  • Equations: e.g. $\frac{a}{b}$ renders as \(\frac{a}{b}\)
  • Horizontal rules: Three or more asterisks or dashes (*** or - - -)
  • Tables
  • Footnotes
  • Bibliographies and Citations
  • Slide breaks
  • Italicized text by _italic_, Bold text by **bold**
  • Superscripts, Subscripts, Strikethrough text
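Markdown tables can be typed by hand, but in a code chunk it is often easier to let R generate them. As a small sketch, knitr::kable() turns a data frame into a Markdown pipe table. This assumes the knitr package is installed, which is normally the case once rmarkdown is installed.

```r
# Turn the first rows of a built-in data set into a Markdown table.
if (requireNamespace("knitr", quietly = TRUE)) {
  tbl <- knitr::kable(head(cars, 3), format = "pipe")
  print(tbl)  # prints the table as Markdown source
}
```

When such a chunk is rendered, the Markdown source is converted into a formatted table in the output document.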

4.5.1 Visual R Markdown

RStudio introduced the Visual Editor in 2021. It seems to be stable, but switching back and forth between it and the original source editor, which uses tags, is not perfect. I always use the original editor and am confident with all of its functions, but I do not have much experience with the Visual Editor. [My Note in QALL401 2021]

Please refer to the document in the following link. You can switch between the Source editor and the Visual editor using the button at the top-left corner of the top-left pane. The document below is a bit old and shows the switch button at the top-right corner of that pane.

4.6 References