23 Communicating
In this chapter, after reviewing the basics of R Markdown covered in Chapters 4 and 14, we give tips you should know when you write R Markdown documents.
23.1 What is R Markdown and R Notebook
R Markdown provides an authoring framework for data science. You can use a single R Markdown file to both
- save and execute code
- generate high quality reports that can be shared with an audience
R Notebooks are an implementation of Literate Programming that allows for direct interaction with R while producing a reproducible document with publication-quality output.
An R Notebook is an R Markdown document with chunks that can be executed independently and interactively, with output visible immediately beneath the input.
(Reference: R Markdown: The Definitive Guide, 3.2 Notebook)
23.2 Reproducible Research and Literate Programming
23.2.1 Literate Programming by D. Knuth
Literate programming is an approach to programming introduced by Donald Knuth in which a program is given as an explanation of the program logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which a compilable source code can be generated.
23.2.2 D. Knuth
Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
23.2.3 Reproducible Research - Quote from a Coursera Course
Reproducible research is the idea that data analyses, and more generally, scientific claims, are published with their data and software code so that others may verify the findings and build upon them. The need for reproducibility is increasing dramatically as data analyses become more complex, involving larger datasets and more sophisticated computations. Reproducibility allows for people to focus on the actual content of a data analysis, rather than on superficial details reported in a written summary. In addition, reproducibility makes an analysis more useful to others because the data and code that actually conducted the analysis are available.
23.2.4 R Markdown workflow, R for Data Science
R Markdown is also important because it so tightly integrates prose and code. This makes it a great analysis notebook because it lets you develop code and record your thoughts. It:
- Records what you did and why you did it. Regardless of how great your memory is, if you don’t record what you do, there will come a time when you have forgotten important details. Write them down so you don’t forget!
- Supports rigorous thinking. You are more likely to come up with a strong analysis if you record your thoughts as you go, and continue to reflect on them. This also saves you time when you eventually write up your analysis to share with others.
- Helps others understand your work. It is rare to do data analysis by yourself, and you’ll often be working as part of a team. A lab notebook helps you share why you did it with your colleagues or lab mates.
23.2.5 Records of EDA and Communication
- Memo on a scratch paper: R Scripts
- Record on a notebook: R Notebook (an R Markdown format)
- Short paper or a digital communication: R Notebook
- Paper or a report: R Markdown (html, pdf, MS Word, etc.)
- Presentation (html, pdf, MS Powerpoint, etc.)
- Publication of a Book
- BOOKDOWN: Write HTML, PDF, ePub, and Kindle books with R Markdown. Free online document is provided in pdf as well
- Arxive Page
23.3 Structure of the R Markdown
What is R Markdown: https://vimeo.com/178485416 created by RStudio
R Markdown documents consist of three components.
- Code Chunks
- Text
- YAML Metadata
23.4 Let’s Get Started
- Start RStudio (update RStudio if your version is old)
- Create a Project
- Tools > Install Packages: rmarkdown
  - Or on Console: install.packages("rmarkdown")
- Tools > Install Packages: tinytex (for PDF generation)
  - Alternatively, on Console: install.packages('tinytex')
  - If TeX is not installed: tinytex::install_tinytex() # install TinyTeX
  - If you are not sure, please check on Terminal in the bottom left pane:
    - which latex, which mktexlsr - Mac or Linux
    - where mktexlsr - Windows
- Let’s try!
  - File > New File > R Notebook
  - Save with a file name, say, test-notebook
  - Preview by [Preview] button
  - Run Code Chunk plot(cars) and then Preview again.
  - Knit PDF, Word (and HTML)
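The installation checks listed above can also be run from the R Console. The following is a minimal sketch using only base R; the helper name `check_setup` is hypothetical and not part of any package, and a LaTeX installation that is not on your PATH would be reported as missing.

```r
# Hypothetical helper: report whether the tools needed for knitting are present.
check_setup <- function(pkgs = c("rmarkdown", "tinytex")) {
  # requireNamespace() returns TRUE when a package is installed and loadable
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  # Sys.which() returns "" when an executable is not found on the PATH
  latex_path <- unname(Sys.which("latex"))
  list(
    packages    = installed,
    latex_found = nzchar(latex_path),
    latex_path  = latex_path
  )
}

status <- check_setup()
print(status)
```

If `latex_found` is FALSE and you need PDF output, running tinytex::install_tinytex() as described above is the likely next step.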
23.5 Templates
23.5.1 RNotebook_Template
Template to submit your assignment of this course: RNotebook_Template.nb.html
---
title: "Title of R Notebook"
author: "ID and Your Name"
date: "2023-12-04"
output:
  html_notebook: default
---
23.5.1.1 YAML
- Change the title
- Write ID and your name
- Date is auto-generated and inserted. If you wish, you can replace “2023-12-04” by your favorite date style.
23.5.1.2 Code Chunk
- When you execute or run a code within the notebook, the results appear beneath the code.
- Try executing this chunk by clicking the Run button, a triangle pointing right, within the chunk or by placing your cursor inside it and pressing Ctrl+Shift+Enter (Win) or Cmd+Shift+Enter (Mac).
- Ctrl + Shift + Enter (Windows) or Cmd + Shift + Enter (Mac): Runs the current code chunk and advances to the next one.
- Ctrl + Alt + R (Windows) or Cmd + Option + R (Mac): Runs all the code chunks in the document.
- Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I (Win) or Cmd+Option+I (Mac).
- When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K (Win) or Cmd+Shift+K (Mac) to preview the HTML file).
- The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.
- We will use the pipe operator %>% very often later in this class. The shortcut for the pipe operator (%>%) in R Markdown is:
  - On Windows: Ctrl + Shift + M
  - On Mac OS: Cmd + Shift + M
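As a brief illustration of the pipe operator mentioned above, here is a small sketch using the built-in `cars` data. It assumes the magrittr package is installed (the tidyverse packages also make `%>%` available); the variable names are only for illustration.

```r
library(magrittr)  # provides the %>% operator

# Without the pipe: read from the inside out
mean_nested <- mean(subset(cars, speed > 20)$dist)

# With the pipe: read from left to right, one step per line
mean_piped <- cars %>%
  subset(speed > 20) %>%
  with(mean(dist))

# Both expressions compute the same value
all.equal(mean_nested, mean_piped)
```

The piped version passes the result of each step as the first argument of the next function, which tends to be easier to read when several steps are chained.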
23.5.2 Testing R Markdown Formats
Various Output Formats: test-rmarkdown.nb.html
---
title: "Testing R Markdown Formats"
author: "DS-SL"
date: "2023-12-04"
output:
  html_notebook:
    number_sections: yes
  pdf_document:
    number_sections: yes
  html_document:
    df_print: paged
    number_sections: yes
  word_document:
    number_sections: yes
  powerpoint_presentation: default
  ioslides_presentation:
    widescreen: yes
    smaller: yes
  slidy_presentation: default
  beamer_presentation: default
---
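Besides the Knit button, the formats listed in a YAML header like the one above can be rendered from the Console with rmarkdown::render(). The following is a sketch under assumptions: the wrapper `render_formats` and the file name `test-rmarkdown.Rmd` are hypothetical, and the PDF and Beamer formats additionally require a working LaTeX installation.

```r
library(rmarkdown)

# Hypothetical wrapper: render one or all formats listed under `output:`.
render_formats <- function(input, formats = "all") {
  if (!file.exists(input)) {
    message("File not found: ", input)
    return(character(0))
  }
  # output_format = "all" renders every format defined in the YAML header
  render(input, output_format = formats)
}

# Usage (assumed file name; uncomment after saving your own file):
# render_formats("test-rmarkdown.Rmd")                   # all formats
# render_formats("test-rmarkdown.Rmd", "word_document")  # a single format
```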
23.6 Markdown Language – or use WYSIWYG editor
- Headers: #, ##, ###, ####
- Lists: 1. 2. , *
- Links: [linked phrase](URL)
- Images: ![alt text](figures/filename.jpg)
- Block quotes: > (block)
- Equations: e.g. $\frac{a}{b}$ for \(\frac{a}{b}\)
- Horizontal rules: three or more asterisks or dashes (*** or - - -)
- Tables
- Footnotes
- Bibliographies and Citations
- Slide breaks
- Italicized text by _italic_, bold text by **bold**
- Superscripts, Subscripts, Strikethrough text
23.6.1 Visual R Markdown
RStudio introduced the Visual Editor towards the end of 2021. It seems to be stable, but going back and forth between it and the original tag-based editor does not always work perfectly. I always use the original editor and am confident with all of its functions, but I do not have much experience with the Visual Editor. [My Note in QALL401 2021]
Please refer to the document in the following link. You can switch between the Source editor and the Visual editor using the button at the top left corner of the top left pane. The document below is a bit old, and it shows the switch button at the top right corner of the top left pane.
23.7 R Markdown Revisited
Presentation: Submit an R Notebook (with the code used in the presentation), and the PowerPoint file or other files used for your presentation, if any. If you use an R Notebook for your presentation, you do not need to submit extra files.
Final Paper: Submit an R Notebook (with the code, as a work file), and a PDF (rendered directly from an R Notebook, or created from Word). The PDF must not exceed eight pages.
Format of Presentation: an R Notebook is fine, and a slide presentation in any of the various formats is also fine.
23.7.1 Literate Programming and Reproducible Research
Importing Data:
- Read a csv file:
read_csv("./data/file_name.csv")
- Download and import using a url of a csv file:
read_csv(url)
- Read an Excel file:
readxl::read_excel("./data/excel_file_name.xlsx")
- Read from the clipboard:
read_delim(clipboard())
- zip file: copy the url, then download and unzip it:
wir1to10 <- "https://wir2022.wid.world/www-site/uploads/2022/03/WIR2022TablesFigures-Chapter.zip"
download.file(wir1to10, destfile = "./data/wir1to10.zip")
unzip("./data/wir1to10.zip", exdir = "./data")
list.files("./data/WIR2022TablesFigures-Chapter")
readxl::excel_sheets("./data/WIR2022TablesFigures-Chapter/WIR2022TablesFigures-Chapter1.xlsx")
- Reading from the clipboard is not reproducible unless clearly explained:
df <- read_delim(clipboard()); df
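One way to keep a download step reproducible without fetching the file on every run is to download only when the local copy is missing. This is a minimal sketch in base R; the helper name `fetch_once` is hypothetical, and it assumes the URL stays available and unchanged.

```r
# Hypothetical helper: download a file only if it is not already on disk.
fetch_once <- function(url, destfile) {
  # Make sure the target directory (e.g. ./data) exists
  dir.create(dirname(destfile), showWarnings = FALSE, recursive = TRUE)
  if (!file.exists(destfile)) {
    # mode = "wb" keeps binary files such as .zip or .xlsx intact on Windows
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Usage with the URL stored in wir1to10 above (commented out to avoid a download here):
# zip_path <- fetch_once(wir1to10, "./data/wir1to10.zip")
# unzip(zip_path, exdir = "./data")
```

Because the URL and the destination path are written in the code, a reader can rerun the whole import, which is not possible with data pasted from the clipboard.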
23.7.2 Code Chunk Options
https://yihui.org/knitr/options/
- Chunk Name
- Output: use document default
  - Show code and output: echo=TRUE, eval=TRUE (default)
  - Show output only: echo=FALSE
  - Show nothing (run code): include=FALSE
  - Show nothing (don’t run code): include=FALSE, eval=FALSE
- Show messages: message=TRUE or FALSE
- Show warnings: warning=TRUE or FALSE
- Use paged tables: paged.print=TRUE or FALSE
- Use custom figure size: fig.width and fig.height, in inches
You can use the Hide Code and Show Code options on the rendered Notebook file.
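The options above can be set chunk by chunk, or once for the whole document with knitr::opts_chunk$set(). The following sketch shows the latter; the particular values are only an example, not a recommendation of this course.

```r
library(knitr)

# Document-wide defaults, typically placed in a first chunk with include=FALSE.
# Individual chunks can still override any of these in their own header.
opts_chunk$set(
  echo       = FALSE,  # show output only
  message    = FALSE,  # hide messages such as package start-up notes
  warning    = FALSE,  # hide warnings
  fig.width  = 6,      # figure width in inches
  fig.height = 4       # figure height in inches
)

# Inspect a current default
opts_chunk$get("echo")
```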
23.7.3 Presentation and Paper
- Data Source
- Variables
- Problems
- Visualization
- Model
- Conclusions and Further Research
WDI, WIR, etc
23.7.4 Word
Custom Word templates: https://bookdown.org/yihui/rmarkdown-cookbook/word-template.html
You can apply the styles defined in a Word template document to new Word documents generated from R Markdown. Such a template document is also called a “style reference document.” The key is that you have to create this template document from Pandoc first, and change the style definitions in it later. Then pass the path of this template to the reference_docx option of word_document.
---
output:
  word_document:
    reference_docx: "template.docx"
---
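Creating the template "from Pandoc first" can be done by knitting a small document to Word and then editing the styles of the result in Word. Here is a sketch of that first step; the helper name `make_word_template` and the placeholder content are hypothetical, and the code assumes the rmarkdown package and Pandoc (bundled with RStudio) are available.

```r
library(rmarkdown)

# Hypothetical helper: produce a .docx via Pandoc to use as a style reference.
make_word_template <- function(path = "template.docx") {
  src <- tempfile(fileext = ".Rmd")
  # Placeholder content so that the common styles appear in the document
  writeLines(c(
    "---",
    "title: \"Style reference\"",
    "output: word_document",
    "---",
    "",
    "# Heading 1",
    "",
    "Body text.",
    "",
    "## Heading 2",
    "",
    "More body text."
  ), src)
  # render() returns the path of the created file
  render(src, output_file = basename(path), output_dir = dirname(path), quiet = TRUE)
}

# Usage (uncomment to create the file in the working directory):
# make_word_template("template.docx")
```

After opening the generated file in Word and modifying its styles (Title, Heading 1, and so on), save it and refer to it with reference_docx as shown above.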
23.7.5 PowerPoint
PowerPoint presentation: https://bookdown.org/yihui/rmarkdown/powerpoint-presentation.html
Custom templates: https://bookdown.org/yihui/rmarkdown/powerpoint-presentation.html#ppt-templates
---
output:
  powerpoint_presentation:
    reference_doc: my-styles.pptx
---
YouTube: How To Create A PowerPoint Template
23.7.6 Create a PDF or Word file.
A Notebook file is created by pressing the Preview button, and the outputs appear as they were when the chunks were last run. However, when you create a file in another format, R runs all code chunks from the top. So if an object is used before the code that defines it, Knit stops with an error message. I recommend the following steps.
- Create a PDF right after you create a new (R Notebook) file (using the template). With this step, you can check that ‘Knit to PDF’ with tinytex is working well. Please let me know if you fail to create a PDF and cannot solve the problem. I will look at the settings of your PC in class.
- Run all the code before you preview the Notebook. You can use ‘Run All’, or ‘Run All Code Chunks Below’ under the ‘Run’ button if there is an incomplete code chunk.
- Before you create a PDF or Word file, you need to correct all errors. If you cannot, add eval = FALSE as a chunk option. You can add a similar option from the gear icon at the top right of the code chunk: select ‘Show nothing (don’t run code)’; it adds {r eval = FALSE, include = FALSE}, and the code chunk itself is skipped.
- Rerun all. If you can reach the end of the file without an error, ‘Knit to PDF’ or ‘Knit to Word’.
Creating a Word file is similar, and should be easier.
If you fail to create a file using Knit to PDF or Knit to Word, the alternative is to open the notebook file ending in nb.html in your web browser, such as Google Chrome, Edge, or Safari, and use your browser’s print-to-PDF function.
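The reason Knit can fail even though every chunk ran fine interactively can be illustrated with a toy model in base R. This is only a simulation under simplifying assumptions, not how knitr is implemented: the "chunks" are quoted expressions, and the function name `run_chunks` is hypothetical.

```r
# Toy model of knitting: evaluate "chunks" in order in a fresh environment.
run_chunks <- function(chunks) {
  # The fresh environment cannot see objects left over in your workspace
  env <- new.env(parent = baseenv())
  tryCatch({
    for (chunk in chunks) eval(chunk, envir = env)
    "ok"
  }, error = function(e) conditionMessage(e))
}

# x is used before it is defined: fine interactively if x already exists,
# but an error when everything is run from the top
wrong_order <- list(quote(total <- sum(x)), quote(x <- 1:10))
right_order <- list(quote(x <- 1:10), quote(total <- sum(x)))

run_chunks(wrong_order)  # returns the error message about x
run_chunks(right_order)  # returns "ok"
```

Using ‘Run All’ after restarting R is a practical way to perform the same check on a real notebook before knitting.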
23.7.6.1 Other Code Chunk Options
Please review EDA5, and try the options under the gear icon at the top right of each code chunk. Here are two useful options that I use often.
- cache = TRUE option. Downloading data and accessing the internet take time, and may cause trouble for the hosting site. With this option, you can avoid repeated downloads and shorten the time needed to render. I always add this option to code chunks that call WDI(). As for WDIsearch(), if you use cache = wdi_cache, you do not need to add this option. It is another benefit of using cache = wdi_cache.
- echo = FALSE option. When you create a PDF with a page limit, you may not want to include some code chunks. In that case, use this option: the output is included, but the code is not. You can select it by choosing the ‘Show output only’ option.
23.7.6.2 Reference
- https://yihui.org/knitr/options/
- Cheat Sheet. We distributed it in class. You can download the same one from Help > Cheat Sheets in the top menu of RStudio.
23.8 References
- Posit Primers: Report Reproducibly
- Markdown Quick Reference: Top Menu Bar > Help > Markdown Quick Reference
- Cheat Sheet (Top Menu Bar: Help > Cheat Sheets): RMarkdown Cheat Sheet, RMarkdown Reference Guide
- Books:
- In Textbook: R for Data Science: Communicate
- R Markdown: The Definitive Guide by Yihui Xie, J. J. Allaire, Garrett Grolemund
- R Markdown Cookbook by Yihui Xie, Christophe Dervieux, Emily Riederer
- Markdown: R Markdown is based on the Markdown language of Pandoc
- Pandoc’s Markdown: Detailed Information
- Markdown Tutorials: Interactive Practicum
- DARING FIREBALL: Markdown (detailed explanation and editor as Dingus)
- Post error messages to a web search engine.
23.5.3 Comments on Presentation Formats and Options
- --- is a page break for presentation formats.
- Use ref-doc-style.docx as reference_doc in YAML, with proper indentation.
- Output Options is at the bottom of the menu under the gear icon next to the Preview/Knit button.