Workflow: Code Style

Clear Workspace, DON’T EDIT

Always start by clearing the workspace. This ensures that objects created in other files are not used here.

rm(list = ls())

List Used Packages, EDIT

List all the packages that will be used in the chunk below.

packages <- c("styler")

Load Packages, DON’T EDIT

Install Missing

Any missing package will be installed automatically. This ensures smoother execution when the code is run by others.

Installing Packages on Other People's Machines

Be aware that people may not like having packages installed on their machine automatically. Doing so might break some of their existing code.
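
A more cautious variant is sketched below. It is only an illustration, not the approach used in these notes: the helper name `find_missing` is hypothetical, and the snippet reports the missing packages rather than installing them, so the person running the code decides what gets installed.

```r
# Hypothetical helper: return the packages that are not installed yet
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

packages <- c("styler")
to_install <- find_missing(packages)

if (length(to_install) > 0) {
  # Report only; nothing is installed automatically
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # run manually after reviewing the list
}
```

The trade-off is that the script stops being fully hands-off, in exchange for not changing someone else's library without their knowledge.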

# Do NOT modify
install.packages(setdiff(packages, rownames(installed.packages())))

Load

Load all packages.

# Do NOT modify
lapply(packages, require, character.only = TRUE)
[[1]]
[1] TRUE

11.1 Introduction

This page covers code style concepts for working with R. I took notes on those that were new to me or that I found useful to be reminded of.

Code Style & Punctuation

Code style is like punctuation: when used correctly, it makes code easy to read.

11.2 Styling Overview

11.2.1 Consistency

Although there are styling guidelines that one can follow (see below for examples), it is important that a programmer picks one and sticks with it, so that the code is easy to read for others, including one's future self.

11.2.2 Guidelines

There is no official styling guideline for R. However, there are several styling guidelines that one can adopt. Below are some of those found by searching for R styling guidelines (html):

  • tidyverse Style Guide (html) by Hadley Wickham. This is the guideline adopted in these notes.
  • R Style Guide (html) by Google
  • R Coding Conventions (html) by Henrik Bengtsson, Assoc Professor, Dept of Statistics, University of California, Berkeley
  • Coding Style (html) by Bioconductor project (website)
  • R Style Guide (html) by Jean Fan (GitHub), Assistant Professor, Center for Computational Biology, Department of Biomedical Engineering, Johns Hopkins University

11.2.3 Automatic

There are packages that can be used to automatically style existing code. One of them is listed below:

  • styler package (website) by Lorenz Walthert (website). After installing the package, launch RStudio’s command palette using the keyboard shortcut Ctrl+Shift+P, type styler, and select one of the available commands.
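
styler can also be called from code rather than from the command palette. The snippet below is a minimal sketch: it assumes styler is already installed (it checks first and installs nothing), and the file name in the last comment is hypothetical. The messy line is the same one used as the "Avoid" example in the Spaces section of these notes.

```r
# A deliberately messy line of code, stored as text
messy_code <- "z<-( a + b ) ^ 2/d"

# Only call styler if it is available; nothing is installed here
if (requireNamespace("styler", quietly = TRUE)) {
  # style_text() restyles code supplied as a character vector
  print(styler::style_text(messy_code))
}

# To restyle a whole script in place (hypothetical file name):
# styler::style_file("analysis.R")
```

By default styler follows the tidyverse style, which matches the guideline adopted in these notes.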

11.3 Styling Specifics

11.3.1 Names

  • Use meaningful names
  • Use snake_case to separate_multi_word_variables
  • Variables that share a theme should start with the same word or letter, so that auto-complete can list them together
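
The naming rules above can be sketched as follows. The variable names and values are made up for illustration only; the point is the shared `delay_` prefix, which lets auto-complete (or `ls()`) group related objects.

```r
# Avoid: names that say nothing about the content
# x1 <- 12.5
# tmp <- 48.0

# Strive for: meaningful snake_case names with a shared prefix
delay_arrival_mean <- 12.5 # hypothetical value
delay_departure_mean <- 48.0 # hypothetical value
delay_total_mean <- delay_arrival_mean + delay_departure_mean

# Objects with the shared prefix can be listed together
ls(pattern = "^delay_")
```

In RStudio, typing `delay_` and pressing Tab should offer the same group of names.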

11.3.2 Spaces

  • Put spaces on both sides of mathematical operators, except ^
  • Put spaces on both sides of the assignment operator, <-
# Strive for
z <- (a + b)^2 / d

# Avoid
z<-( a + b ) ^ 2/d
  • Don’t put spaces inside or outside parentheses for regular function calls
  • Always put a space after a comma
# Strive for
mean(x, na.rm = TRUE)

# Avoid
mean (x ,na.rm=TRUE)
  • It is okay to use extra spaces to align things.
flights |> 
  mutate(
    speed      = distance / air_time,
    dep_hour   = dep_time %/% 100,
    dep_minute = dep_time %%  100
  )

11.3.3 Pipes |>

The rules for pipes are nicely summarized in R4DS. Most of them are copied below.

  • Put a space before it
  • It should typically be the last thing on a line. This makes it easy to
    • add new steps
    • rearrange existing steps
    • modify elements within a step
    • quickly skim the verbs on the left-hand side
  • After the first step of the pipeline, indent each line by two spaces
# Strive for 
flights |>  
  filter(!is.na(arr_delay), !is.na(tailnum)) |> 
  count(dest)

# Avoid
flights|>filter(!is.na(arr_delay), !is.na(tailnum))|>count(dest)
  • If piping to a function without named arguments and its arguments fit on one line,
    • put all of them on one line.
  • If piping to a function with named arguments, OR the function has no named arguments but the arguments do not fit on one line,
    • put each argument on a new line, indented by two spaces
    • make sure the ) is on its own line and un-indented to match the horizontal position of the function name
# Strive for
flights |>  
  group_by(tailnum) |> 
  summarize(
    delay = mean(arr_delay, na.rm = TRUE),
    n = n()
  )

# Avoid
flights |>
  group_by(
    tailnum
  ) |> 
  summarize(delay = mean(arr_delay, na.rm = TRUE), n = n())

# Avoid
flights|>
  group_by(tailnum) |> 
  summarize(
             delay = mean(arr_delay, na.rm = TRUE), 
             n = n()
           )

# Avoid
flights|>
  group_by(tailnum) |> 
  summarize(
  delay = mean(arr_delay, na.rm = TRUE), 
  n = n()
  )
Long Pipeline

Break long pipelines (tasks) into meaningful shorter pipelines (sub-tasks) and save the intermediate results. This makes the code more readable and easier to check and debug.
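
A minimal sketch of this idea is shown below. To stay self-contained it uses the built-in mtcars data and base R functions instead of the flights data and dplyr used elsewhere on this page; the object names are hypothetical, and the native pipe requires R 4.1 or later.

```r
# Sub-task 1: clean the data and add a derived column
cars_clean <- mtcars |>
  subset(!is.na(mpg) & !is.na(wt)) |>
  transform(weight_kg = wt * 453.592) # wt is recorded in units of 1000 lbs

# The intermediate result can be checked before moving on
stopifnot(nrow(cars_clean) > 0)

# Sub-task 2: work on the saved intermediate result
heavy_cars <- cars_clean |>
  subset(weight_kg > 1500)

nrow(heavy_cars)
```

Because `cars_clean` is saved, a problem in the second step can be investigated without re-running or re-reading the first one.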

11.3.4 ggplot2

The same rules that apply to pipes also apply to ggplot2: treat + the same way as |>.

flights |> 
  group_by(dest) |> 
  summarize(
    distance = mean(distance),
    speed = mean(distance / air_time, na.rm = TRUE)
  ) |> 
  ggplot(aes(x = distance, y = speed)) +
  geom_smooth(
    method = "loess",
    span = 0.5,
    se = FALSE, 
    color = "white", 
    linewidth = 4
  ) +
  geom_point()

11.3.5 Sectioning Comments

When writing long scripts, it is advisable to break the code into sections and use sectioning comments to label them. The RStudio keyboard shortcut to create such a comment is Ctrl+Shift+R.

# Load data --------------------------------------

# Plot data --------------------------------------