R Coding Best Practices#
Quick Start#
Note
There are two primary style guides for R code: the Bioconductor Style Guide and the Tidyverse Style Guide. There are differences between them, such as Tidyverse style preferring two spaces for indentation while Bioconductor prefers four spaces, or snake_case naming for functions in Tidyverse while Bioconductor prefers camelCase.
Because the BHKLab develops packages for Bioconductor, as a general rule we prefer to follow the Bioconductor Style Guide.
Naming#
- All class names should use PascalCase.
- All function names should use camelCase.
- Variable names can be lower_snake_case or camelCase, but be consistent within your project.
- Private (non-exported) functions should be prepended with a dot (e.g.,
.myPrivateFunction). - Never use dots in your function, class, or variable names (e.g., never
my.class,my.function, ormy.variable). - Dots have a special meaning in R and are used to define S3 methods.
- For generic function
fand objectxof S3 classA,f(x)calls the methodf.A(x). - Dots are discouraged but acceptable in list, column, or function parameter names.
Formatting#
- Lines of code should ideally be less than 80 characters and never more than 120 characters.
- This makes it easy to read your code with multiple windows open and ensures no horizontal scrolling is necessary.
- Functions should ideally be less than 50 lines of code and never more than 80 lines of code.
- If your functions are longer than this, your function is doing too many things; consider refactoring into helper methods.
- Indentation should use four spaces.
- It is preferable to add an extra indentation for wrapped lines if it reduces the overall indentation of the code.
Style#
- Assignment should always use
<-and never use=. - Assignment with
->is discouraged, but acceptable when using pipes. - When using pipes, the base R pipe
|>is preferred over the magrittr pipe%>%. - Operators should always be surrounded by spaces.
- Good:
x[x > 1]orc("a", "b") %in% c("a", "b", "c")orif (x > 5 || x < 0 )orx <- 1 - Bad:
x[x>1]orc("a", "b")%in%c("a", "b", "c")orif (x>5||x<0)orx<-1 - No spaces after function names, except in
for,if, orelsestatements. - Bad:
mean (x) - Good:
mean(x)orif (x){ ... }orfor (x in 1:5) { ... } - Curly braces should start on the same line as a function.
Linting#
- Linting is a form of static code analysis that automatically checks if code in a directory or project follows best practices for coding style and format.
- The
lintrpackage provides linting for the R language and integrates into the R console and many IDEs. - This linter is included with the R VSCode extension and can be added to RStudio.
- By default, the linter enforces the Tidyverse Style Guide, which differs slightly from Bioconductor (and BHKLab) conventions.
- The
stylerpackage can be used to automatically format R files to follow yourlintrsettings, but be careful as it can mangle your code. Integration is easiest with RStudio. - The
airtool is a modern formatter and language server for R, developed primarily to integrate with Tidyverse style conventions.