8 R Developer Tools
8.1 Table of Contents
- R Package Development
- Documentation with roxygen2
- Version Control with Git
- GitHub Workflows
- Continuous Integration/Deployment
- Code Profiling and Optimization
- Package Distribution
- Development Tools
- Code Quality
- Best Practices
8.2 R Package Development
8.2.1 Creating a Package
# Install development tools
install.packages(c("devtools", "usethis", "roxygen2", "testthat"))
library(usethis)
library(devtools)
# Create new package
create_package("~/mypackage")
# Package structure created:
# mypackage/
# ├── R/
# ├── DESCRIPTION
# ├── NAMESPACE
# └── mypackage.Rproj
8.2.2 DESCRIPTION File
Package: mypackage
Type: Package
Title: My Awesome R Package
Version: 0.1.0
Author: Your Name <your.email@example.com>
Maintainer: Your Name <your.email@example.com>
Description: This package provides tools for data analysis and visualization.
    It includes functions for data cleaning, transformation, and statistical modeling.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Imports:
    dplyr (>= 1.0.0),
    ggplot2 (>= 3.3.0),
    magrittr
Suggests:
    testthat (>= 3.0.0),
    knitr,
    rmarkdown,
    covr
Depends: R (>= 4.0.0)
VignetteBuilder: knitr
URL: https://github.com/username/mypackage
BugReports: https://github.com/username/mypackage/issues
8.2.3 Adding Functions
# Create new R file
use_r("data_processing")
# R/data_processing.R
#' Clean Data
#'
#' Remove missing values and duplicates from a data frame
#'
#' @param data A data frame to clean
#' @param columns Character vector of column names to check for NAs
#' @return A cleaned data frame
#' @export
#' @examples
#' df <- data.frame(x = c(1, 2, NA, 4), y = c("a", "b", "b", "c"))
#' clean_data(df, c("x"))
clean_data <- function(data, columns = NULL) {
  if (!is.data.frame(data)) {
    stop("data must be a data frame")
  }
  if (is.null(columns)) {
    columns <- names(data)
  }
  # Remove rows with NA in specified columns
  # (drop = FALSE keeps a data frame even when a single column is involved)
  data <- data[complete.cases(data[, columns, drop = FALSE]), , drop = FALSE]
  # Remove duplicates
  data <- unique(data)
  return(data)
}
8.2.4 Adding Data
# Create a data-preparation script in data-raw/
use_data_raw("prep_data")
# data-raw/prep_data.R
my_data <- data.frame(
  x = 1:100,
  y = rnorm(100)
)
# Save as internal data (R/sysdata.rda, available only to package functions)
usethis::use_data(my_data, internal = TRUE)
# Or save as external data (data/my_data.rda, available to users)
usethis::use_data(my_data, overwrite = TRUE)
# Document data
# R/data.R
#' Sample Dataset
#'
#' A dataset containing sample data for demonstration
#'
#' @format A data frame with 100 rows and 2 variables:
#' \describe{
#'   \item{x}{Integer values from 1 to 100}
#'   \item{y}{Random normal values}
#' }
#' @source Generated for package demonstration
"my_data"
8.2.5 Package Dependencies
# Add to Imports (required)
use_package("dplyr")
use_package("ggplot2", min_version = "3.3.0")
# Add to Suggests (optional)
use_package("testthat", type = "Suggests")
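Packages listed under Suggests are not guaranteed to be installed on a user's machine, so code that relies on them usually checks for them first. The following is a minimal sketch of that guard using base R's `requireNamespace()`. The names `has_suggested()` and `optional_step_status()` are hypothetical and introduced here only for illustration; they are not part of usethis or of the package built in this chapter.

```r
# Hypothetical helper: TRUE if an optional (Suggests) package can be loaded
has_suggested <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Hypothetical function that reports whether an optional step can run,
# instead of failing with an error when the package is missing
optional_step_status <- function(pkg = "covr") {
  if (!has_suggested(pkg)) {
    return(paste0("Package '", pkg, "' is not installed; skipping optional step"))
  }
  paste0("Package '", pkg, "' is available")
}

optional_step_status("stats")
optional_step_status("aPackageThatIsNotInstalled")
```

Inside a package function, the same check is typically followed by a `stop()` with an informative message, or by a simpler fallback code path.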
# Import specific functions
#' @importFrom dplyr filter mutate
#' @importFrom magrittr %>%
# Import entire package
#' @import ggplot2
8.3 Documentation with roxygen2
8.3.1 Function Documentation
#' Calculate Summary Statistics
#'
#' This function calculates various summary statistics for a numeric vector.
#' It handles missing values and provides multiple measures of central tendency
#' and dispersion.
#'
#' @param x A numeric vector
#' @param na.rm Logical; if TRUE, missing values are removed before computation
#' @param trim Numeric; fraction (0 to 0.5) of observations to trim from each end
#' @param digits Integer; number of decimal places to round to
#'
#' @return A named list containing:
#' \item{mean}{Arithmetic mean}
#' \item{median}{Median value}
#' \item{sd}{Standard deviation}
#' \item{min}{Minimum value}
#' \item{max}{Maximum value}
#' \item{n}{Number of observations}
#'
#' @export
#' @examples
#' # Basic usage
#' x <- c(1, 2, 3, 4, 5, NA)
#' summary_stats(x, na.rm = TRUE)
#'
#' # With trimming
#' summary_stats(x, na.rm = TRUE, trim = 0.1)
#'
#' @seealso \code{\link{mean}}, \code{\link{median}}, \code{\link{sd}}
#' @family statistical functions
summary_stats <- function(x, na.rm = FALSE, trim = 0, digits = 2) {
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  result <- list(
    mean = round(mean(x, na.rm = na.rm, trim = trim), digits),
    median = round(median(x, na.rm = na.rm), digits),
    sd = round(sd(x, na.rm = na.rm), digits),
    min = min(x, na.rm = na.rm),
    max = max(x, na.rm = na.rm),
    n = if (na.rm) sum(!is.na(x)) else length(x)
  )
  return(result)
}
8.3.2 Building Documentation
# Generate documentation from roxygen comments
devtools::document()
# Check documentation
devtools::check_man()
# Preview help page
?summary_stats
8.3.3 Vignettes
# Create vignette
use_vignette("introduction")
# vignettes/introduction.Rmd
---
title: "Introduction to mypackage"
author: "Your Name"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to mypackage}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
# chunk: setup, include = FALSE
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(mypackage)
## Overview
This package provides tools for…
## Installation
# chunk: eval = FALSE (installation)
devtools::install_github("username/mypackage")
## Basic Usage
# chunk: basic-usage
data <- data.frame(x = 1:10, y = rnorm(10))
result <- clean_data(data)
# Build vignettes
devtools::build_vignettes()
8.3.4 Package Website with pkgdown
# Setup pkgdown
use_pkgdown()
# Build website
pkgdown::build_site()
# Deploy to GitHub Pages
use_pkgdown_github_pages()
9.1 Version Control with Git
9.1.1 Setup Git Repository
# Initialize git
git init
# Configure user
git config user.name "Your Name"
git config user.email "your.email@example.com"
# Create .gitignore
# R approach
use_git()
# Add .gitignore entries
use_git_ignore(c("*.Rproj", ".Rhistory", ".RData"))
9.1.2 Basic Git Workflow
# Check status
git status
# Stage files
git add R/my_function.R
git add . # Stage all changes
# Commit
git commit -m "Add data processing function"
# View history
git log --oneline
# Create branch
git checkout -b feature/new-analysis
# Switch branches
git checkout main
# Merge branch
git merge feature/new-analysis
# Delete branch
git branch -d feature/new-analysis
9.1.3 .gitignore for R
# History files
.Rhistory
.Rapp.history
# Session Data files
.RData
# User-specific files
.Ruserdata
# RStudio files
.Rproj.user/
*.Rproj
# OAuth2 token
.httr-oauth
# knitr and R markdown
*.utf8.md
*.knit.md
# Temporary files
*~
*.swp
*.swo
# Output files
*.pdf
*.html
# Data files (if large)
data/raw/*
!data/raw/.gitkeep
9.2 GitHub Workflows
9.2.1 Create GitHub Repository
# Create GitHub repo from R
use_github()
# Or manually:
# 1. Create repo on GitHub
# 2. Add remote
git remote add origin https://github.com/username/mypackage.git
git push -u origin main
9.2.2 README
# Create README
use_readme_rmd()
# README.Rmd
---
output: github_document
---
# mypackage
<!-- badges: start -->
[R-CMD-check](https://github.com/username/mypackage/actions)
[Codecov test coverage](https://codecov.io/gh/username/mypackage)
<!-- badges: end -->
The goal of mypackage is to...
## Installation
# chunk: eval = FALSE (development version)
# Development version
devtools::install_github("username/mypackage")
## Example
# chunk: example
library(mypackage)
data <- data.frame(x = 1:10, y = rnorm(10))
result <- clean_data(data)
# Build README
devtools::build_readme()
9.2.3 LICENSE
# Add MIT license
use_mit_license("Your Name")
# Or GPL-3
use_gpl3_license()
# Or Apache 2.0
use_apache_license()
10.1 Continuous Integration/Deployment
10.1.1 GitHub Actions - R CMD Check
# Setup GitHub Actions (standard R CMD check workflow)
use_github_action("check-standard")
# .github/workflows/R-CMD-check.yaml
name: R-CMD-check
on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]
jobs:
  R-CMD-check:
    runs-on: ${{ matrix.config.os }}
    name: ${{ matrix.config.os }} (${{ matrix.config.r }})
    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: macos-latest, r: 'release'}
          - {os: windows-latest, r: 'release'}
          - {os: ubuntu-latest, r: 'devel'}
          - {os: ubuntu-latest, r: 'release'}
          - {os: ubuntu-latest, r: 'oldrel-1'}
    steps:
      - uses: actions/checkout@v3
      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: ${{ matrix.config.r }}
      - uses: r-lib/actions/setup-pandoc@v2
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
          needs: check
      - uses: r-lib/actions/check-r-package@v2
10.1.2 Code Coverage
# Setup code coverage
use_github_action("test-coverage")
use_coverage()
# Run locally
library(covr)
cov <- package_coverage()
# View report (reuses the coverage object instead of recomputing it)
report(cov)
10.1.3 Automatic pkgdown Deployment
# Setup automatic website deployment
use_github_action("pkgdown")
# Website will be built and deployed on push to main
10.2 Code Profiling and Optimization
10.2.1 Profiling with profvis
library(profvis)
# Profile code
profvis({
  data <- data.frame(
    x = rnorm(100000),
    y = rnorm(100000)
  )
  # Slow approach
  result <- numeric(nrow(data))
  for (i in seq_len(nrow(data))) {
    result[i] <- data$x[i] + data$y[i]
  }
  # Fast approach
  result2 <- data$x + data$y
})
10.2.2 Benchmarking
library(bench)
# Compare implementations
results <- mark(
  loop = {
    result <- numeric(10000)
    for (i in 1:10000) result[i] <- i^2
  },
  vectorized = {
    result <- (1:10000)^2
  },
  check = FALSE,
  iterations = 100
)
print(results)
plot(results)
10.2.3 Memory Profiling
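# Check object size (large_data is a placeholder object created for illustration)
large_data <- data.frame(x = rnorm(1e6), y = rnorm(1e6))
object.size(large_data)
format(object.size(large_data), units = "MB")
# Profile memory
library(profmem)
p <- profmem({
  x <- numeric(1e6)
  y <- rnorm(1e6)
  z <- x + y
})
print(p)
total(p)
10.2.4 Optimization Strategies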
# 1. Vectorization
# Bad
sum_squares_slow <- function(x) {
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]^2
  }
  total
}
# Good
sum_squares_fast <- function(x) {
  sum(x^2)
}
# 2. Pre-allocation
# Bad
grow_vector <- function(n) {
  result <- c()
  for (i in seq_len(n)) {
    result <- c(result, i^2)
  }
  result
}
# Good
preallocate_vector <- function(n) {
  result <- numeric(n)
  for (i in seq_len(n)) {
    result[i] <- i^2
  }
  result
}
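Before trusting a faster version, it helps to confirm that it returns the same result as the slow one. The sketch below restates the four functions compactly so that it runs on its own, checks that each pair agrees, and takes rough single-run timings with base R's `system.time()`. The input sizes are arbitrary choices for illustration, and timings will differ between machines.

```r
# Compact restatements of the functions above, so this block is self-contained
sum_squares_slow <- function(x) {
  total <- 0
  for (i in seq_along(x)) total <- total + x[i]^2
  total
}
sum_squares_fast <- function(x) sum(x^2)

grow_vector <- function(n) {
  result <- c()
  for (i in seq_len(n)) result <- c(result, i^2)
  result
}
preallocate_vector <- function(n) {
  result <- numeric(n)
  for (i in seq_len(n)) result[i] <- i^2
  result
}

x <- as.numeric(1:10000) # illustrative input
n <- 2000                # illustrative size

# Correctness first: each pair must agree before timings mean anything
stopifnot(isTRUE(all.equal(sum_squares_slow(x), sum_squares_fast(x))))
stopifnot(isTRUE(all.equal(grow_vector(n), preallocate_vector(n))))

# Rough single-run timings in seconds
timings <- c(
  loop        = system.time(sum_squares_slow(x))[["elapsed"]],
  vectorized  = system.time(sum_squares_fast(x))[["elapsed"]],
  grow        = system.time(grow_vector(n))[["elapsed"]],
  preallocate = system.time(preallocate_vector(n))[["elapsed"]]
)
print(timings)
```

For anything beyond a rough impression, repeated measurements with a benchmarking package give more reliable numbers than a single `system.time()` call.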
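# 3. Use efficient data structures
library(data.table)
dt <- as.data.table(large_df) # Fast operations; large_df is a placeholder data frame
# 4. Parallel processing
library(parallel)
# Leave one core free, but always use at least one worker
cl <- makeCluster(max(1, detectCores() - 1, na.rm = TRUE))
# data_list and expensive_function are placeholders for your inputs and worker function
results <- parLapply(cl, data_list, expensive_function)
stopCluster(cl)
10.3 Package Distribution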
10.3.1 CRAN Submission
# Check package thoroughly
devtools::check()
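# Check on other platforms and R versions
# (check_rhub() relies on the original R-hub service and may be deprecated in recent devtools releases)
devtools::check_rhub()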
devtools::check_win_devel()
devtools::check_mac_release()
# Build package
devtools::build()
# Submit to CRAN
devtools::submit_cran()
10.3.2 CRAN Release Checklist
# 1. Update version number
use_version()
# 2. Update NEWS.md
use_news_md()
# 3. Run checks
devtools::check()
devtools::check_rhub()
devtools::check_win_devel()
# 4. Update cran-comments.md
use_cran_comments()
# 5. Build package
devtools::build()
# 6. Submit
devtools::submit_cran()
10.3.3 R-universe (Alternative)
# Easy distribution without CRAN submission
# Setup at https://r-universe.dev
# Users install with:
install.packages("mypackage", repos = "https://username.r-universe.dev")
10.4 Development Tools
10.4.1 RStudio Addins
# Create addin
use_addin("insert_pipe")
# inst/rstudio/addins.dcf
Name: Insert Pipe
Description: Inserts %>% at cursor
Binding: insert_pipe
Interactive: false
# R/addins.R
insert_pipe <- function() {
  rstudioapi::insertText(" %>% ")
}
10.4.2 Custom RStudio Snippets
# Edit snippets
usethis::edit_rstudio_snippets()
# Add custom snippets (each body line must start with a tab):
snippet fun
	${1:name} <- function(${2:args}) {
		${0}
	}
snippet test
	test_that("${1:description}", {
		${0}
	})
10.4.3 Pre-commit Hooks
# Setup pre-commit hooks
library(precommit)
use_precommit()
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/lorenzwalthert/precommit
    rev: v0.3.2
    hooks:
      - id: style-files
      - id: roxygenize
      - id: use-tidy-description
      - id: spell-check
      - id: lintr
      - id: readme-rmd-rendered
      - id: parsable-R
      - id: no-browser-statement
10.5 Code Quality
10.5.1 Static Code Analysis
# lintr - code style checking
library(lintr)
lint_package()
lint("R/my_function.R")
# Custom linters
my_linters <- linters_with_defaults(
  line_length_linter(120),
  object_name_linter("snake_case")
)
lint_package(linters = my_linters)
10.5.2 Code Formatting
# styler - automatic formatting
library(styler)
# Format file
style_file("R/my_function.R")
# Format package
style_pkg()
# Limit styling to line breaks and less invasive changes (spaces, indention)
style_pkg(scope = "line_breaks")
10.5.3 Spell Checking
# Check spelling
library(spelling)
spell_check_package()
# Update wordlist
update_wordlist()
# Set up spell checking for the package (creates inst/WORDLIST and tests/spelling.R)
use_spell_check()
10.6 Best Practices
10.6.1 Package Development Workflow
# 1. Write function
use_r("my_function")
# 2. Load all functions
load_all() # or Ctrl+Shift+L
# 3. Test interactively
my_function(test_data)
# 4. Write tests
use_test("my_function")
# 5. Run tests
test() # or Ctrl+Shift+T
# 6. Document
document() # or Ctrl+Shift+D
# 7. Check package
check() # or Ctrl+Shift+E
# 8. Install
install()
10.6.2 Semantic Versioning
# Version format: MAJOR.MINOR.PATCH
# 1.0.0 -> 1.0.1 (patch: bug fixes)
# 1.0.1 -> 1.1.0 (minor: new features, backward compatible)
# 1.1.0 -> 2.0.0 (major: breaking changes)
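The bumping rules above can be sketched in a few lines of base R. `bump_version()` is a hypothetical helper written for illustration only; it is not part of usethis and it assumes a plain three-part `MAJOR.MINOR.PATCH` string with no development suffix.

```r
# Hypothetical helper: increment one component and reset the lower ones
bump_version <- function(version, which = c("patch", "minor", "major")) {
  which <- match.arg(which)
  # Assumption: exactly three dot-separated non-negative integers
  if (!grepl("^[0-9]+\\.[0-9]+\\.[0-9]+$", version)) {
    stop("version must look like MAJOR.MINOR.PATCH")
  }
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  idx <- match(which, c("major", "minor", "patch"))
  parts[idx] <- parts[idx] + 1L
  # Components to the right of the bumped one go back to zero
  if (idx < 3) parts[(idx + 1):3] <- 0L
  paste(parts, collapse = ".")
}

bump_version("1.0.0", "patch") # "1.0.1"
bump_version("1.0.1", "minor") # "1.1.0"
bump_version("1.1.0", "major") # "2.0.0"

# Versions compare numerically per component, not as text
package_version("1.10.0") > package_version("1.9.0") # TRUE
```

The last line shows why version strings should be compared with `package_version()` rather than as character strings, where "1.10.0" would sort before "1.9.0".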
# Update version
use_version("patch")
use_version("minor")
use_version("major")
10.6.3 NEWS.md
# mypackage 0.2.0
## New features
* Added `new_function()` for advanced analysis (#15)
* Support for tibbles in `clean_data()` (#18)
## Bug fixes
* Fixed issue with NA handling in `summary_stats()` (#12)
* Corrected documentation for `transform_data()` (#14)
## Breaking changes
* Renamed `old_function()` to `new_function()`
* Changed default behavior of `clean_data()`
# mypackage 0.1.0
* Initial CRAN release
10.6.4 Code Review Checklist
# Before committing:
# Functions are documented
# Tests are written and passing
# Code is formatted (styler)
# No linter warnings
# Package checks pass
# NEWS.md is updated
# Version number is bumped
# Examples run correctly
# Vignettes build
# No browser() or debug statements
10.7 Complete Package Example
10.7.1 Final Package Structure
mypackage/
├── .github/
│   └── workflows/
│       ├── R-CMD-check.yaml
│       ├── test-coverage.yaml
│       └── pkgdown.yaml
├── R/
│   ├── data_processing.R
│   ├── visualization.R
│   ├── utils.R
│   └── mypackage-package.R
├── man/
│   ├── clean_data.Rd
│   └── mypackage-package.Rd
├── tests/
│   ├── testthat.R
│   └── testthat/
│       ├── test-data_processing.R
│       └── test-visualization.R
├── vignettes/
│   └── introduction.Rmd
├── data/
│   └── sample_data.rda
├── data-raw/
│   └── prep_data.R
├── .gitignore
├── .Rbuildignore
├── DESCRIPTION
├── LICENSE
├── LICENSE.md
├── NAMESPACE
├── NEWS.md
├── README.md
├── README.Rmd
├── cran-comments.md
└── mypackage.Rproj
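As a quick sanity check on a layout like this, the core entries can be verified with base R. The sketch below is illustrative only: `missing_package_files()` is a hypothetical helper rather than a devtools function, and the list of required entries is an assumption covering just the minimum (DESCRIPTION, NAMESPACE, and the R/ directory).

```r
# Hypothetical helper: report which required entries are absent from a package directory
missing_package_files <- function(path,
                                  required = c("DESCRIPTION", "NAMESPACE", "R")) {
  present <- file.exists(file.path(path, required))
  required[!present]
}

# Demonstrate on a throwaway skeleton in a temporary directory
pkg_dir <- file.path(tempdir(), "mypackage-demo")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(pkg_dir, "DESCRIPTION"))

# NAMESPACE has not been created, so it is the only entry reported
missing_package_files(pkg_dir)
```

This only checks that files exist; `devtools::check()` remains the way to validate their contents.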
10.8 Summary
10.8.1 Skills Mastered
- R package development from scratch
- Comprehensive documentation with roxygen2
- Version control with Git and GitHub
- CI/CD pipelines with GitHub Actions
- Code profiling and optimization
- Package distribution (CRAN, GitHub, R-universe)
- Code quality tools (lintr, styler, spelling)
- Professional development workflow
10.8.2 Resources
- R Packages Book by Hadley Wickham
- Writing R Extensions (Official)
- usethis Documentation
- devtools Documentation
- GitHub Actions for R
10.8.3 Next Steps
- Build: Create your first R package
- Test: Achieve 100% test coverage
- Share: Publish on GitHub and CRAN
- Contribute: Contribute to open-source R packages
- Teach: Share your knowledge with the R community
Congratulations!
You’ve completed the entire R Programming course! You now have the skills to:
- Write professional R code
- Build data science projects
- Create interactive applications
- Develop and distribute R packages
- Apply best practices in production environments
Keep coding, keep learning, and contribute to the R community!