2.2 Getting started with R

Download and install R (http://cran.r-project.org/) and RStudio (http://www.rstudio.com/) if you do not have them already. RStudio is optional, but it is a great tool if you are just starting to learn R. You will need specific data sets to run the code snippets in this book; how to install and use the data is explained in the Data for the book section in the Preface. If you have not used RStudio before, we recommend running it and familiarizing yourself with it first. Put simply, its interface combines the features you will need while analyzing data: you can see your code, how it is executed, the plots you make, and your data all in one place.

2.2.1 Installing packages

R packages are add-ons to base R that help you carry out tasks that are not directly supported by base R. It is this extra functionality that makes R excel as a tool for computational genomics. The Bioconductor project (http://bioconductor.org/) is a dedicated package repository for computational biology-related packages. However, the main package repository of R, called CRAN, also has computational biology-related packages. In addition, R-Forge (http://r-forge.r-project.org/), GitHub (https://github.com/), and Bitbucket (http://www.bitbucket.org) are some of the other locations where R packages might be hosted. The packages needed for the code snippets in this book and how to install them are explained in the Packages needed to run the book code section in the Preface of the book.

You can install CRAN packages using install.packages() (# is the comment character in R).

# install package named "randomForest" from CRAN
install.packages("randomForest")

You can install Bioconductor packages with the BiocManager package, which is itself installed from CRAN.

# get the installer package if you don't have it
install.packages("BiocManager")

# install Bioconductor package "rtracklayer"
BiocManager::install("rtracklayer")

You can install packages from GitHub using the install_github() function from the devtools package.

# devtools itself is installed from CRAN: install.packages("devtools")
library(devtools)
install_github("hadley/stringr")

Another way to install packages is from a source file.

# download the source file
download.file(
  "https://github.com/al2na/methylKit/releases/download/v0.99.2/methylKit_0.99.2.tar.gz",
  destfile = "methylKit_0.99.2.tar.gz")
# install the package from the source file
install.packages("methylKit_0.99.2.tar.gz",
                 repos = NULL, type = "source")
# delete the source file
unlink("methylKit_0.99.2.tar.gz")

You can also update CRAN and Bioconductor packages.

# updating CRAN packages
update.packages()

# updating Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install()
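
Since installation can take a while, it can be convenient to check first which packages are missing. The following is a minimal sketch that reuses the requireNamespace() check from the snippet above; missing_packages() is a helper name made up for this illustration and is not part of base R or any package.

```r
# return the subset of 'pkgs' that is not installed yet
# (hypothetical helper, defined here only for illustration)
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE if the package can be loaded;
  # note that it loads the package namespace as a side effect
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

# "stats" and "utils" ship with R, so nothing is reported as missing
missing_packages(c("stats", "utils"))
```

The returned names can then be passed on to install.packages() or BiocManager::install(), depending on where the packages are hosted.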

2.2.2 Installing packages in custom locations

If you will be using R on servers or computing clusters rather than your personal computer, it is unlikely that you will have administrator access to install packages. In that case, you can install packages in custom locations by telling R where to look for additional packages. This is done by creating an .Renviron file in your home directory and adding the following line:

R_LIBS=~/Rlibs

This tells R that the “Rlibs” directory in your home directory will be the first choice of location to look for and install packages (the directory name and location are up to you; the above is just an example). You should create that directory now. After that, start a fresh R session and start installing packages. From now on, packages will be installed to your local directory, where you have read-write access.
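
To confirm that R picks up a custom location, you can inspect the library search path with .libPaths(). The sketch below is only an illustration under stated assumptions: it uses a temporary directory as a stand-in for ~/Rlibs so that it can be run safely, and it prepends the directory for the current session only, whereas the .Renviron entry makes the setting permanent.

```r
# stand-in for your custom library directory
# (assumption: replace with your own location, e.g. "~/Rlibs")
lib_dir <- file.path(tempdir(), "Rlibs")

# create the directory if it does not exist yet
dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)

# put it at the front of the library search path for this session
.libPaths(c(lib_dir, .libPaths()))

# the first entry is where install.packages() installs by default
.libPaths()[1]
```

If the .Renviron file is set up correctly, calling .libPaths() in a fresh session should list your custom directory without the manual step above.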

2.2.3 Getting help on functions and packages

You can get help on functions by using the help() and help.search() functions. You can list the contents of a package, including its functions, with the ls() function.

library(MASS)
ls("package:MASS") # objects (functions and data sets) in the package
ls() # objects in your R environment
# get help on hist() function
?hist
help("hist")
# search the word "hist" in help pages
help.search("hist")
??hist
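
Because ls() returns every object a package exports, it can help to filter the result when you only want functions. The sketch below is one way to do that, using the graphics package because it ships with every R installation; the variable names objs and is_fun are arbitrary. It also shows apropos(), which searches object names rather than help pages.

```r
library(graphics)  # attached by default, loaded here for completeness

# all objects exported by the package
objs <- ls("package:graphics")

# keep only those objects that are functions
is_fun <- vapply(objs,
                 function(x) is.function(get(x, "package:graphics")),
                 logical(1))
head(objs[is_fun])

# find objects whose names start with "hist"
apropos("^hist")
```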

2.2.3.1 More help needed?

In addition, check package vignettes for help and a practical understanding of the functions. All Bioconductor packages have vignettes that walk you through example analyses. A Google search will always be helpful as well; there are many blogs and web pages that have posts about R. The R-help mailing list (https://stat.ethz.ch/mailman/listinfo/r-help), Stack Overflow (stackoverflow.com), and R-bloggers (r-bloggers.com) are usually sources of good and reliable information.
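
Vignettes can also be listed and opened from within R. The following is a small sketch using the grid package, chosen only because it is distributed with R; the same calls should work for any installed package that provides vignettes. The calls that open a viewer or browser are commented out so the snippet can be run non-interactively.

```r
# list the vignettes of an installed package
vig <- vignette(package = "grid")
vig$results[, c("Item", "Title"), drop = FALSE]

# open the first listed vignette (launches a viewer)
# vignette(vig$results[1, "Item"], package = "grid")

# browse the vignettes of a package in a web browser
# browseVignettes("grid")
```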