Chapter 2 Introduction to R for Genomic Data Analysis

The aim of computational genomics is to provide biological interpretation and insights from high-dimensional genomics data. Generally speaking, it is similar to any other kind of data analysis endeavor but oftentimes doing computational genomics will require domain-specific knowledge and tools.

As new high-throughput experimental techniques are on the rise, data analysis capabilities are sought-after features for researchers. The aim of this chapter is to first familiarize readers with data analysis steps and then provide basics of R programming within the context of genomic data analysis. R is a free statistical programming language that is popular among researchers and data miners to build software and analyze data. Although basic R programming tutorials are easily accessible, we are aiming to introduce the subject with the genomic context in the background. The examples and narrative will always be from real-life situations when you try to analyze genomic data with R. We believe tailoring material to the context of genomics makes a difference when learning this programming language for the sake of analyzing genomic data.