Overview:
1. Setting working directory – setwd()
2. Reading CSV files – read.csv()
3. Reading MS Excel files – read_excel()
In the previous posts, we learned about data types and data structures, and I showed how to create data using the seq() and sample() functions. In this post, I shall demonstrate how to read data from folders on your computer. The first step is to set your working directory, which tells R where to look for your files.
Setting working directory:
The working directory is the folder where R reads and saves files by default. R sets a working directory whenever you open R or RStudio. To find out your current working directory, use the function getwd().
# Use getwd() to know your working directory
getwd()
OUTPUT:
[1] "/Users/balachandark"
# This is the default working directory on my computer.
To set the working directory, use the setwd() function. The path of the working directory must be provided within double or single quotes.
# Setting working directory:
setwd("your_file_path")
# To find your file's path:
# For Mac users: Right click on the file --> click Get Info --> Go to Where: and then Right click --> and select Copy as Pathname
# For Windows users: Go to the file's location and copy the address from the address bar.

# Example setting working directory:
# While setting the working directory, separate folders with either a single forward slash / or a double backslash \\
# For Mac users:
setwd("/Users/balachandark/Downloads") # sets the Downloads as working directory
# For Windows users: this code will work (prefix the drive letter, e.g. C:, if the folder is on a different drive)
setwd("\\Users\\balachandark\\Downloads")
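One thing worth knowing is that setwd() changes the working directory for the rest of the session. Here is a minimal sketch of switching folders temporarily and then switching back. It assumes nothing about your own folders: tempdir() stands in for a real folder, and the Downloads path is only an illustration.

```r
# setwd() invisibly returns the previous working directory,
# so it can be stored and restored later.
old_wd <- setwd(tempdir())   # tempdir() is a stand-in for your own folder
getwd()                      # now points at the temporary folder

# file.path() joins folder names with "/", which R accepts on Mac and Windows
downloads <- file.path("~", "Downloads")   # hypothetical folder
downloads_full <- path.expand(downloads)   # replaces ~ with your home directory

setwd(old_wd)                # switch back to where we started
```

Storing the old working directory this way means you do not have to remember or retype the original path.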
Reading data files:
Now we know how to set up our working directory. The next step is to read the files within it. To read a file from a folder that is not your working directory, provide the full path to the file.
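As a small sketch of reading by full path, the example below first creates a tiny CSV file in a temporary folder and then reads it with read.csv() (covered later in this post) without touching the working directory. The folder, file name, and columns are made up for illustration.

```r
# Create a small CSV in a temporary folder (a stand-in for "another folder")
other_folder <- tempdir()
csv_path <- file.path(other_folder, "my_data.csv")   # hypothetical file name
write.csv(data.frame(id = 1:3, score = c(10, 20, 30)),
          csv_path, row.names = FALSE)

# Check that the path is correct before reading
file.exists(csv_path)

# Read the file by its full path; the working directory is left unchanged
my_data <- read.csv(csv_path)
```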
R has two native file formats: .RData files and .rds files. An .RData file can store multiple objects, whereas an .rds file stores a single object. However, in research, I mostly use CSV (comma-separated values) files. Hence, I shall restrict myself to reading CSV and Excel files.
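For completeness, here is a brief sketch of the difference between the two native formats, using the base functions save()/load() and saveRDS()/readRDS(). The object names and file names are made up, and the files are written to a temporary folder.

```r
x <- 1:5
y <- c("a", "b")

# .RData: stores several objects together, along with their names
rdata_path <- file.path(tempdir(), "objects.RData")
save(x, y, file = rdata_path)
rm(x, y)
load(rdata_path)        # restores x and y under their original names

# .rds: stores exactly one object; you choose the name when reading it back
rds_path <- file.path(tempdir(), "x_only.rds")
saveRDS(x, rds_path)
x_copy <- readRDS(rds_path)
```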
Reading CSV files:
To read CSV files, you can use the built-in read.csv() function. While reading the file, you must wrap the file name in single or double quotes.
# Reading CSV files:
# Imagine your data is saved in the file my_data.csv.
read.csv("my_data.csv")
# There are additional arguments - to know about it use the help command
help(read.csv)
# Additional arguments that will help you read the files
read.csv("my_data.csv", header = TRUE) # header = TRUE treats the first row as column names. It is the default for read.csv(), so you can usually omit it.
read.csv("my_data.csv", sep = ",") # sep is the field separator. A comma is the default for read.csv(), so set sep only when your file uses a different separator:
# "\t" - for a tab-separated values file
# ";" - for a semicolon-separated values file
# " " - for a file whose values are separated by spaces
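To make the sep argument concrete, here is a small sketch that writes the same table with two different separators and reads each file back. The file names and columns are made up, and the files live in a temporary folder so the example runs anywhere. Note that the result is saved to an object so that it can be used later.

```r
# Write the same small table with two different separators
df <- data.frame(name = c("a", "b"), value = c(1.5, 2.5))
tab_file  <- file.path(tempdir(), "my_data.tsv")            # hypothetical name
semi_file <- file.path(tempdir(), "my_data_semicolon.csv")  # hypothetical name
write.table(df, tab_file,  sep = "\t", row.names = FALSE)
write.table(df, semi_file, sep = ";",  row.names = FALSE)

# Read each file with the matching separator and save it to an object
tab_data  <- read.csv(tab_file,  sep = "\t")   # equivalent to read.delim(tab_file)
semi_data <- read.csv(semi_file, sep = ";")

str(tab_data)    # inspect column names and types
head(semi_data)  # show the first rows
```

If the separator does not match the file, R usually reads everything into a single column, so checking the result with str() is a quick way to catch the mistake.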
Reading Excel files:
To read Excel files, you need the readxl package, developed by Hadley Wickham.
# First let's install readxl package
install.packages("readxl")
# Load readxl package
library(readxl)
# Read my_excel_file.xls and save to the object my_data
my_data <- read_excel("my_excel_file.xls") # you can also read xlsx file
# Referring to specific sheet name or number
my_data <- read_excel("my_excel_file.xls", sheet="sheet_one") # reads the sheet with the name sheet_one
my_data <- read_excel("my_excel_file.xls", sheet=3) # reads the third sheet of the file
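When you do not know the sheet names in advance, you can list them first. The sketch below assumes readxl is already installed and uses a sample workbook that ships with the package (located with readxl_example()), so it does not depend on any file of your own.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook shipped with readxl
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheet names before deciding which one to read
sheet_names <- excel_sheets(xlsx_path)
sheet_names

# Read the first sheet by name, keeping at most the first 5 rows
first_sheet <- read_excel(xlsx_path, sheet = sheet_names[1], n_max = 5)
first_sheet
```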
In the next blog post, I shall start with data wrangling using dplyr.