R series – 6: Setting working directory and reading data


Overview:
1. Setting working directory – setwd()
2. Reading CSV files – read.csv()
3. Reading MS Excel files – read_excel()

In the previous posts, we learned about data types and data structures, and I showed how to create data using the seq() and sample() functions. In this post, I shall demonstrate how to read data from folders on your computer. To read a data file by its name alone, you first set your working directory, which tells R where to look for the file.

Setting working directory:
The working directory is the folder in which R looks for files to read and, by default, saves files. A working directory is already set when you open R or RStudio. To find out your current working directory, use the function getwd().

# Use getwd() to know your working directory

getwd()

OUTPUT:
[1] "/Users/balachandark" 
# This is the default working directory on my computer.

To set the working directory, use the setwd() function. The path of the working directory must be provided within double or single quotes.

# Setting working directory:

setwd("your_file_path")

# To find your file's path:
# For Mac users: Right click on the file --> click Get Info --> Go to Where: and then Right click --> and select Copy as Pathname
# For Windows users: Go to the file's location and copy the address from the address bar. 
# Example setting working directory:
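# While setting the working directory, separate folder names with either a single forward slash / or a double backslash \\
# (a single backslash starts an escape sequence inside an R string, so it must be doubled)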

# For Mac users:

setwd("/Users/balachandark/Downloads") # sets the Downloads folder as the working directory

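# For Windows users: this code will work when the folder is on the current drive.
# If it is on another drive, start the path with the drive letter, for example "C:\\Users\\balachandark\\Downloads"

setwd("\\Users\\balachandark\\Downloads")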
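
If you would rather not type the slashes yourself, file.path() joins folder names with a separator that R accepts on every platform. The snippet below is a minimal sketch and not part of the original workflow: it uses a temporary folder (tempdir()) and a made-up folder name, r_series_demo, as stand-ins for your own folder so that it runs on any computer.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# tempdir() and "r_series_demo" are stand-ins for a real folder such as Downloads
target <- file.path(tempdir(), "r_series_demo")
dir.create(target, showWarnings = FALSE)

# setwd() invisibly returns the working directory that was in use before the change
previous <- setwd(target)

# Confirm the change; normalizePath() resolves symbolic links before comparing
same_folder <- normalizePath(getwd()) == normalizePath(target)
print(same_folder)

# Go back to where we started
setwd(old_wd)
```

Saving the old directory first and restoring it at the end keeps the rest of your session unaffected by the change.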

Reading data files:
Now we know how to set up our working directory. The next step is to read the files within the working directory. When you wish to read a file in another folder that is not your working directory, please provide the entire path to read the files.
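
To illustrate reading a file by its full path, here is a small sketch. The file name, column names, and values are invented for the example, and a temporary folder stands in for "another folder" so that the code runs as it is.

```r
# Create a small CSV file in a temporary folder (a stand-in for another folder)
other_folder <- tempdir()
full_path <- file.path(other_folder, "my_data.csv")
write.csv(data.frame(id = 1:3, score = c(10, 20, 30)), full_path, row.names = FALSE)

# Read the file by its full path; the working directory is not changed
my_data <- read.csv(full_path)
print(my_data)

# file.exists() is a quick check when R reports that it cannot open a file
print(file.exists(full_path))
```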

R has two native file formats: the .RData file and the .rds file. An .RData file can store multiple objects, whereas an .rds file can store only a single object. However, in research, I mostly use CSV (comma-separated values) files. Hence, I shall restrict myself to reading CSV and Excel files.
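
Although the rest of this post deals with CSV and Excel files, the difference between the two native formats is easy to see in a short sketch. The object names x and y and the temporary file paths are made up for illustration.

```r
x <- 1:5
y <- c("a", "b")

rdata_file <- tempfile(fileext = ".RData")
rds_file <- tempfile(fileext = ".rds")

# save() writes several named objects into one .RData file
save(x, y, file = rdata_file)

# saveRDS() writes exactly one object, without its name
saveRDS(x, file = rds_file)

# load() restores objects under their original names; here into a new environment
restored <- new.env()
load(rdata_file, envir = restored)
restored_names <- sort(ls(restored))
print(restored_names)

# readRDS() returns the object, so you choose the name yourself
x_again <- readRDS(rds_file)
print(x_again)
```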

Reading CSV files:
To read CSV files, you can use the built-in read.csv() function. While reading the file, you must wrap the file name in single or double quotes.

# Reading CSV files:
# Imagine your data is saved in the file my_data.csv.

read.csv("my_data.csv")

# There are additional arguments - to know about it use the help command

help(read.csv)

# Additional arguments that will help you read the files

read.csv("my_data.csv", header = TRUE) # use when the first row holds the column names. header = TRUE is the default for read.csv(), so you can omit it.

read.csv("my_data.csv", sep = ",") # sep is the field separator. A comma is the default for read.csv(), so set sep only when your file uses a different separator:

# "\t" - for a tab-separated file
# ";" - for a semicolon-separated file
# " " - for a file whose values are separated by spaces
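
The sep argument is easiest to understand by trying it. The sketch below writes two tiny files to a temporary location and reads them back; the names and ages in them are invented for the example. It also stores the result in an object, which you will usually want to do so that the data can be used later.

```r
# Write a small semicolon-separated file to a temporary location
semi_file <- tempfile(fileext = ".csv")
writeLines(c("name;age", "Asha;31", "Ravi;28"), semi_file)

# Tell read.csv() that the fields are separated by semicolons
semi_data <- read.csv(semi_file, sep = ";")

# Write and read a tab-separated file; "\t" stands for the tab character
tab_file <- tempfile(fileext = ".tsv")
writeLines(c("name\tage", "Asha\t31", "Ravi\t28"), tab_file)
tab_data <- read.csv(tab_file, sep = "\t")

# Inspect the structure of what was read
str(semi_data)
str(tab_data)
```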

Reading Excel files:

To read Excel files, you need the readxl package developed by Hadley Wickham.

# First, let's install the readxl package

install.packages("readxl")

# Load readxl package

library(readxl) 

# Read my_excel_file.xls and save to the object my_data

my_data <- read_excel("my_excel_file.xls") # read_excel() can also read .xlsx files

# Referring to specific sheet name or number

my_data <- read_excel("my_excel_file.xls", sheet="sheet_one") # reads the sheet with the name sheet_one

my_data <- read_excel("my_excel_file.xls", sheet=3) # reads the third sheet of the file 
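
If you do not know the sheet names in advance, readxl can list them for you with excel_sheets(). The following sketch assumes that readxl is installed and uses datasets.xlsx, a sample workbook that ships with the package, in place of your own file.

```r
# Runs only if readxl is installed (see install.packages("readxl") above)
if (requireNamespace("readxl", quietly = TRUE)) {
  # readxl_example() returns the path to a sample workbook shipped with the package
  example_path <- readxl::readxl_example("datasets.xlsx")

  # List the sheet names before deciding which sheet to read
  sheet_names <- readxl::excel_sheets(example_path)
  print(sheet_names)

  # Read the first sheet by its name
  first_sheet <- readxl::read_excel(example_path, sheet = sheet_names[1])
  print(head(first_sheet))
}
```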

In the next blog post, I shall start with data wrangling using dplyr.
