Tutorial 1

A Brief Introduction



The purpose of this tutorial is to give a brief tour of R by getting you to type some R code.

The code that you write will accomplish two finance-related tasks:

  1. download historical SPY price data from Yahoo Finance
  2. graph the time series of the prices

Don’t worry if what you type seems foreign. As long as you are running code and getting output (even errors), you are making progress.


Let’s Get Typing

Type the following and then press ctrl + shift + enter.

rnorm(5)
[1] -0.04990784  0.24567051 -0.51225640  0.27608265 -0.67427157

What did we do?

  • we typed code that called the function rnorm() with the input 5.

  • we ran the code by pressing ctrl + shift + enter.

What happened?

  • 5 random numbers were drawn from the standard normal distribution (mean 0, standard deviation 1)
  • these numbers were printed to the screen
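
A side note on reproducibility: because rnorm() draws random numbers, your output will differ from the output shown above, and it will change every time you run the code. The sketch below is not part of the original tutorial; it uses the base R function set.seed(), and the seed value 123 is an arbitrary choice for illustration. Setting the same seed before each call should make both calls print the same five numbers.

```r
# Fix the starting point of the random number generator.
# The value 123 is arbitrary; any integer works.
set.seed(123)
rnorm(5)

# Resetting the same seed makes the next call repeat the same draws.
set.seed(123)
rnorm(5)
```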


Code Challenge: Generate a set of 10 random numbers and print them to the screen.


Packages

1. R is free and open source.

2. Anyone can extend R by creating packages of functionality.

3. There are thousands of packages freely available on CRAN.

4. The power of R comes from this huge ecosystem of packages.

5. Proficiency in R entails knowing which packages will be useful for you.


Installing Packages

1. If you want to use a package, you first have to install it on your machine.

2. This is done easily from RStudio by running: install.packages("package_name")

3. Type this code to install the packages that we will need throughout this course.

install.packages("tidyverse")
install.packages("tidyquant")
install.packages("lubridate")

4. These packages now live on your machine.

5. In order to use them in an R session, you have to load them with the library() function.
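
Installing a package that is already on your machine simply reinstalls it, which can be slow. As an optional sketch (not part of the original tutorial), the code below checks which of the three course packages are missing and installs only those. The variable names pkgs and missing_pkgs are made up for illustration, the <- operator stores a value under a name, and the interactive() guard is an added assumption so that nothing is installed when the code runs outside an interactive session such as RStudio.

```r
# The three packages installed in this tutorial.
pkgs <- c("tidyverse", "tidyquant", "lubridate")

# installed.packages() returns a table with one row per installed package;
# its row names are the package names.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Show which packages still need to be installed (possibly none).
missing_pkgs

# Install only the missing ones, and only in an interactive session.
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```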


Loading Packages

1. Whenever you sit down to do some analysis, you will first load the packages that you are going to need.

2. This is done with the following command: library(package_name)

3. Let’s load the two packages we are going to need for this tutorial.

library(tidyverse)
library(tidyquant)

4. When you run the code you will see a bunch of text printed to the console. Don’t worry about this for now.


Using Package Functions

1. In R, most data analysis takes the form of applying functions to input data. The result of the function call is usually more data.

2. For example, we can use the tq_get() function in the tidyquant package to grab price data from Yahoo Finance.

3. The following function call retrieves historical SPY price data - from 2014-2018 - and then prints it to the screen.

tq_get("SPY", get = "stock.prices", from = "2014-01-01"
       , to = "2019-01-01")
# A tibble: 1,258 x 7
   date        open  high   low close    volume adjusted
   <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
 1 2014-01-02  184.  184.  182.  183. 119636900     165.
 2 2014-01-03  183.  184.  183.  183.  81390600     165.
 3 2014-01-06  183.  184.  182.  182. 108028200     164.
 4 2014-01-07  183.  184.  183.  183.  86144200     165.
 5 2014-01-08  183.  184.  183.  184.  96582300     165.
 6 2014-01-09  184.  184.  183.  184.  90683400     166.
 7 2014-01-10  184.  184.  183.  184. 102026400     166.
 8 2014-01-13  184.  184.  181.  182. 149892000     164.
 9 2014-01-14  182.  184.  182.  184. 105016100     166.
10 2014-01-15  184.  185.  184.  185.  98525800     167.
# … with 1,248 more rows

Code Challenge: Copy and paste the tq_get() call from above, and then modify it to grab data only for the month of December 2018.


Variable Assignment

1. Assigning values to variables is an important part of data analysis.

2. A variable can contain data that is as simple as a single character, or as complicated as a five million row data set.

3. In R, we use <- to assign a value to a variable. (Keyboard shortcut in RStudio: alt + “-”.)

4. The following code assigns the 5-year data set of SPY prices to the variable df_spy.

df_spy <- 
    tq_get(x = "SPY", get = "stock.prices", from = "2014-01-01"
           , to = "2018-12-31")
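
Assignment works the same way for small values as it does for a full data set. The following minimal sketch uses only base R; the variable names n_draws and draws are illustrative and do not appear elsewhere in this tutorial.

```r
# Assign a single number to a variable.
n_draws <- 5

# Use the variable anywhere the value itself could appear.
draws <- rnorm(n_draws)

# length() reports how many elements a vector holds.
length(draws)   # 5
```

Note that an assignment prints nothing by itself; the value is stored silently until you ask for it.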


Viewing Variable Contents

1. You can view the contents of a variable by running the name of the variable.

df_spy
# A tibble: 1,257 x 7
   date        open  high   low close    volume adjusted
   <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl>
 1 2014-01-02  184.  184.  182.  183. 119636900     165.
 2 2014-01-03  183.  184.  183.  183.  81390600     165.
 3 2014-01-06  183.  184.  182.  182. 108028200     164.
 4 2014-01-07  183.  184.  183.  183.  86144200     165.
 5 2014-01-08  183.  184.  183.  184.  96582300     165.
 6 2014-01-09  184.  184.  183.  184.  90683400     166.
 7 2014-01-10  184.  184.  183.  184. 102026400     166.
 8 2014-01-13  184.  184.  181.  182. 149892000     164.
 9 2014-01-14  182.  184.  182.  184. 105016100     166.
10 2014-01-15  184.  185.  184.  185.  98525800     167.
# … with 1,247 more rows
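
Printing the whole variable is not the only way to look inside it. The sketch below shows a few base R functions for inspecting a data set. So that it runs without an internet connection, it uses a small stand-in called df_demo (a hypothetical name) built from the first three rows of the printout above, with the rounded values exactly as displayed rather than full-precision prices. The same functions should work on df_spy.

```r
# Stand-in for df_spy: first three rows of the printout above,
# with the rounded close values as displayed.
df_demo <- data.frame(
  date  = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06")),
  close = c(183, 183, 182)
)

nrow(df_demo)          # number of rows
names(df_demo)         # column names
head(df_demo, n = 2)   # first two rows only
df_demo$close          # a single column, extracted with $
```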


Plotting SPY Close Prices

1. Visualization is an important tool in data analysis.

2. The ggplot2 package, which is a part of the tidyverse, makes plotting easy.

3. This code references the price data that we have in df_spy and then plots it.

ggplot(data = df_spy) + geom_line(mapping = aes(x = date, y = close))


Code Challenge: Copy the above ggplot() function call, and then modify the code to graph the adjusted prices instead of the close prices.
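
As an optional extension of the plotting code in this section, the sketch below adds a title and axis labels with the ggplot2 function labs(). It is an illustration rather than part of the original tutorial: df_demo is a hypothetical stand-in built from the first three rows of the SPY printout (rounded values as displayed), and the label text is made up. With df_spy in place of df_demo, the same code should label the full five-year plot.

```r
library(ggplot2)  # already loaded if you ran library(tidyverse)

# Stand-in for df_spy (rounded values copied from the earlier printout).
df_demo <- data.frame(
  date  = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06")),
  close = c(183, 183, 182)
)

# Same line plot as before, with a title and axis labels added via labs().
p <- ggplot(data = df_demo) +
  geom_line(mapping = aes(x = date, y = close)) +
  labs(title = "SPY Close Prices", x = "Date", y = "Close")

p  # running the name of the plot variable draws it
```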