4 min read

Intro to computational finance

Objective

The general objective of this post is to introduce portfolio analysis in R.

A good reference for computational finance is Eric Ziviot’s course. Some years ago this course was available in coursera but it seems to have been removed. This is quite unfortunate as the material is really high quality.

Load packages

options(warn = -1)
library(data.table)
library(magrittr)
library(ggplot2)
library(plotly)
library(quantmod)
options("getSymbols.warning4.0"=FALSE)
options("getSymbols.yahoo.warning"=FALSE)

Getting some data from yahoofinance

  • VTI: is the total US stock market ETF
  • IEF: is the 10 year treasury bond ETF
tickers = c("VTI", "IEF")
data = lapply(tickers, function(tk) 
  getSymbols(tk, from="2007-06-01", to=Sys.Date(), auto.assign=FALSE)
)
names(data) = tickers

head(data[[1]])
##            VTI.Open VTI.High VTI.Low VTI.Close VTI.Volume VTI.Adjusted
## 2007-06-01   76.525   76.785  76.400    76.630     456600     60.14484
## 2007-06-04   76.530   76.810  76.360    76.755     265800     60.24295
## 2007-06-05   76.570   76.570  76.070    76.365     390400     59.93682
## 2007-06-06   76.120   76.120  75.455    75.585     544000     59.32464
## 2007-06-07   75.485   75.565  74.095    74.095     791800     58.15517
## 2007-06-08   74.400   75.145  74.085    75.110    1175600     58.95182

We can see the data comes from yahoo as xts and zoo classes. This helps a great deal to analyze the data as there are multiple financial libraries based on xts and zoo.

class(data[[1]])
## [1] "xts" "zoo"

Convert xts object to data.table

xts and zoo are great packages and there are simple ways to do the following using them. I use data.table as I use it for other non-finance related projects and I believe it helps understand the concepts more clearly.

We use the adjusted price for the analysis that has already dividend/coupon payments.

df_daily = lapply(tickers, function(tk){
  df = as.data.table(data[[tk]])
  setnames(df, names(df), gsub(".*\\.", "", names(df)))
  df[, .(date=index, close=Adjusted, ticker=tk)]
})

Convert series to monthly frequency

In this analysis I’ll use monthly prices. The reason for this is daily prices are noisier whereas monthly prices and therefore monthly returns are much more interesting to make long term analysis.

The last observation of each month is selected from the df_daily list. This is then rbinded in one data.table. The same approach would work with more assets.

df_monthly = lapply(df_daily, function(df) 
  df[endpoints(date, on='months')]
  ) %>% rbindlist

Compute returns

I first sort the data.table. There are other ways to do this and this is perhaps no the most memmory efficient way but this only matters when you are working with larger datasets.

df_monthly = df_monthly[order(ticker, date)]

I first create a laggued close price

df_monthly[, lag:=shift(close, n=1), by=ticker]

Now I compute the simple return.

df_monthly[, R:=close / lag - 1]
df_monthly = df_monthly[complete.cases(df_monthly)]

Monthly returns of VTI and IEF

It’s clear the return distributions of VTI and IEF are different. This makes sense as VTI is a much riskier asset than IEF. Remember VTI is an stock ETFs and IEF is a US treasury bond ETF.

ggplot(df_monthly[ticker %in% c("VTI", "IEF")], 
       aes(x=R, fill=ticker)) + 
  geom_density(alpha=0.6) +
  xlab("Monthly returns")

df_monthly[, .(`Median Yearly Return`=(median(R) * 12)), by=ticker]
##    ticker Median Yearly Return
## 1:    IEF           0.02369223
## 2:    VTI           0.15568391

Cumulative returns

Once the returns are computed all that is needed to get the cumulative return is taking the cumulative product

df_monthly[, Rcumulative:=cumprod(1 + R), by=ticker]

Finally we can plot the cumulative return. Think of this as investing one dolar in each asset when the time serie starts. We can see there is a sharp drop in the VTI serie during 2008, this was the financial crisis. It’s clear that when VTI returns decreased IEF incresed, so these two assets probably have a negative correlation. But it’s clear this is not always the case, between 2014 and 2016 both assets seem to have positive correlation.

plot_ly(df_monthly, x=~date, y=~Rcumulative, color=~ticker) %>%
  add_lines() %>%
  layout(hovermode = 'compare', 
         title = 'Cumulative return of VTI and IEF', 
         yaxis = list(title = 'Cumulative return'))
ret = dcast(df_monthly, date ~ ticker, value.var = "R")
ret[, `Rolling Correlation`:=runCor(VTI, IEF, 6)]

Relation between rolling correlation and cumulative returns

It’s interesting to see how negative the correlation between these two asset classes got to be in the financial crisis. This makes sense and is a common behaviour in market crisis where people panic because stock prices experience a sharp drop, then buy treasury bonds that are a lower risk investment.

ret = merge(ret[, .(date, `Rolling Correlation`)], 
            df_monthly[, .(date, ticker, Rcumulative)], 
            by='date')
mdf = melt(ret, id.vars=c("date", "ticker"))

ggplot(mdf, aes(x=date, y=value, color=ticker)) + 
  geom_line() +
  facet_grid(variable ~ ., scales='free') + 
  theme_bw()