Introduction
The idea of this post is to give a high-level overview of what’s needed to develop a stock screener with R. There are websites that provide an interface for this, in some cases even for free, but it is still useful to develop your own custom screener. Assuming you know what you are looking for, it’s not a complex process.
The main ingredients are:
- Collect the relevant data
- Use a set of criteria to rank stocks
In this example I’ll demonstrate how to screen stocks to implement part of the strategy that’s described in Laurens Bensdorp’s book, “The 30-Minute Stock Trader”. In the book it’s described as “Mean reversion long”. I removed only one parameter for simplicity, so in general terms this should be fairly close to what’s in the book. I’m not affiliated with him, but I thought the book was simply excellent and would recommend it to anyone.
1. Collecting price data
Collecting price data is simple in R. To keep this post a demo I’ll use the Nasdaq-100 stocks, but you could collect prices for more stocks, for example the Russell 1000 or S&P 500 constituents.
In this example I’ll use quantmod, which provides a simple way to access Yahoo Finance data, and data.table to aggregate the result of the screen and query it.
# Load libraries
library(data.table)
library(magrittr)
library(quantmod)
options("getSymbols.warning4.0"=FALSE)
options("getSymbols.yahoo.warning"=FALSE)
# Read data from ishares site to get stock tickers
ticker_lkp = suppressWarnings(fread("https://www.ishares.com/uk/individual/en/products/253741/ishares-nasdaq-100-ucits-etf/1506575576011.ajax?fileType=csv&fileName=CNDX_holdings&dataType=fund", showProgress=F))
# get stock tickers
tickers = ticker_lkp[nchar(ISIN) == 12, `Issuer Ticker`]
I’ll collect the price data in a list so that each ticker can be accessed by its name.
# Get prices as xts
data = list()
for (i in seq_along(tickers)) {
  Sys.sleep(0.005) # not to get rate limited
  ticker = tickers[i]
  from = "2015-01-01"
  to = as.character(Sys.Date())
  data[[ticker]] = try(getSymbols(ticker, from=from, to=to, auto.assign = FALSE, src='yahoo'), silent=TRUE)
}
names(data) = tickers
First I’ll check whether all the stocks were downloaded. It looks like we have 100 stocks plus 2 that have A shares. All objects are of class xts, so it seems the data was collected properly.
# Count the downloaded objects by class (a failed download would show up as try-error)
table(sapply(data, function(x) class(x)[1]))
##
## xts
## 102
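If a download had failed, `try()` would have stored a `try-error` object in the list instead of an `xts` series. Here is a minimal sketch of how such entries could be dropped. It uses a toy list rather than the real `data`, and the names `toy` and `clean` are only illustrative.

```r
library(xts)

# Toy stand-in for the downloaded list: one valid series, one failed download
toy <- list(
  GOOD = xts(c(1, 2, 3), order.by = as.Date("2021-01-01") + 0:2),
  BAD  = try(stop("download failed"), silent = TRUE)
)

# Keep only the elements that really are xts objects
clean <- Filter(function(x) inherits(x, "xts"), toy)
names(clean)
```

If you apply the same idea to the real list, remember to rebuild the ticker vector from `names(clean)` so that later loops only visit tickers that have data.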
Below I show a few examples of how to access the price and volume data of each stock.
tail(data[['MSFT']], 3)
## MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
## 2021-01-20 217.70 225.79 217.29 224.34 37777300 224.34
## 2021-01-21 224.70 226.30 222.42 224.97 30749600 224.97
## 2021-01-22 227.08 230.07 225.80 225.95 30124900 225.95
tail(data[['COST']], 3)
## COST.Open COST.High COST.Low COST.Close COST.Volume COST.Adjusted
## 2021-01-20 354.39 361.90 353.41 361.3 2767200 361.3
## 2021-01-21 361.30 363.99 359.94 362.8 2122600 362.8
## 2021-01-22 363.20 364.63 359.85 362.3 1959100 362.3
2. Develop the screener
In this case I’ll show how to find stocks in a long-term uptrend that seem to have some short-term weakness.
First of all I’ll adjust the OHLC values.
data = lapply(data, adjustOHLC, use.Adjusted=TRUE)
To make the code reproducible I’ll keep the data up to a fixed date (2021-01-15). You can remove this line if you want to get the most recent results.
data = lapply(data, function(d) d['/20210115'])
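The `'/20210115'` string above is an xts date-range subset. Here is a small sketch of how that kind of subsetting behaves, using a made-up daily series (the object names are illustrative):

```r
library(xts)

# Toy daily series covering 2021-01-13 to 2021-01-19
x <- xts(1:7, order.by = as.Date("2021-01-13") + 0:6)

# "/20210115" keeps everything up to and including 2021-01-15
x_cut <- x["/20210115"]
end(x_cut)

# "2021-01-15/" keeps that date and everything after it
x_from <- x["2021-01-15/"]
start(x_from)
```

The same pattern with only a start, for example `"2020/"`, selects everything from the beginning of that year onwards.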
Screener parameters
- Long-term uptrend: Close > SMA-150
- High volatility: ATR-10 > 4% of the closing price
- Short-term oversold: RSI-3 < 30
- Ranked by RSI-3, lowest first
Some definitions:
- ATR: Average True Range
- RSI: Relative Strength Index
- SMA-150: Simple moving average of the past 150 days
Of course this is an example, you can try different parameters.
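To make two of these definitions concrete, here is a rough sketch on a handful of made-up prices. It compares hand calculations of a simple moving average and of the true range with what TTR returns (TTR is the package quantmod loads for these indicators). The prices and the short windows are assumptions chosen only to keep the example small.

```r
library(TTR)

# Hypothetical prices, for illustration only
high  <- c(10, 11, 12, 11, 13)
low   <- c(9, 10, 10, 9, 11)
close <- c(9.5, 10.5, 11, 10, 12)

# SMA-3: mean of the last 3 closes (NA until 3 values are available)
sma3 <- SMA(close, n = 3)
manual_sma3 <- mean(close[3:5])  # hand calculation for the last day

# True range: max(high, previous close) - min(low, previous close)
prev_close <- c(NA, head(close, -1))
manual_tr <- pmax(high, prev_close) - pmin(low, prev_close)
tr <- TR(cbind(high, low, close))[, "tr"]

# ATR as a fraction of the close, the quantity compared with 4% in the screen
atr2_pct <- ATR(cbind(high, low, close), n = 2)[, "atr"] / close
```

The ATR is simply a moving average of the true range, so dividing it by the close gives a volatility figure that is comparable across stocks with very different prices.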
Here is the main function I used, the code is commented.
mean_reversion <- function(data, ticker){
  # Read the price series for this ticker
  df = data[[ticker]]
  # Compute indicators
  hlc = HLC(df)
  sma150 = SMA(Cl(df), n=150)
  atr10 = ATR(hlc, n=10, maType='EMA')
  rsi3 = RSI(Cl(df), n=3)
  # Join them to the main data
  df = merge(df, sma150)
  df = merge(df, atr10$atr)
  df = merge(df, rsi3)
  # Express the ATR as a fraction of the closing price
  df$atr = df$atr / Cl(df)
  df$close = Cl(df)
  # Select relevant columns
  keep_cols = c("SMA", "atr", "rsi", "close")
  df = df[, keep_cols]
  names(df)[1:3] = c("sma150", "atr10", "rsi3")
  # Get the most recent observation
  df = tail(df, 1)
  # Convert the output to a data.table
  dt = as.data.table(df)
  dt[, ticker:=ticker]
  setnames(dt, "index", "date")
  dt
}
Execute the screener and evaluate results
indicators = list()
for(ticker in tickers){
  indicators[[ticker]] = mean_reversion(data, ticker)
}
# Rbind the indicators list
indicators = rbindlist(indicators)
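With a larger universe, some tickers may make the screening function fail. For instance, a 150-day moving average cannot be computed for a stock with a shorter price history. One possible way to handle this is to wrap each call in `tryCatch()` and let `rbindlist()` skip the failures. The sketch below uses a hypothetical stand-in function, `toy_screen()`, and toy data so it can run on its own.

```r
library(data.table)

# Hypothetical stand-in for mean_reversion(): fails when history is too short
toy_screen <- function(data, ticker) {
  x <- data[[ticker]]
  if (length(x) < 3) stop("not enough history")
  data.table(ticker = ticker, last = x[length(x)])
}

# Toy price vectors; BBB is too short and will raise an error
toy_data <- list(AAA = c(1, 2, 3, 4), BBB = c(5, 6), CCC = c(7, 8, 9))

# Return NULL for failing tickers; rbindlist() ignores NULL elements
results <- lapply(names(toy_data), function(tk) {
  tryCatch(toy_screen(toy_data, tk), error = function(e) NULL)
})
results <- rbindlist(results)
results
```

Silently dropping tickers is a trade-off: it keeps the screen running, but it may be worth logging which tickers were skipped.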
head(indicators)
## date sma150 atr10 rsi3 close ticker
## 1: 2021-01-15 113.4679 0.02581628 27.30745 127.14 AAPL
## 2: 2021-01-15 210.9870 0.01823743 24.45446 212.65 MSFT
## 3: 2021-01-15 3132.2195 0.02003178 29.26385 3104.25 AMZN
## 4: 2021-01-15 439.5304 0.05079055 43.58718 826.16 TSLA
## 5: 2021-01-15 262.9280 0.03270333 43.37887 251.36 FB
## 6: 2021-01-15 1602.1517 0.02265435 28.77536 1736.19 GOOG
1. Long-Term Trend
Keep stocks with a close at or above the SMA-150.
indicators = indicators[close >= sma150]
nrow(indicators)
## [1] 87
We get 87 stocks with this filter alone.
2. ATR-10 > 4%
indicators = indicators[atr10 >= 0.04]
nrow(indicators)
## [1] 15
After the ATR-10 filter we get 15 stocks.
3. RSI-3 < 30
indicators = indicators[rsi3 <= 30]
nrow(indicators)
## [1] 3
4. Rank by RSI-3
This last part makes more sense if you include more stocks in the example, using an index such as the Russell 1000 or S&P 500, but the code is the same. I’ll leave the ranking part in to make the code complete.
indicators[, rk:=frank(rsi3)]
indicators[order(rk)]
## date sma150 atr10 rsi3 close ticker rk
## 1: 2021-01-15 116.08595 0.04108247 14.94451 136.60 XLNX 1
## 2: 2021-01-15 78.94353 0.04119751 15.14981 88.21 AMD 2
## 3: 2021-01-15 105.78900 0.06850013 21.49396 161.20 PDD 3
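The three filters and the ranking can also be written as a single chained data.table expression. Here is a sketch of that, using a small table built from rows that appear in the outputs of this post (the name `toy` is only illustrative):

```r
library(data.table)

# Rows copied from the outputs shown in this post
toy <- data.table(
  ticker = c("AAPL", "TSLA", "AMZN", "XLNX", "AMD", "PDD"),
  sma150 = c(113.4679, 439.5304, 3132.2195, 116.08595, 78.94353, 105.78900),
  atr10  = c(0.02581628, 0.05079055, 0.02003178, 0.04108247, 0.04119751, 0.06850013),
  rsi3   = c(27.30745, 43.58718, 29.26385, 14.94451, 15.14981, 21.49396),
  close  = c(127.14, 826.16, 3104.25, 136.60, 88.21, 161.20)
)

# Trend, volatility and oversold filters in one step, then sort by RSI-3
screened <- toy[close >= sma150 & atr10 >= 0.04 & rsi3 <= 30][order(rsi3)]
screened[, rk := frank(rsi3)]
screened$ticker
```

Applying the filters one at a time, as above, has the advantage of showing how many stocks survive each step. The chained form is more compact once the parameters are settled.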
Example plot of one stock that was ranked in the top-10
At the time I wrote the post, PDD was a good example of what the strategy tries to find. It’s clear that the stock is in a long-term uptrend and recently had a pullback.
df = data[['PDD']]["2020/"]
{
  plot(Cl(df))
  lines(SMA(Cl(df), n = 20), col="blue")
  lines(SMA(Cl(df), n = 50), col="red", lty=2)
  # add legend to panel 1
  addLegend("topleft", legend.names = c("Close", "SMA(20)", "SMA(50)"),
            lty=c(1, 1, 2), lwd=c(2, 1, 1),
            col=c("black", "blue", "red"))
}
I hope you enjoyed the post and found it useful!
Disclaimer - Not financial advice
The information contained in this website and the resources available are not intended as, and shall not be understood as, financial advice. These are simply educational tools. Use at your own risk.