6 min read

Exploratory analysis of corporate bonds data

Introduction

The objective of this post is to explore current bonds data. This data is available from my broker so I guess most investors should be able to access it. It has helped me a great deal to download this data and use a simple approach to find mispriced bonds.

Reading the data

The file I’ll work with has 69921 observations and is available for download. Each of this is the average price of the bond for the day the data was extracted. Below are a few rows of the data.

library(data.table)
library(ggplot2)

data_file = "https://gist.githubusercontent.com/martinbel/13ddba5c85a1e48946f6cd04c750dd13/raw/2d09839ccedfed2127ab5a86efa980ecfa4ee435/bonds_data.csv"
dt = fread(data_file)

setnames(dt, "p_rating", "sp_rating")
# Compute the years to maturity
dt[, maturity_year:=round(as.numeric(maturity - as.IDate("2021-01-22"))/365, 2)]

# keep bonds that haven't matured yet
dt = dt[maturity > as.IDate("2021-01-22")]

# Convert ytm to a percent
dt[, ytm:=ytm/100]

# Order rating values for S&P and Moody's
sp_levels = c("AAA", "AA+","AA-","AA","A+","A","A-","BBB+","BBB","BBB-","BB+","BB","BB-","B","B+","B-","NR","CCC","CCC+","CC","CCC-","C","D")
dt[, sp_rating:=factor(sp_rating, levels=sp_levels)]

dt[moodys_rating == 'a1', moodys_rating:=NA]
moodys_levels = c("Aaa","Aa1","Aa2","Aa3","A1","A2","A3","Baa1","Baa2","Baa3","Ba1","Ba2","Ba3","B1","B2","B3","NR","WR",
                  "N/R","Caa1","Caa2","Caa3","Ca")
dt[, moodys_rating:=factor(moodys_rating, levels=moodys_levels)]

dt[1:2]
##    sp_rating moodys_rating               description coupon   maturity callable
## 1:       BBB          Baa2        Textron Inc. 3.65%   3.65 2021-03-01      Yes
## 2:       BBB          Baa2 Kinder Morgan Energy 5.8%   5.80 2021-03-01      Yes
##    quote qty   price min max     ytm ytw1 accrued_interest estimated_total
## 1:   Ask  25 100.904   2 178 0.00033   NA           228.13        25454.13
## 2:   Ask  16 101.444   1  16 0.00023   NA           232.00        16463.04
##    current_yield maturity_year type       date
## 1:     0.0361730           0.1   IG 2020-11-27
## 2:     0.0571744           0.1   IG 2020-11-27

High level statistics

The majority of bonds are of IG (investment grade).

dt[, .N, type]
##    type     N
## 1:   IG 60513
## 2:   HY  9404

The following plot shows the frequency of bonds for each S&P rating. The BBB, A and BB bonds are the most frequent.

sp_counts = dt[, .N, .(sp_rating)][order(-N)]
sp_counts[, prop:=N/sum(N)]

ggplot(sp_counts, aes(sp_rating, prop, group=1)) + 
  geom_line() + 
  geom_point() +
  scale_y_continuous(labels=scales::percent) +
  xlab("S&P Rating") + ylab("Proportion") +
  ggtitle("Proportion of bonds by S&P Rating")

Yield to maturity distribution

The following plot shows the YTM distribution for HY and IG bonds. The coord_cartesian call filters the data I show in the plot but doesn’t remove it from the available data. The HY distribution has some observations between 10% to 20% YTM that are left out but could potentially be interested in some cases.

ggplot(dt, aes(ytm, y=after_stat(density), fill=type)) + 
  geom_histogram(binwidth = 0.0025, alpha=0.8) + 
  geom_density(binwidth = 0.0025, alpha=0.2) +
  facet_grid(type ~ .) + 
  scale_x_continuous(labels=scales::percent, breaks=seq(0, 10, 0.02)) +
  coord_cartesian(xlim=c(0, 0.10))
## Warning: Ignoring unknown parameters: binwidth
## Warning: Removed 206 rows containing non-finite values (stat_bin).
## Warning: Removed 206 rows containing non-finite values (stat_density).

Based on this we can arrive to a few conclusions. The HY distribution is centered ~4% YTM whereas the IG distribution is bimodal. This means we probably need another variable to explain the YTM.

YTM & maturity

The following plot shows the relation between YTM and maturity for multiple S&P ratings.

drop_ratings = c("D", "C", "CCC-", "CC", "CCC+", "CCC", "NR")

ggplot(dt[!(sp_rating %in% drop_ratings)][ytm <= 0.15], aes(maturity_year, ytm, color=type)) + 
  geom_point(alpha=0.3) +
  xlab("Years to Maturity") +
  facet_wrap(~sp_rating, scales='free_y') + 
  scale_y_continuous(labels=scales::percent) + 
  coord_cartesian(xlim=c(0, 50))

How I screen for High yield bonds

dt[ytm <= 0.2, quantile(ytm, seq(0, 1, 0.1), na.rm=T)]
##      0%     10%     20%     30%     40%     50%     60%     70%     80%     90% 
## 0.00002 0.00478 0.00782 0.01100 0.01512 0.01963 0.02468 0.02763 0.03148 0.03987 
##    100% 
## 0.19780

I’d start looking at the companies yielding +10% that have bonds maturing in more than 2 years. Most of these are probably awful companies that could default and go bankrupt BUT here is the place where mispriced bonds live.

dt_hy = dt[!(sp_rating %in% drop_ratings)]
dt_hy = dt_hy[ytm > 0.10 & maturity_year >= 2]
dt_hy[1:3]
##    sp_rating moodys_rating                description coupon   maturity
## 1:        B+            B1 The GEO Group, Inc. 5.125%  5.125 2023-04-01
## 2:        B+            B3  QEP Resources, Inc. 5.25%  5.250 2023-05-01
## 3:         B            B3 Precision Drilling C 7.75%  7.750 2023-12-15
##    callable quote qty  price min max     ytm   ytw1 accrued_interest
## 1:      Yes   Ask  25 87.100   2 248 0.11584 11.584           213.54
## 2:      Yes   Ask  25 86.675   5 100 0.11737 11.737           109.38
## 3:      Yes   Ask  15 88.100   2  15 0.12579 12.579           536.04
##    estimated_total current_yield maturity_year type       date
## 1:        21988.54    0.05884041          2.19   HY 2020-11-27
## 2:        21778.13    0.06057110          2.27   HY 2020-11-27
## 3:        13751.04    0.08796822          2.90   HY 2020-11-27

These are 102 bonds, but how many companies are in there? If I leave out the coupon information from the description I get ~10 companies. Now it looks a lot more simple to look at.

dt_hy[, unique(trimws(gsub("[[:digit:]]|%|\\.", "", description)))]
##  [1] "The GEO Group, Inc"   "QEP Resources, Inc"   "Precision Drilling C"
##  [4] "SESI, LLC"            "Laredo Petroleum, In" "Antero Resources Cor"
##  [7] "Exterran Energy Solu" "SM Energy Co"         "PBF Holding Co LLC"  
## [10] "Macys Inc"            "Macy's, Inc"          "MAY DEPT STORES CO"

From these I would avoid energy companies. But GEO and M look like viable alternatives considering we are looking for a YTM that’s more than twice the median YTM for HY bonds. So now I can go back and filter the companies I’m interested. Let’s take a look at Macy’s Bonds for example.

dt_macys = dt[grep("Macy|MAY DEPT STORES CO", description)]

The plot shows the 3 most important variables I look at when screening for bonds. From the plot I can see that bonds maturing in 5 to 10 years have the highest YTM and current yield.

ggplot(dt_macys, aes(maturity_year, ytm, color=current_yield)) + 
  geom_point() + 
  xlab("Years to Maturity") + ylab("YTM") +
  ggtitle("Macy's Bonds key variables") + 
  scale_y_continuous(labels=scales::percent)

Now I just need to finish filtering the data and decide which bond to buy.

dt_macys[maturity_year %between% c(5,13) & ytm >= 0.09 & current_yield > 0.08][order(-ytm)]
##    sp_rating moodys_rating               description coupon   maturity callable
## 1:         B            B1         Macy's, Inc. 6.9%  6.900 2029-04-01      Yes
## 2:         B            B1           Macys Inc. 6.7%  6.700 2028-09-15      Yes
## 3:         B            B1 MAY DEPT STORES CO 7.875%  7.875 2030-03-01      Yes
## 4:         B            B1   MAY DEPT STORES CO 6.9%  6.900 2032-01-15      Yes
## 5:         B            B1         Macy's, Inc. 6.9%  6.900 2029-04-01      Yes
## 6:         B            B1   MAY DEPT STORES CO 6.9%  6.900 2032-01-15      Yes
##    quote qty  price min max     ytm ytw1 accrued_interest estimated_total
## 1:   Ask   2 79.000   2   2 0.10778   NA            23.00         1603.00
## 2:   Ask  10 79.436   5  10 0.10646   NA           141.44         8085.04
## 3:   Ask  19 85.400   1  19 0.10362   NA           374.06        16600.06
## 4:   Ask  25 78.050   5  40 0.10251   NA           651.67        20164.17
## 5:   Ask   5 86.136   5   5 0.09331   NA            64.21         4371.01
## 6:   Ask  15 84.050   5  15 0.09224   NA           411.13        13018.63
##    current_yield maturity_year type       date
## 1:    0.08734177          8.19   HY 2020-11-27
## 2:    0.08434463          7.65   HY 2020-11-27
## 3:    0.09221311          9.11   HY 2020-11-27
## 4:    0.08840487         10.99   HY 2020-11-27
## 5:    0.08010588          8.19   HY 2020-12-04
## 6:    0.08209399         10.99   HY 2020-12-04

This provides a nice list of bonds to look into in more detail. One more caveat, it’s possible these prices are not 100% correct or the bond spreads are too high.

Disclosure:

The information contained in this website and resources available are not intended and shall not be understood as financial advise. These are simply educational tools. Use at your own risk.