1 Introduction

The purpose of this initial chapter is to provide some context and motivation for how and why I wrote this book. For those of you who want to get to the good stuff, it can safely be skipped.

1.1 Delta-Neutral

This book would not have been possible without the support of my data partners at Delta Neutral. When I was asked to teach a data science course at the University of Minnesota School of Mathematics, I knew I wanted a real-world data set to be the bedrock of the curriculum. A few years prior, I had purchased data from Delta Neutral in order to complete a backtest that I was working on. I was immeasurably impressed with the data, especially when compared to that of their competitors, both in terms of price and quality.

Delta Neutral was founded in 2002 by Rick Fortier. At the time he was looking for high-quality daily option data, and there simply were no reasonable sources, so he saw a business opportunity. Options data is complex, and Rick and his team have hit an unequivocal home run with their product. When I approached Rick about showcasing his data in my courses, he was excited to help spread the knowledge of options and data analysis through this innovative channel.

Nearly all the data analysis examples in this book come from Delta Neutral’s Level 3 product. This is a phenomenal data set of the highest quality. It contains daily prices for all options traded on thousands of equity underlyings, and goes back to 2002. Feel free to reach out to me, or to the team at Delta Neutral, if you have further questions about their data offerings: support@deltaneutral.com.

1.2 Why This Book

This book emanated from a course I was asked to teach for the Master of Financial Mathematics (MFM) program at the University of Minnesota. My initial vision for the course was a short but intensive module on backtesting options trading strategies. It would be geared towards advanced students who had already completed year-long sequences in both options theory and computer programming. The idea was that this module would help students put their new skills to the test with a challenging data analysis problem - precisely the kind of project they would be required to execute on the job.

Due to a scheduling conflict, this initial vision had to be modified. Instead I would be teaching similar material, but at a more introductory level, to the first-year students. The typical first-year MFM student is a freshly graduated math major with minimal previous exposure to options or programming. These are generally very smart students with a solid education, but with little knowledge that is directly applicable to the task of backtesting options.

As I pondered how I might go about teaching such a course, it occurred to me that I was going to have to present both the basic definition of an option and the basic definition of a for-loop. Guiding students from that starting point to being able to backtest trading strategies struck me as a markedly tall order. But I was intrigued by the challenge, and so I accepted.

While designing the course, I knew that I wanted to introduce data analysis material as quickly as possible, but that I was also going to have to cover a lot of options basics up front as well. I quickly realized that basic coding material has been presented exceptionally well in texts such as R for Data Science, so I could simply assign readings from these texts as self-study assignments. I also knew that introductory options material has been covered superbly in texts such as Hull and McDonald.

The key motivating insight for me was that many important facts about options can be best illustrated by manipulating options data. As I reflected, I concluded that much of my greatest learning about options came when my hands were very dirty in real-world data (and NOT from writing out equations with pen and paper). Moreover, by learning about options via programming, you also develop the single most important practical skill for a quantitative finance professional: expressing financial ideas in code.

Thus, I set out to write this new approach to finance: Learning Options from Data (LOD).

1.3 Who Is This Book For?

I classify the target audience for this book in terms of the particular subject matter that they are primarily interested in learning:

Options -and- Coding: The readers who will get the most out of this book will be those that are similar to my MFM students - students who want to learn data analysis in a quantitative finance context, but who still need to build foundations in both areas. My University of Minnesota courses are based on systematically working through this book, as well as completing the further reading assignments.

Coding: Suppose you are a professional who has a solid knowledge of options, and you would like to build some additional data analysis skills in R or Python. (Shameless plug: these are precisely the kind of learners that are served by the bespoke training courses from Option Data Science.) You can use this book by jumping straight to the coding exercises, and perusing the explanatory material as needed. You will likely spend a fair amount of time on the prerequisite reading assignments from the coding books mentioned in the next section.

Options: Suppose you are a data scientist or developer who needs to apply your skills in a quantitative finance context. For you, the coding will be simple, but you will need to focus on the explanatory finance material to be able to truly understand the coding assignments. You will spend a fair amount of time on the prerequisite reading assignments from the finance books mentioned in the next section.

Delta-Neutral: If you have already purchased, or are considering purchasing, the Delta-Neutral data set, this book will serve as a guided tour of the data. The coding exercises in this book will greatly accelerate your learning curve. As you probably know, a huge amount of time gets spent figuring out a new data set and performing initial wrangling tasks. This book seeks to give you a head start on the journey of wrangling the Delta-Neutral data set.

1.4 How to Read

This section details some of the guiding principles I enlisted in writing this book; these provide insights on how to use the book for maximum return on your investment of time.

1.4.1 Guiding Philosophies

Coding without Context Sucks

I have a huge amount of affection for Hadley Wickham’s R for Data Science and similar titles. So much so that this book was written in its image. So much so that this morning I spent 20 minutes figuring out how to get my welcome page to look exactly like his.

A general data analysis book like R4DS, by its very nature, cannot be context specific. But ultimately, I’m usually left feeling a little bit empty and drowsy performing analysis on toy data sets about cars or flights. The truth is that it’s always more enjoyable and effective to learn data analysis when you care about the data. That’s really the thrust of this book - learning data analysis in the context of options.

Type Along

Pen and paper is dead. If you’re not typing code and watching output come out, you’re not really learning. The interactive coding environments of R and Python are ideal for this kind of learning. Even if you think you have no idea of what is going on, if there is a coding example, force yourself to type it out (copy/paste doesn’t count) and make sure you get the same result.

Don’t Reinvent the Wheel

So much for avoiding cliches. The point is that there has been a ton of great introductory material written on both data analysis and options. While I do present my take on some of this material, I also assign reading from a few other texts (which are mentioned below). This book is still only weakly self-contained, and is best studied alongside the companion texts.

1.4.2 Prerequisites

The math prerequisite to mastering options is a strong intuitive understanding of calculus, linear algebra, probability, and statistics. For most people, this can really only be acquired from a 3-semester calculus sequence and semester-long undergraduate courses in the other three areas. The options self-help community would probably argue that you don’t need that much math, but I disagree. There are certainly diminishing practical returns to advanced subjects like stochastic calculus and functional analysis, but the areas I detail above are a far cry from that.

In my experience, the most productive finance professionals are those who have a strong intuitive understanding of basic undergraduate mathematics, who don’t get too caught up in theoretical details, and who focus on implementing their knowledge in code as quickly as possible.

This book is designed for learning data analysis in the context of quantitative finance, but I don’t cover the basics of either topic (data analysis or quantitative finance) thoroughly. So if you are lacking basic knowledge in either area, you’ll need to spend some time on the suggested reading.

1.4.3 Further Reading

This book is a hybrid of an introductory data analysis text and an introductory options text. It’s a jack of two trades, and master of neither. LOD is designed to be a companion to the books detailed below. Throughout this text, I will assign readings from these other books for one of two reasons: 1) addressing basics that have already been explained well elsewhere; 2) a deeper dive into material for those who want to know more. The amount of further reading you will need will depend on your background and your interests.

Data Analysis:

R for Data Science - Hadley Wickham and Garrett Grolemund

Advanced R - Hadley Wickham

Python for Data Analysis - Wes McKinney

Python Data Science Handbook - Jake VanderPlas

Quantitative Finance:

Options, Futures, and Other Derivatives - John Hull

Derivatives Markets - Robert McDonald

1.5 Acknowledgments

The layout of this book was adapted from the bookdown files of two great books: R Markdown: The Definitive Guide and R for Data Science. Thank you to the authors Yihui Xie, J. J. Allaire, Garrett Grolemund, and Hadley Wickham for making the files freely available on GitHub.

Many thanks to the students at the University of Minnesota, whose questions and curiosities helped to guide the writing of this book.

1.6 About the Author

This book was written by Pritam Dalal, founder of Option Data Science - a boutique consultancy focused on the intersection between data science and volatility markets.

Prior to launching ODS, Pritam held a variety of trading and analysis roles at firms such as Cargill, Wolverine Trading, Allianz Investment Management, and Two Harbors Investment Corp. A common thread running through all of his work has been data-driven analysis, coupled with the ability to communicate those findings to fellow decision makers. These skills have been applied to areas ranging from volatility trading in commodity markets, to equity option market-making, to hedging complex annuities.

In addition to serving clients, Pritam is an adjunct professor in the University of Minnesota School of Mathematics where he teaches graduate students about applying data science to quantitative finance.

Pritam holds a Bachelor’s in Mathematics and Economics from U.C. Berkeley, and a Master’s in Financial Mathematics from the University of Minnesota, Twin Cities.

Feel free to reach him by phone or e-mail: 206.802.5525; dalal030@umn.edu.