
predicting-implied-volatility: Introduction

DBR CLOUD POC

A day in the life of a Quantitative Researcher

In this solution we reproduce the most common tasks quantitative researchers perform, namely 1. developing new quantitative models, such as asset-allocation models or novel risk-adjusted performance metrics (to account for non-standard risk), based on academic papers, and 2. designing experiments to test these models.

We will implement the logic of the academic paper Deep Learning Volatility (Horvath et al., 2019), build the proposed model, and productionalize everything using various Databricks services (see the Architecture at the end of this notebook).

The aim of the paper is to build a neural network that acts as an off-line approximation of complex pricing functions, which are difficult to represent or time-consuming to evaluate by other means. This, in turn, removes the calibration bottleneck caused by slow pricing of derivative contracts.

Link to the paper - https://arxiv.org/pdf/1901.09647.pdf
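To make the idea concrete, below is a minimal sketch (not the paper's exact architecture): a small feed-forward network that maps a vector of model parameters to a grid of implied volatilities, so that a forward pass replaces the expensive pricing call during calibration. The layer sizes, grid shape, and parameter count are assumptions for illustration.

```python
# Minimal sketch (assumptions, not the paper's exact setup): a feed-forward
# network mapping model parameters to an implied-volatility grid, so that a
# forward pass replaces the slow pricer during calibration.
import tensorflow as tf

N_PARAMS = 4          # e.g. a 4-parameter stochastic volatility model (assumed)
GRID_POINTS = 8 * 11  # assumed strikes x maturities grid

approximator = tf.keras.Sequential([
    tf.keras.Input(shape=(N_PARAMS,)),
    tf.keras.layers.Dense(30, activation="elu"),
    tf.keras.layers.Dense(30, activation="elu"),
    tf.keras.layers.Dense(30, activation="elu"),
    tf.keras.layers.Dense(GRID_POINTS, activation="linear"),
])
approximator.compile(optimizer="adam", loss="mse")

# Training data: (model parameters -> implied vol grid) pairs produced once,
# off-line, by the expensive pricing routine.
```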

Why Databricks Lakehouse for "Deep Learning Volatility"?

  1. Scale: The burst capacity of the Databricks Runtime and Photon can run the paper's computationally intensive synthetic data generation extremely quickly and cost-efficiently (see the first sketch after this list).
  2. DataOps - Feature Store: Databricks Feature Store keeps the generated features in a highly efficient format (as Delta tables), ready for both online and offline training, eliminating the need to re-run the generation many times and avoiding additional cost (also covered by the first sketch after this list).
  3. Collaboration between R and Python: After generating the synthetic data, we need to test its quality and identify potential statistical issues, such as heteroskedasticity, since this data will be used to train regression models. For this we use R packages, as R is built specifically for statistical tasks. This demonstrates how simply Python and R can be combined in the same interactive Databricks notebook, playing to the strengths of each language (and its libraries) without rewriting R libraries or provisioning additional notebooks or clusters (see the second sketch after this list).
    • We will also use the built-in dashboarding capabilities of the Databricks Notebook to visualize the pair-plot of the generated data, and
    • The automated Data Profile feature of the Databricks Notebooks will help us observe the distribution and overall quality of the generated data.
  4. MLOps - ML experiments and deployment: The paper requires training many models at the same time. Keeping track of each one (hyperparameter tuning, computation time, feature selection, and more) quickly becomes inefficient when handling so many models simultaneously. This is where MLflow experiment tracking steps in to streamline model development (see Notebook 03_ML and the third sketch after this list).
  5. Productionalization: Finally, we use Databricks Workflows to orchestrate the end-to-end execution and deployment. Databricks Workflows is the fully managed orchestration service for all your data, analytics, and AI needs. Tight integration with the underlying lakehouse platform ensures you create and run reliable production workloads on any cloud, while providing deep, centralized monitoring that stays simple for end users (see Notebook 04_Productionalizing).
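First sketch (points 1 and 2 above): distributing synthetic data generation across the cluster with a grouped pandas function and persisting the result via the Databricks Feature Store client. The parameter names, bucket count, placeholder pricer, and feature table name are all assumptions for illustration, not the accelerator's actual code.

```python
# Hedged sketch: distribute synthetic (parameters -> implied vol) generation
# across the cluster, then persist it as a feature table. All names below are
# illustrative assumptions.
import numpy as np
import pandas as pd
from pyspark.sql.functions import col
from databricks.feature_store import FeatureStoreClient

def generate_batch(pdf: pd.DataFrame) -> pd.DataFrame:
    # Placeholder for the paper's (slow) pricing routine: here we just fill the
    # implied-vol grid with random values so the pipeline shape is visible.
    rng = np.random.default_rng()
    pdf["iv_grid"] = [rng.uniform(0.05, 0.8, size=88).tolist() for _ in range(len(pdf))]
    return pdf

# One row per Monte Carlo draw of model parameters (names assumed, not the paper's exact set).
params = (
    spark.range(1_000_000)  # `spark` is the notebook's SparkSession
    .selectExpr("id", "rand() as xi", "rand() as nu", "2*rand()-1 as rho", "rand()/2 as H")
    .withColumn("bucket", col("id") % 512)  # 512 parallel buckets (assumed)
)

synthetic = params.groupBy("bucket").applyInPandas(
    generate_batch,
    schema="id long, xi double, nu double, rho double, H double, bucket long, iv_grid array<double>",
)

fs = FeatureStoreClient()
fs.create_table(
    name="quant_research.synthetic_vol_features",  # assumed database.table name
    primary_keys=["id"],
    df=synthetic,
    description="Synthetic implied-volatility training data (sketch)",
)
```

Second sketch (point 3): a Python cell registers the generated data as a temporary view, and an R cell (shown here as comments, prefixed with %r in the notebook) reads it back with SparkR for statistical diagnostics. The view name is an assumption.

```python
# Databricks notebook sketch: a Python cell shares the generated data with an
# R cell through a temporary view. The view name is an assumption.

# --- Python cell ---
synthetic.createOrReplaceTempView("synthetic_vol")  # `synthetic` from the sketch above

# --- R cell (prefix the notebook cell with %r) ---
# library(SparkR)
# vol_df <- collect(sql("SELECT xi, nu, rho, H FROM synthetic_vol LIMIT 10000"))
# # run the usual diagnostics in R, e.g. heteroskedasticity tests or pair plots
```

Third sketch (point 4): wrapping one training run in an MLflow run so that hyperparameters, metrics, and the fitted model are tracked. The parameter names and metric are illustrative, and `approximator`, `X_train`, etc. are assumed to come from earlier cells.

```python
# Sketch of MLflow experiment tracking for one of the many models trained in
# notebook 03_ML. `approximator`, `X_train`, `y_train`, `X_val`, `y_val` are
# assumed to exist from earlier cells; parameters and metric are illustrative.
import mlflow
import mlflow.tensorflow

mlflow.tensorflow.autolog()  # logs layers, epochs, and loss curves automatically

with mlflow.start_run(run_name="vol_approximator"):
    mlflow.log_param("hidden_units", 30)
    mlflow.log_param("hidden_layers", 3)
    history = approximator.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=200, batch_size=256,
    )
    mlflow.log_metric("best_val_mse", min(history.history["val_loss"]))
```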
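Each sketch above is a rough outline under stated assumptions, not the accelerator's implementation; the actual code lives in the numbered notebooks of this repo.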
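In particular, table names, notebook names other than 03_ML and 04_Productionalizing, and all helper functions shown in the sketches are placeholders.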

[email protected]



© 2022 Databricks, Inc. All rights reserved. The source in this notebook is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.

library            description          license      source
PyYAML             Reading YAML files   MIT          https://github.com/yaml/pyyaml
TensorFlow         Machine Learning     Apache 2.0   https://github.com/tensorflow/tensorflow
TF Quant Finance   Machine Learning     Apache 2.0   https://github.com/google/tf-quant-finance

To run this accelerator, clone this repo into a Databricks workspace. Attach the RUNME notebook to any cluster running a DBR 11.0 or later runtime, and execute the notebook via Run-All. A multi-step-job describing the accelerator pipeline will be created, and the link will be provided. Execute the multi-step-job to see how the pipeline runs.

The job configuration is written in the RUNME notebook in JSON format (a rough sketch of its shape follows below). The cost associated with running the accelerator is the user's responsibility.
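For orientation only, a multi-task Workflows job definition of this kind has roughly the shape below, expressed here as a Python dict; the task keys, notebook paths, and dependency graph are placeholders, not the values the RUNME notebook actually writes.

```python
# Rough sketch of a multi-task Databricks Workflows job definition; task keys,
# notebook paths, and the dependency graph are placeholders, not the values
# the RUNME notebook actually writes.
job_config = {
    "name": "predicting-implied-volatility",
    "tasks": [
        {
            "task_key": "generate_synthetic_data",                     # placeholder
            "notebook_task": {"notebook_path": "01_data_generation"},  # placeholder path
        },
        {
            "task_key": "train_models",
            "depends_on": [{"task_key": "generate_synthetic_data"}],
            "notebook_task": {"notebook_path": "03_ML"},  # notebook named in this README
        },
    ],
}
```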

Contributors

aamend, borisbanushev, dbbnicole

