Comments (7)
Hey,
Assume you have a trading model that decides the side of a bet, long or short. The usual approach is to open the position right away. Meta-labeling is about deciding whether or not to open that position.
In practice, since you know the side of the bet and you have your labels, you already know whether each signal was a profit or a loss (two classes, 1 and 0). Hence the meta-labels reflect the PnL of the trading strategy.
Using these two classes as the response variable and the same feature set as the input, you can train a binary classifier such as logistic regression to decide whether or not to open those positions. You can also size the bet using the predicted probability.
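A minimal sketch of that workflow (all names and the data are hypothetical; synthetic features stand in for a real strategy's inputs):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical setup: X holds the same features the primary model used,
# side is the primary model's bet direction (+1 long, -1 short), and
# meta_y marks whether acting on that side was profitable (1) or not (0).
n = 1000
X = rng.normal(size=(n, 3))
side = np.sign(rng.normal(size=n))
meta_y = (X[:, 0] * side + rng.normal(scale=0.5, size=n) > 0).astype(int)

# The secondary (meta) model sees the features plus the side.
X_meta = np.column_stack([X, side])
meta_model = LogisticRegression().fit(X_meta, meta_y)

# The predicted probability of a profitable bet doubles as a bet-size
# signal: trade only when p > 0.5, scaling the position by confidence.
p = meta_model.predict_proba(X_meta)[:, 1]
bet_size = np.where(p > 0.5, p, 0.0) * side
```

The key point is that the meta-model never predicts direction, only whether to act on the primary model's direction and how much to risk.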
from adv_fin_ml_exercises.
I had some similar questions. I read all the papers in the bibliography and have built up some intuition, but I still have many questions. I wrote the following notebook on Quantopian illustrating meta-labeling, but using MNIST data: https://www.quantopian.com/posts/meta-labeling-advances-in-financial-machine-learning-ch-3-pg-50
Later on in the bet sizing chapter de Prado expands a bit more on the relevance of meta labeling.
Thank you very much!
My conclusion (please correct me if I am wrong):
From our trading model we get the side of the bet (1 or 0, long or short respectively). We use this model in a real trading environment and record the real outcome as profit or loss (1 or 0, profit or loss respectively, encoded like the side). Using the model's output as the input and the real trading outcome as the target, we train a classifier; depending on the classifier's output, we choose our positions.
------ This is called meta-labeling
I notice that de Prado adds additional inputs beyond simply the side (i.e. a moving average plus the side are the inputs, with a binary output).
Does anybody have insight into which is preferred? Intuitively, we'd want to add more and more inputs beyond the side when using this as a trading signal in live trading.
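To make the "side plus extra features" setup concrete, here is a minimal sketch (synthetic prices and hypothetical feature names; the actual notebooks use bar data): the primary model's side comes from a moving-average crossover, and the meta-model's input matrix carries that side alongside informative context features.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical price series; the real exercises use tick/bar data.
close = pd.Series(100 + rng.normal(size=500).cumsum())

fast = close.rolling(5).mean()
slow = close.rolling(20).mean()

# Primary model: side from a moving-average crossover (+1 long, -1 short).
side = np.sign(fast - slow)

# Meta-model inputs: the side *plus* informative features such as recent
# volatility and momentum, giving the classifier context to judge whether
# this particular crossover signal is worth taking.
features = pd.DataFrame({
    "side": side,
    "volatility": close.pct_change().rolling(20).std(),
    "momentum": close.pct_change(10),
}).dropna()
```

With only the side as input, the meta-model can learn at most one number per direction (the strategy's base rate long and short); the extra columns are what let it separate good signals from bad ones.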
My understanding is that we want informative features in addition to the side during meta-labeling training to assist the model in discriminating between profitable and unprofitable signals.
Opened an issue regarding this in the Ch3 notebook.
It isn't really explained very well theoretically. A few observations: it is basically a meta-probability model if you forget about bet sizing and just focus on model B predicting the probability that model A is correct, given the same inputs.
My suspicion is that when you include context data (X), you almost always run out of data points if you do anything but classification on a binarized outcome. Quantile regression should help; I didn't see this discussed.
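The quantile-regression idea could be sketched like this (synthetic data; `GradientBoostingRegressor` with the quantile loss is just one stand-in for any quantile learner): instead of binarizing the outcome, fit several conditional quantiles of the return given the context, recovering a slice of the distribution rather than a single 0/1 label.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)

# Synthetic example: returns whose dispersion depends on a feature,
# so the conditional quantiles genuinely differ across X.
n = 2000
X = rng.normal(size=(n, 2))
ret = X[:, 0] * 0.1 + rng.normal(scale=0.5 + 0.5 * np.abs(X[:, 1]), size=n)

# One model per quantile instead of a single binary classifier.
quantiles = {}
for q in (0.1, 0.5, 0.9):
    m = GradientBoostingRegressor(loss="quantile", alpha=q, n_estimators=50)
    quantiles[q] = m.fit(X, ret).predict(X)
```

The spread between the 0.1 and 0.9 predictions then gives a per-signal risk estimate that a binarized meta-label throws away.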
It would be interesting to see some work with these different model objectives, but on layers of an algo (a "network"). You could vary the weights on these to go from the fully disconnected model A to a blend of model A caring about its own objective and the overall objective.
Generally, adding extra objectives has a regularizing effect on these problems. This is likely the most intuitive way to think about whether it actually makes sense.
If you take the words in the book literally, he seems to be confounding bet sizing with predicting the distribution conditional on the side.
For example, how would you use meta-labeling as de Prado describes it to combine the predictions of many models across many times and assets to make portfolio-level decisions?