
kieranjwood / trading-momentum-transformer


This code accompanies the paper Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture (https://arxiv.org/pdf/2112.08534.pdf).

Home Page: https://kieranjwood.github.io/publication/momentum-transformer/

License: MIT License

Python 100.00%
deep-learning machine-learning momentum-trading-strategy quantitative-finance trading-strategies transformer

trading-momentum-transformer's People

Contributors

kieranjwood


trading-momentum-transformer's Issues

The labels vs. the output of predict

Here is the output of the best model's prediction, which consists of positions:

But the labels used to fit this model are "target_returns":
https://github.com/kieranjwood/trading-momentum-transformer/blob/d7df00bba31f5728e1c8bc735da0208892487142/mom_trans/model_inputs.py#L94C20-L94C20

Why does the target used by fit have a different meaning from the output of the predict function?
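
A minimal sketch of why the two can legitimately differ (illustrative code, not the repo's exact implementation): with a Sharpe-style custom loss, the y_true passed to fit are the target returns, while the network output is interpreted as a position, so the loss compares positions against returns.

import tensorflow as tf

def sharpe_style_loss(y_true, y_pred):
    # y_true: target (next-step, volatility-scaled) returns
    # y_pred: positions produced by the network
    captured_returns = y_pred * y_true
    mean = tf.reduce_mean(captured_returns)
    std = tf.math.reduce_std(captured_returns)
    return -tf.sqrt(252.0) * mean / (std + 1e-9)  # negative annualised Sharpe

# toy usage: random returns as "labels", tanh-squashed positions as "predictions"
returns = tf.random.normal([64, 1], stddev=0.01)
positions = tf.tanh(tf.random.normal([64, 1]))
print(sharpe_style_loss(returns, positions))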

The last step

Run: python -m examples.run_dmn_experiment TFT

Error:
Failed to create a directory: results\experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1\2016-2017\hp\experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1\trial_89e4414cc44cc0adf7739485894108be/checkpoints; No such file or directory
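
A hedged workaround sketch, not a confirmed fix: on Windows, pre-creating the nested checkpoint directory (path copied from the error message above) before re-running the experiment sometimes gets around this kind of "No such file or directory" failure.

import os

checkpoint_dir = os.path.join(
    "results", "experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1",
    "2016-2017", "hp",
    "experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1",
    "trial_89e4414cc44cc0adf7739485894108be", "checkpoints",
)
os.makedirs(checkpoint_dir, exist_ok=True)  # creates all intermediate folders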

Possible data leakage in CPD?

In mom_trans/changepoint_detection.py, in the function changepoint_loc_and_score:

The script uses StandardScaler to fit and transform the entire time series, and I could not find a train/test split anywhere before the CPD data is generated. Is this possible data leakage that improves the prediction results for the CPD feature?

 time_series_data[["Y"]] = StandardScaler().fit(Y_data).transform(Y_data)
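
For reference, a minimal leakage-free alternative (synthetic data and an assumed train/test boundary, not the repo's code): fit the scaler on the training window only, then apply it to the whole series.

import numpy as np
from sklearn.preprocessing import StandardScaler

y = np.random.default_rng(0).normal(size=(500, 1))  # stand-in for Y_data
train_end = 400                                      # assumed train/test boundary
scaler = StandardScaler().fit(y[:train_end])         # statistics from the train window only
y_scaled = scaler.transform(y)                       # applied to the full series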

Best Regards,
Chris

Result differences between my validation and the paper

Hi, thank you very much for your great work! I ran the code with everything unchanged, but the result is quite different.
[screenshot: my cumulative-returns plot]

Here is the code I used to plot the result; note that I did not include the CPD module in the features:
import os

import matplotlib.pyplot as plt
import pandas as pd

# _get_directory_name, experiment_name and train_intervals come from the
# repo's experiment setup
data_list = [
    pd.read_csv(
        os.path.join(
            _get_directory_name(experiment_name, interval), "captured_returns_sw.csv"
        )
    )
    for interval in train_intervals
]
df = pd.concat(data_list)
pnls = df.groupby("time")["captured_returns"].sum()
cumulative_returns = (1 + pnls).cumprod() - 1

plt.figure(figsize=(10, 6))
cumulative_returns.plot()
plt.title("Cumulative Returns Over Time")
plt.xlabel("Date")
plt.ylabel("Cumulative Returns")
plt.grid(True)
plt.show()
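
As a follow-up to the snippet above (reusing the pnls series from that code), a short sketch to reduce the comparison with the paper to a single annualised Sharpe number:

import numpy as np

sharpe = np.sqrt(252) * pnls.mean() / pnls.std()  # annualised Sharpe of the aggregated daily PnL
print(f"Annualised Sharpe: {sharpe:.2f}")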

How do you get the ground truth?

I am studying your paper and I came across a doubt. In the paper you mention, "We also use MACD indicators ..., defining the relationship between a short S and long signal L".

So is yours a classification problem? What are the labels then? Do you generate the labels (long/short) based on the MACD signal?
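
For what it's worth, here is my rough reading of the MACD-style signal as a continuous trend score rather than a discrete long/short label (illustrative timescales and synthetic prices, not necessarily the repo's exact implementation):

import numpy as np
import pandas as pd

prices = pd.Series(np.cumsum(np.random.default_rng(0).normal(size=1000)) + 100.0)
short, long = 8, 24                                    # example short/long timescales S, L
macd = prices.ewm(halflife=short).mean() - prices.ewm(halflife=long).mean()
signal = macd / prices.rolling(63).std()               # volatility-normalised trend score
print(signal.tail())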

Why do we need 'scaled_position' when calculating 'calc_net_returns'?

Hi,

As the title says, I saw that you use 'scaled_position' to calculate the trading cost by calling diff on 'scaled_position'. I thought the output of the model is already a scaled position, since your label is a scaled return. Also, shouldn't you call shift(1) here https://github.com/kieranjwood/trading-momentum-transformer/blob/master/mom_trans/classical_strategies.py#L70 when you calculate 'annualised_vol', since you do the same here https://github.com/kieranjwood/trading-momentum-transformer/blob/master/mom_trans/classical_strategies.py#L137?
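
To make my question concrete, here is a small sketch of the net-return logic as I understand it (synthetic data, illustrative cost rate, not the repo's code): the position is lagged before multiplying by the returns, and the cost is charged on the change in position.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0, 0.01, 250))
positions = pd.Series(np.tanh(rng.normal(size=250)))  # model outputs in (-1, 1)
cost_rate = 1e-4                                      # assumed cost per unit traded

gross = positions.shift(1) * returns                  # position decided at t-1, earned at t
costs = cost_rate * positions.diff().abs()            # cost charged on the traded amount
net = (gross - costs).dropna()
print(net.sum())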

Error with the last step!

Hi Kieran,
Thank you for sharing your code. I ran the code up to the last step, but after running it with LSTM I got an error that _reported_step is not defined. I also tried with TFT, and the error was as below:
Failed to find data adapter that can handle input: (<class 'tuple'> containing values of types set()), (<class 'dict'> containing {"<class 'str'>"} keys and {'(<class 'list'> containing values of types {"<class 'keras_tuner.engine.tuner_utils.TunerCallback'>", "<class 'mom_trans.deep_momentum_network.SharpeValidationLoss'>", "<class 'keras.callbacks.TerminateOnNaN'>"})', "<class 'numpy.ndarray'>", "<class 'int'>", "<class 'bool'>"} values)

I would appreciate it if you could comment on how this can be resolved.
Thanks

Problem with Generating Prediction

Thanks for sharing this code: it's a really great and novel approach that is well developed.

I managed to run a full TFT experiment using the command you gave in the last step. It ran for about 12-14 hours and produced all of the data in "results\experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1" for each year. But when it came to the output, it just said "Predicting on test set" and then gave the TensorFlow messages below.

Where does it generate a future forecast? Or, if it doesn't do that, where in the code can I pick up the practical outputs, either by using the trained architecture or by generating a graphical or text output for the time series data? At the moment it just seems to be training, testing and delivering that output to the results\ folder, and nothing else.
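
In case it helps others, a small sketch of how I poke at the practical outputs (assuming the per-window folder layout seen in the log above; column names are not assumed beyond what the CSV actually contains):

import os
import pandas as pd

results_dir = os.path.join(
    "results", "experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1", "2016-2017"
)
print(os.listdir(results_dir))  # see which files this window produced

df = pd.read_csv(os.path.join(results_dir, "captured_returns_sw.csv"))
print(df.columns.tolist())      # inspect the available columns
print(df.tail())                # most recent out-of-sample rows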

Output:

Best sharpe So Far: 3.190661052421784
Total elapsed time: 04h 01m 38s
Best validation loss = 3.0821421303297503
Best params:
hidden_layer_size = 40
dropout_rate = 0.5
max_gradient_norm = 1.0
learning_rate = 0.001
batch_size = 128
Predicting on test set...
performance (sliding window) = 0.21481784546357402
performance (fixed window) = 0.9377698822801832
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.
g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.

StandardScaler on returns? Is this correct?

Please accept my apologies for the frequent messages. I am writing to you regarding a doubt that has arisen after analyzing your code.

While reviewing your code, I observed that you use StandardScaler on both the features and the return target variable in the function set_scalers(self, df). This approach is perfectly acceptable. However, I have concerns regarding the calculation of captured returns in the custom loss function defined in the SharpeLoss class inside the deep_momentum_network file.

Specifically, the function call(self, y_true, weights) receives scaled returns as its first argument. The captured returns are then calculated from these scaled returns and the position size, captured_returns = weights * y_true. This results in a return that is not accurate, since it is based on scaled returns rather than actual returns.

Furthermore, there is no provision to unscale the returns or positions in the code, as the only function that does that is format_predictions(self, predictions), which never gets called.
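
A minimal self-contained sketch of the concern (synthetic data, not the repo's code): the Sharpe ratio of captured returns computed from StandardScaler-transformed returns generally differs from the one computed on the raw returns, because the mean is shifted as well as the scale.

import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
raw = rng.normal(0.0005, 0.01, size=(1000, 1))        # raw daily returns
scaled = StandardScaler().fit_transform(raw)           # zero mean, unit variance
weights = np.sign(rng.normal(size=raw.shape))          # some positions

def annualised_sharpe(captured):
    return np.sqrt(252) * captured.mean() / captured.std()

print(annualised_sharpe(weights * raw), annualised_sharpe(weights * scaled))  # generally different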

Why is the transaction cost applied to the diff of the position?

Hi,

Debugging the code, I noticed that the difference of the position is being used to apply commissions. Why is that?

If the position size is 0.5 and the next position is 0.7, you calculate the commission on 0.7 - 0.5 = 0.2 times the commission rate and then subtract this from the captured return. Shouldn't you calculate it on the actual position itself, i.e. 0.5 times the commission rate, and then subtract that from the captured return?

Also, the position used is the scaled one, so it would be even smaller than the actual position size.
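
To illustrate the two conventions I am contrasting (illustrative numbers, not the repo's code): charging cost on the change in position means a buy-and-hold position pays once, whereas charging on the full position every step accrues continuously.

import numpy as np

positions = np.full(250, 0.7)   # position held unchanged for 250 steps
cost_rate = 1e-3                # assumed cost rate

turnover_cost = cost_rate * np.abs(np.diff(positions, prepend=0.0)).sum()  # pay once, on the 0 -> 0.7 trade
per_position_cost = cost_rate * np.abs(positions).sum()                    # accrues every step
print(turnover_cost, per_position_cost)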


Differences between the paper and my validation

This is great work, thank you for sharing the methods. The validation steps were done following the order listed in the README, and the calculated results are listed below:
[screenshot: my reproduced results]
They are very different from the results in the paper:
[screenshot: results from the paper]

I am wondering about the reason.
Thank you for your help.
