
kieranjwood / trading-momentum-transformer


This code accompanies the paper Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture (https://arxiv.org/pdf/2112.08534.pdf).

Home Page: https://kieranjwood.github.io/publication/momentum-transformer/

License: MIT License

Python 100.00%
deep-learning machine-learning momentum-trading-strategy quantitative-finance trading-strategies transformer

trading-momentum-transformer's People

Contributors

kieranjwood


trading-momentum-transformer's Issues

The labels vs. the output of predict

Here is the output of the best model's prediction, which consists of positions:

But the labels used to fit this model are "target_returns":
https://github.com/kieranjwood/trading-momentum-transformer/blob/d7df00bba31f5728e1c8bc735da0208892487142/mom_trans/model_inputs.py#L94C20-L94C20

Why does the target used by fit have a different meaning from the output of the predict function?
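
A minimal sketch of why the two can legitimately differ (illustrative code, not the repo's exact implementation): with a Sharpe-style custom loss, the y_true passed to fit are the target returns, while the network output is interpreted as a position, so the loss compares positions against returns.

import tensorflow as tf

def sharpe_style_loss(y_true, y_pred):
    # y_true: target (next-step, volatility-scaled) returns
    # y_pred: positions produced by the network
    captured_returns = y_pred * y_true
    mean = tf.reduce_mean(captured_returns)
    std = tf.math.reduce_std(captured_returns)
    return -tf.sqrt(252.0) * mean / (std + 1e-9)  # negative annualised Sharpe

# toy usage: random returns as "labels", tanh-squashed positions as "predictions"
returns = tf.random.normal([64, 1], stddev=0.01)
positions = tf.tanh(tf.random.normal([64, 1]))
print(sharpe_style_loss(returns, positions))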

The last step

Run: python -m examples.run_dmn_experiment TFT

Error:
Failed to create a directory: results\experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1\2016-2017\hp\experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1\trial_89e4414cc44cc0adf7739485894108be/checkpoints; No such file or directory
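
A hedged workaround sketch, not a confirmed fix: on Windows, pre-creating the nested checkpoint directory (path copied from the error message above) before re-running the experiment sometimes gets around this kind of "No such file or directory" failure.

import os

checkpoint_dir = os.path.join(
    "results", "experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1",
    "2016-2017", "hp",
    "experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1",
    "trial_89e4414cc44cc0adf7739485894108be", "checkpoints",
)
os.makedirs(checkpoint_dir, exist_ok=True)  # creates all intermediate folders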

Possible data leakage in CPD?

In mom_trans/changepoint_detection.py, in the function changepoint_loc_and_score:

The script uses StandardScaler to fit and transform the entire time series, and I could not find a train/test split anywhere before the CPD data is generated. Is this possible data leakage that improves the prediction results for the CPD feature?

 time_series_data[["Y"]] = StandardScaler().fit(Y_data).transform(Y_data)
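
For reference, a minimal leakage-free alternative (synthetic data and an assumed train/test boundary, not the repo's code): fit the scaler on the training window only, then apply it to the whole series.

import numpy as np
from sklearn.preprocessing import StandardScaler

y = np.random.default_rng(0).normal(size=(500, 1))  # stand-in for Y_data
train_end = 400                                      # assumed train/test boundary
scaler = StandardScaler().fit(y[:train_end])         # statistics from the train window only
y_scaled = scaler.transform(y)                       # applied to the full series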

Best Regards,
Chris

Result differences between my validation and the paper

Hi, thank you very much for your great work! I ran the code with everything unchanged, but the result is quite different.
[screenshot: my cumulative-returns plot]

Here is the code I used to plot the result; note that I did not include the CPD module in the features:
import os

import matplotlib.pyplot as plt
import pandas as pd

# _get_directory_name, experiment_name and train_intervals come from the
# repo's experiment setup
data_list = [
    pd.read_csv(
        os.path.join(
            _get_directory_name(experiment_name, interval), "captured_returns_sw.csv"
        )
    )
    for interval in train_intervals
]
df = pd.concat(data_list)
pnls = df.groupby("time")["captured_returns"].sum()
cumulative_returns = (1 + pnls).cumprod() - 1

plt.figure(figsize=(10, 6))
cumulative_returns.plot()
plt.title("Cumulative Returns Over Time")
plt.xlabel("Date")
plt.ylabel("Cumulative Returns")
plt.grid(True)
plt.show()
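
As a follow-up to the snippet above (reusing the pnls series from that code), a short sketch to reduce the comparison with the paper to a single annualised Sharpe number:

import numpy as np

sharpe = np.sqrt(252) * pnls.mean() / pnls.std()  # annualised Sharpe of the aggregated daily PnL
print(f"Annualised Sharpe: {sharpe:.2f}")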

How do you get the ground truth?

I am studying your paper and I came across a doubt. In the paper you mention, "We also use MACD indicators ..., defining the relationship between a short S and long signal L".

So is yours a classification problem? What are the labels then? Do you generate the labels (long/short) based on the MACD signal?
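
For what it's worth, here is my rough reading of the MACD-style signal as a continuous trend score rather than a discrete long/short label (illustrative timescales and synthetic prices, not necessarily the repo's exact implementation):

import numpy as np
import pandas as pd

prices = pd.Series(np.cumsum(np.random.default_rng(0).normal(size=1000)) + 100.0)
short, long = 8, 24                                    # example short/long timescales S, L
macd = prices.ewm(halflife=short).mean() - prices.ewm(halflife=long).mean()
signal = macd / prices.rolling(63).std()               # volatility-normalised trend score
print(signal.tail())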

Why do we need 'scaled_position' when calculating 'calc_net_returns'?

Hi,

As the title says, I saw that you use 'scaled_position' to calculate the trading cost by calling diff on 'scaled_position'. I thought the output of the model is already a scaled position, since your label is a scaled return. Also, shouldn't you call shift(1) here https://github.com/kieranjwood/trading-momentum-transformer/blob/master/mom_trans/classical_strategies.py#L70 when you calculate 'annualised_vol', since you do the same here https://github.com/kieranjwood/trading-momentum-transformer/blob/master/mom_trans/classical_strategies.py#L137?
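
To make my question concrete, here is a small sketch of the net-return logic as I understand it (synthetic data, illustrative cost rate, not the repo's code): the position is lagged before multiplying by the returns, and the cost is charged on the change in position.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0, 0.01, 250))
positions = pd.Series(np.tanh(rng.normal(size=250)))  # model outputs in (-1, 1)
cost_rate = 1e-4                                      # assumed cost per unit traded

gross = positions.shift(1) * returns                  # position decided at t-1, earned at t
costs = cost_rate * positions.diff().abs()            # cost charged on the traded amount
net = (gross - costs).dropna()
print(net.sum())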

Error with the last step!

Hi Kieran,
Thank you for sharing your code. I ran the code up to the last step, but after running it with LSTM I got an error that _reported_step is not defined. I also tried with TFT, and the error was as below:
Failed to find data adapter that can handle input: (<class 'tuple'> containing values of types set()), (<class 'dict'> containing {"<class 'str'>"} keys and {'(<class 'list'> containing values of types {"<class 'keras_tuner.engine.tuner_utils.TunerCallback'>", "<class 'mom_trans.deep_momentum_network.SharpeValidationLoss'>", "<class 'keras.callbacks.TerminateOnNaN'>"})', "<class 'numpy.ndarray'>", "<class 'int'>", "<class 'bool'>"} values)

I would appreciate it if you could comment on how this can be resolved.
Thanks

Problem with Generating Prediction

Thanks for sharing this code: it's a really great and novel approach that is well developed.

I managed to run a full TFT experiment using the command you gave in the last step. It ran for about 12-14 hours and produced all of the data in "results\experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1" for each year. But when it came to the output, it just said "Predicting on test set" and then gave the TensorFlow messages below.

Where does it generate a future forecast? Or, if it doesn't do that, where in the code can I pick up the practical outputs, either by using the trained architecture or by generating a graphical or text output for the time series data? At the moment it just seems to be training, testing and delivering that output to the results\ folder, and nothing else.
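
In case it helps others, a small sketch of how I poke at the practical outputs (assuming the per-window folder layout seen in the log above; column names are not assumed beyond what the CSV actually contains):

import os
import pandas as pd

results_dir = os.path.join(
    "results", "experiment_quandl_100assets_tft_cpnone_len252_notime_div_v1", "2016-2017"
)
print(os.listdir(results_dir))  # see which files this window produced

df = pd.read_csv(os.path.join(results_dir, "captured_returns_sw.csv"))
print(df.columns.tolist())      # inspect the available columns
print(df.tail())                # most recent out-of-sample rows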

Output:

Best sharpe So Far: 3.190661052421784
Total elapsed time: 04h 01m 38s
Best validation loss = 3.0821421303297503
Best params:
hidden_layer_size = 40
dropout_rate = 0.5
max_gradient_norm = 1.0
learning_rate = 0.001
batch_size = 128
Predicting on test set...
performance (sliding window) = 0.21481784546357402
performance (fixed window) = 0.9377698822801832
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.iter
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_1
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.beta_2
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.decay
WARNING:tensorflow:Unresolved object in checkpoint: (root).optimizer.learning_rate
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.
g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.

StandardScaler on returns? Is this correct?

Please accept my apologies for the frequent messages. I am writing to you regarding a doubt that has arisen after analyzing your code.

While reviewing your code, I observed that you use StandardScaler on both the features and the return target variable in the function set_scalers(self, df). This approach is perfectly acceptable. However, I have concerns regarding the calculation of captured returns in the custom loss function defined in the SharpeLoss class inside the deep_momentum_network file.

Specifically, the function call(self, y_true, weights) receives scaled returns as its first argument. The captured returns are then calculated from these scaled returns and the position size, captured_returns = weights * y_true. This results in a return that is not accurate, since it is based on scaled returns rather than actual returns.

Furthermore, there is no provision to unscale the returns or positions in the code, as the only function that does that is format_predictions(self, predictions), which never gets called.
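
A minimal self-contained sketch of the concern (synthetic data, not the repo's code): the Sharpe ratio of captured returns computed from StandardScaler-transformed returns generally differs from the one computed on the raw returns, because the mean is shifted as well as the scale.

import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
raw = rng.normal(0.0005, 0.01, size=(1000, 1))        # raw daily returns
scaled = StandardScaler().fit_transform(raw)           # zero mean, unit variance
weights = np.sign(rng.normal(size=raw.shape))          # some positions

def annualised_sharpe(captured):
    return np.sqrt(252) * captured.mean() / captured.std()

print(annualised_sharpe(weights * raw), annualised_sharpe(weights * scaled))  # generally different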

Why is the transaction cost applied to the diff of the position?

Hi,

Debugging the code, I noticed that the difference of the position is being used to apply commissions. Why is that?

If the position size is 0.5 and the next position is 0.7, you calculate the commission on 0.7 - 0.5 = 0.2 times the commission rate and then subtract this from the captured return. Shouldn't you calculate it on the actual position itself, i.e. 0.5 times the commission rate, and then subtract that from the captured return?

Also, the position used is the scaled one, so it would be even smaller than the actual position size.
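
To illustrate the two conventions I am contrasting (illustrative numbers, not the repo's code): charging cost on the change in position means a buy-and-hold position pays once, whereas charging on the full position every step accrues continuously.

import numpy as np

positions = np.full(250, 0.7)   # position held unchanged for 250 steps
cost_rate = 1e-3                # assumed cost rate

turnover_cost = cost_rate * np.abs(np.diff(positions, prepend=0.0)).sum()  # pay once, on the 0 -> 0.7 trade
per_position_cost = cost_rate * np.abs(positions).sum()                    # accrues every step
print(turnover_cost, per_position_cost)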


Differences between the paper and my validation

This is great work, thank you for sharing the methods. The validation steps were done following the order listed in the README, and the calculated results are listed below:
[screenshot: my reproduced results]
They are very different from the results in the paper:
[screenshot: results from the paper]

I am wondering about the reason.
Thank you for your help.
