capmarket's Issues
Shuffling data samples during training
Hi Jan,
may I ask you a question: why don't you shuffle your training data? As far as I know, shuffling helps the model converge faster and prevents ordering bias during training. Since you only use the encoder and slice fixed-length data samples, there is no temporal relationship between the samples anymore.
Can you share a little more insight into your training configuration for the 1k stocks? I am struggling with training, since most of my experiments result in a straight-line prediction or very poor performance.
Have you tried the Spacetimeformer? Cross-attention over a multivariate architecture is a great idea.
Thanks for sharing your great work.
Best,
Vinh
If you are not shuffling your files during training, it looks like the last files that go into the generator have a lot of entries. From the shape [3736448, 256] I can derive that you are passing 3,736,448 sequences of length 256 into the model.
The 3,736,448 is the aggregated batch size of that file batch.
Just check whether you have a very large file in your dataset and, if so, exclude it for now.
Originally posted by @JanSchm in #1 (comment)
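The shuffling the question asks about can be illustrated outside the repository's code. This is a minimal numpy sketch (not from the project; array sizes are hypothetical) showing that pre-sliced windows can be permuted freely, since the temporal order only matters inside each window:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical windowed training data: (num_samples, seq_len, n_features).
X = rng.random((1024, 128, 5))
y = rng.random((1024, 1))

# Each sample is a self-contained window, so the windows themselves can be
# shuffled without destroying the temporal order *inside* a window.
perm = rng.permutation(len(X))
X_shuffled, y_shuffled = X[perm], y[perm]
```

With a `tf.data` pipeline the same effect comes from `Dataset.shuffle(buffer_size=...)` before batching; the point is that the permutation is applied jointly to inputs and targets so each window keeps its label.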
Architecture question
Hi,
Got here from your article. I have a couple of quick questions about the choice of architecture for your model. Specifically, if we are using the transformer architecture, why aggregate the results with a pooling layer? The attention mechanism works best with sequential input and sequential output, since it learns how each token relates to the other tokens in the sequence. Aggregating all the tokens feels like it defeats the purpose of using a transformer. Wouldn't a masked decoder-layer architecture be more appropriate in this case?
Sorry if I am misunderstanding your approach and any clarifications will be greatly appreciated!
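The two options being contrasted can be sketched in a few lines. This is a toy numpy illustration (not the project's code; the shapes 16 tokens × 8 model dims are made up) of what pooling discards versus what a causal decoder would keep:

```python
import numpy as np

# Toy encoder output: one sequence of 16 tokens, model dim 8 (hypothetical sizes).
encoder_out = np.random.default_rng(1).random((16, 8))

# Global average pooling: collapse the time axis into a single vector,
# discarding per-token positions before the regression head.
pooled = encoder_out.mean(axis=0)      # shape (8,)

# Alternative raised in the issue: keep only the final token's representation,
# as a causally masked decoder effectively would for next-step prediction.
last_token = encoder_out[-1]           # shape (8,)
```

Pooling treats the encoder as a feature extractor for one scalar prediction per window; a masked decoder would instead produce one prediction per position, which is the behavior the questioner expects from a transformer.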
How to retrieve IBM dataset?
Hi,
since there is no reference to the dataset used for IBM: am I correct that it is one of those downloadable from https://www.kaggle.com/borismarjanovic/price-volume-data-for-all-us-stocks-etfs or https://www.kaggle.com/jacksoncrow/stock-market-dataset?select=stocks?
Thanks in advance for clarification!
EDIT: I guess it's not the same, because the data in those datasets only goes up to 2017 or 2020-04-01. So it would be great if you could provide a source :)
hardware configuration used for execution of program
Sir,
Can you please tell me what configuration and platform you used to execute the transformer code? This would be very helpful; please reply as soon as possible.
Question about residual connection
Got here from this article.
Firstly, thank you so much for your work; it gives insight to time series analysis beginners like me.
However, in the definition of TransformerEncoder, I found that in the forward() function the residual connection is defined like this:
ff_layer = self.ff_normalize(x[0] + ff_layer)
According to the original paper (Attention is all you need), the residual connection should be defined in this way:
ff_layer = self.ff_normalize(attn_layer + ff_layer)
Could you explain the intuition behind this discrepancy?
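The difference between the two formulations can be made concrete with stand-in arrays. This is a minimal numpy sketch (not the repository's code; `layer_norm` and all arrays are hypothetical stand-ins for the actual sub-layer outputs):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Minimal layer normalization over the feature axis.
    mu = x.mean(-1, keepdims=True)
    sigma = x.std(-1, keepdims=True)
    return (x - mu) / (sigma + eps)

rng = np.random.default_rng(2)
x = rng.random((16, 8))            # encoder block input
attn_layer = rng.random((16, 8))   # stand-in for the attention sub-layer output
ff_layer = rng.random((16, 8))     # stand-in for the feed-forward sub-layer output

# Repository's variant (as quoted in the issue): the skip connection comes
# from the block *input* x, bypassing the attention sub-layer entirely.
repo_out = layer_norm(x + ff_layer)

# "Attention Is All You Need" variant: the skip connection comes from the
# output of the attention sub-layer.
paper_out = layer_norm(attn_layer + ff_layer)
```

The two outputs differ whenever the attention sub-layer changes its input, so this is a genuine architectural difference rather than a rearrangement.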
question about data contamination
Training on multiple csvs
Hi, thanks for sharing your awesome work. I want to train the transformer model on multiple CSVs that cover the same time span. Should I just concatenate them into one big dataframe and train the model on that?
Thanks
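One caveat with naive concatenation is that a fixed-length window sliced from the combined frame could span the boundary between two tickers. This is a self-contained sketch (the file contents, `make_windows`, and the window length are all hypothetical) of windowing each file separately before pooling the samples:

```python
import csv, io

# Hypothetical per-ticker CSVs, inlined as strings for a self-contained demo.
files = {
    "AAA.csv": "date,close\n2020-01-01,10\n2020-01-02,11\n",
    "BBB.csv": "date,close\n2020-01-01,20\n2020-01-02,21\n",
}

def make_windows(rows, window=2):
    # Slice fixed-length training windows from one ticker's rows only,
    # so no window ever spans the boundary between two files.
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]

all_windows = []
for name, text in files.items():
    rows = [float(r["close"]) for r in csv.DictReader(io.StringIO(text))]
    all_windows.extend(make_windows(rows))
# The combined window list can then be shuffled and batched as one dataset.
```

With pandas, the equivalent would be reading each CSV into its own frame and windowing per frame before concatenating the resulting samples, rather than concatenating the raw rows first.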
Model predicting flat-line even after moving average implementation
Does Transformer Model perform better on more data? (How to avoid straight line prediction)
I know the author states in the original post that a straight-line prediction is reasonable for this amount of data, but I wanted to ask whether anyone has had success using this model to predict more accurately than a straight line.
If so, what data was used? If not, what aspects of the model have others tried changing to get better predictions? Data shuffling?
Thanks in advance,
Nate