nourozr / stock-price-prediction-lstm

OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network

Python 100.00%

deep-learning machine-learning neural-networks recurrent-neural-networks lstm-neural-networks algorithmic-trading machine-learning-for-trading machine-learning-for-finance


stock-price-prediction-lstm's Issues

Measuring error

Nice work! I have a suggestion.
You are testing the model with RMSE this way:

testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))

RMSE on its own is not a good measure of either the accuracy or the performance of the model. Much more important than RMSE is the model's ability to predict stock movements.
So I would like to see some metrics that take into account how many ups and downs are correctly predicted. I suggest implementing recall and precision measurements.
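A minimal sketch of the direction-based metrics suggested above, assuming an "up" day means a value higher than the previous day's actual value. The function name `direction_metrics` and the sample numbers are illustrative and are not part of the repository.

```python
import numpy as np

def direction_metrics(actual, predicted):
    """Precision, recall and accuracy for predicted 'up' moves.

    A move is 'up' when the value exceeds the previous day's ACTUAL value.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    actual_up = actual[1:] > actual[:-1]        # what really happened
    predicted_up = predicted[1:] > actual[:-1]  # what the model called

    true_pos = np.sum(actual_up & predicted_up)
    # max(..., 1) avoids division by zero when there are no 'up' calls/days.
    precision = true_pos / max(np.sum(predicted_up), 1)
    recall = true_pos / max(np.sum(actual_up), 1)
    accuracy = np.mean(actual_up == predicted_up)
    return float(precision), float(recall), float(accuracy)

# Illustrative values only, not model output.
actual = [1.0, 2.0, 3.0, 2.0, 3.0]
predicted = [1.0, 2.5, 1.5, 2.5, 2.5]
precision, recall, accuracy = direction_metrics(actual, predicted)
```

With the script's variables, the call would be roughly `direction_metrics(testY[0], testPredict[:, 0])`, since those are the arrays already passed to `mean_squared_error`.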

Mistake in Code

There is a big mistake in your code. In StockPrediction.py, model.add(Activation('linear')) is not right. We all know that the activation function has to be non-linear; the linear activation will make the predictions wrong.

This model no longer produces accurate predictions.

I believe this may be an issue with keras / tensorflow updating, and not a 'code' problem.

absl-py==0.11.0
astunparse==1.6.3
cachetools==4.2.1
certifi==2020.12.5
chardet==4.0.0
cycler==0.10.0
flatbuffers==1.12
gast==0.3.3
google-auth==1.27.0
google-auth-oauthlib==0.4.2
google-pasta==0.2.0
grpcio==1.32.0
h5py==2.10.0
idna==3.1
joblib==1.0.1
Keras==2.4.3
Keras-Preprocessing==1.1.2
kiwisolver==1.3.1
Markdown==3.3.4
matplotlib==3.3.4
mplfinance==0.12.7a7
numpy==1.19.5
oauthlib==3.1.0
opt-einsum==3.3.0
pandas==1.2.2
Pillow==8.1.0
protobuf==3.15.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2021.1
PyYAML==5.4.1
requests==2.25.1
requests-oauthlib==1.3.0
rsa==4.7.2
scikit-learn==0.24.1
scipy==1.6.1
six==1.15.0
tensorboard==2.4.1
tensorboard-plugin-wit==1.8.0
tensorflow==2.4.1
tensorflow-estimator==2.4.0
termcolor==1.1.0
threadpoolctl==2.1.0
typing-extensions==3.7.4.3
urllib3==1.26.3
Werkzeug==1.0.1
wrapt==1.12.1

No errors of any kind, please advise.


PLEASE EXPLAIN THE REASON FOR IMPORTING PREPROCESSING

Hi. Thank you for this project. It is written in Python 2.7, but there is a package named preprocessing (used for text preprocessing) that cannot be installed in Python 2. What would be a suitable alternative to preprocessing.new_data()? Preferably something in scikit-learn.
Thank you

Why this code?

last_val = testPredict[-1]
last_val_scaled = last_val/last_val
Why not just write
last_val_scaled = 1
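As a quick check of the question above: dividing any nonzero finite value by itself gives 1, so the two forms feed the same number to the model. The value below is an illustrative stand-in for `testPredict[-1]`, not real model output.

```python
import numpy as np

# Illustrative stand-in for testPredict[-1]; the real value comes from the model.
last_val = np.array([77.24], dtype=np.float32)

# x / x is 1 for any nonzero finite x, so the price itself is discarded here.
last_val_scaled = last_val / last_val

print(last_val_scaled)
```

One practical difference remains: the division keeps the array shape and dtype of `last_val` and produces `nan` if the value happens to be 0, whereas a literal `1` does neither.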

Wrong Prediction

Hi,

Nice post, but the results are wrong.

You are not predicting several days ahead, only one day ahead at a time.
You are not feeding your prediction back in as the input for the next prediction; you are using the actual value.

This results in a lagged copy of the actual signal: all your network has to do is produce a value similar to the last input price.

If you used your prediction as the input for the next prediction, you would see that the results are quite bad…

I see lots of LSTM price prediction examples, but they all seem to be wrong, and I don't think it is possible to accurately predict the next prices.
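The difference between the two evaluation modes described above can be sketched as follows. This assumes a hypothetical `predict_fn` that maps one value to the next; a naive "persistence" function (tomorrow equals today) stands in for the trained network, and the series is made up for illustration.

```python
import numpy as np

def one_step_forecast(predict_fn, series):
    """Predict each next value from the ACTUAL previous value."""
    return np.array([predict_fn(x) for x in series[:-1]])

def recursive_forecast(predict_fn, start_value, n_steps):
    """Predict n_steps ahead, feeding each prediction back in as the next input."""
    preds = []
    current = start_value
    for _ in range(n_steps):
        current = predict_fn(current)
        preds.append(current)
    return np.array(preds)

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

def persistence(x):
    # Hypothetical stand-in for the model: just repeat the last input.
    return x

series = np.array([10.0, 11.0, 12.0, 13.0, 14.0])  # illustrative trend
one_step = one_step_forecast(persistence, series)
recursive = recursive_forecast(persistence, series[0], len(series) - 1)

one_step_rmse = rmse(one_step, series[1:])
recursive_rmse = rmse(recursive, series[1:])
```

Even a function that learns nothing looks reasonable in one-step mode, because it is handed the true previous value each day. Its error grows once it has to rely on its own outputs.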

Add More Features?

This is awesome and helpful. I have been playing with it and have the RMSE on the test data down below 1. If I am reading the output right, it displays the previous day's average that you are using and then the next day's prediction? Is there a way to look ahead more steps?

Also, you are taking the OHLC values and turning them into a single input value. Is there a way to use this model with more inputs, such as additional features on top of the OHLC value?
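One possible way to feed more than one feature per day, sketched with NumPy only. The helper `make_windows` is hypothetical and not part of the repository. It builds input windows of shape (samples, step_size, n_features), which is the (batch, timesteps, features) layout that Keras LSTM layers expect.

```python
import numpy as np

def make_windows(features, target, step_size):
    """Slice a (days, n_features) array into overlapping windows.

    Returns X with shape (samples, step_size, n_features) and y with the
    target value for the day right after each window.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    for i in range(len(features) - step_size):
        X.append(features[i:i + step_size])
        y.append(target[i + step_size])
    return np.array(X), np.array(y)

# Illustrative data: 10 days, 2 features per day (e.g. OHLC average and volume).
features = np.arange(20, dtype=float).reshape(10, 2)
target = features[:, 0]

X, y = make_windows(features, target, step_size=3)
```

With windows shaped like this, the first LSTM layer would take `input_shape=(step_size, n_features)` rather than the `(1, step_size)` used in the script. Each feature would also need its own scaling before windowing.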

Hi there, can I ask a question?

First of all, thank you for sharing your code. I was looking for exactly this type of code (predicting future prices). I am testing it now, but I get wrong predicted values. All predicted values are increased. Why?

The output prediction is not good, right?

Last Day Value: 77.23725128173828
Next Day Value: 21.601646423339844

Issues with preprocessing

Hello,

I am facing some issues with preprocessing. When I run the section that uses preprocessing, this is what I get:

AttributeError: module 'sklearn.preprocessing' has no attribute 'new_dataset'

Here is the code of yours. Am I missing any steps?

#Edit Author: Ray

IMPORTING IMPORTANT LIBRARIES

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import math
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from sklearn import preprocessing # how to import preprocessing
import sklearn.preprocessing
import numpy as np

FOR REPRODUCIBILITY

np.random.seed(7)

IMPORTING DATASET

dataset = pd.read_csv('C:/Users/ray/Documents/Python Scripts/LSTM-Stock-prediction-master/apple_share_price.csv', usecols=[1,2,3,4])
dataset = dataset.reindex(index = dataset.index[::-1])

CREATING OWN INDEX FOR FLEXIBILITY

obs = np.arange(1, len(dataset) + 1, 1)

TAKING DIFFERENT INDICATORS FOR PREDICTION

OHLC_avg = dataset.mean(axis = 1)
HLC_avg = dataset[['High', 'Low', 'Close']].mean(axis = 1)
close_val = dataset[['Close']]

PLOTTING ALL INDICATORS IN ONE PLOT

plt.plot(obs, OHLC_avg, 'r', label = 'OHLC avg')
plt.plot(obs, HLC_avg, 'b', label = 'HLC avg')
plt.plot(obs, close_val, 'g', label = 'Closing price')
plt.legend(loc = 'upper right')
plt.show()

PREPARATION OF TIME SERIES DATASET

OHLC_avg = np.reshape(OHLC_avg.values, (len(OHLC_avg),1)) # 1664
scaler = MinMaxScaler(feature_range=(0, 1))
OHLC_avg = scaler.fit_transform(OHLC_avg)

TRAIN-TEST SPLIT

train_OHLC = int(len(OHLC_avg) * 0.75)
test_OHLC = len(OHLC_avg) - train_OHLC
train_OHLC, test_OHLC = OHLC_avg[0:train_OHLC,:], OHLC_avg[train_OHLC:len(OHLC_avg),:]

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
step_size = 1

FUNCTION TO CREATE 1D DATA INTO TIME SERIES DATASET

def new_dataset(dataset, step_size):
    trainX, trainY = [], []
    for i in range(len(dataset)-step_size-1):
        a = dataset[i:(i+step_size), 0]
        trainX.append(a)
        trainY.append(dataset[i + step_size, 0])
    return np.array(trainX), np.array(trainY)

TIME-SERIES DATASET (FOR TIME T, VALUES FOR TIME T+1)

trainX, trainY = sklearn.preprocessing.new_dataset(train_OHLC, 1)
testX, testY = sklearn.preprocessing.new_dataset(test_OHLC, 1)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

RESHAPING TRAIN AND TEST DATA

trainX = np.reshape(train_OHLC, (train_OHLC.shape[0], 1, train_OHLC.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
step_size = 1

LSTM MODEL

model = Sequential()
model.add(LSTM(32, input_shape=(1, step_size), return_sequences = True))
model.add(LSTM(16))
model.add(Dense(1))
model.add(Activation('linear'))

MODEL COMPILING AND TRAINING

model.compile(loss='mean_squared_error', optimizer='adagrad') # Try SGD, adam, adagrad and compare!!!
model.fit(trainX, trainY, epochs=5, batch_size=1, verbose=2)

PREDICTION

trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

DE-NORMALIZING FOR PLOTTING

trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])

TRAINING RMSE

trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train RMSE: %.2f' % (trainScore))

TEST RMSE

testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test RMSE: %.2f' % (testScore))

CREATING SIMILAR DATASET TO PLOT TRAINING PREDICTIONS

trainPredictPlot = np.empty_like(OHLC_avg)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[step_size:len(trainPredict)+step_size, :] = trainPredict

CREATING SIMILAR DATASET TO PLOT TEST PREDICTIONS

testPredictPlot = np.empty_like(OHLC_avg)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(step_size*2)+1:len(OHLC_avg)-1, :] = testPredict

DE-NORMALIZING MAIN DATASET

OHLC_avg = scaler.inverse_transform(OHLC_avg)

PLOT OF MAIN OHLC VALUES, TRAIN PREDICTIONS AND TEST PREDICTIONS

plt.plot(OHLC_avg, 'g', label = 'original dataset')
plt.plot(trainPredictPlot, 'r', label = 'training set')
plt.plot(testPredictPlot, 'b', label = 'predicted stock price/test set')
plt.legend(loc = 'upper right')
plt.xlabel('Time in Days')
plt.ylabel('OHLC Value of Apple Stocks')
plt.show()

PREDICT FUTURE VALUES

last_val = testPredict[-1]
last_val_scaled = last_val/last_val
next_val = model.predict(np.reshape(last_val_scaled, (1,1,1)))
print "Last Day Value:", np.asscalar(last_val)
print "Next Day Value:", np.asscalar(last_val*next_val)

print np.append(last_val, next_val)
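The AttributeError reported above is expected: new_dataset is defined in the listing itself, and scikit-learn's preprocessing module has no function by that name. The following is a minimal sketch of the windowing step, assuming the helper is simply called directly. The input array is an illustrative stand-in for the scaled OHLC data. Note also that the listing reshapes train_OHLC rather than trainX, which would leave trainX and trainY with different lengths.

```python
import numpy as np

def new_dataset(dataset, step_size):
    """Turn a (n, 1) series into (inputs at time t, value at time t+1) pairs."""
    data_X, data_Y = [], []
    for i in range(len(dataset) - step_size - 1):
        data_X.append(dataset[i:(i + step_size), 0])
        data_Y.append(dataset[i + step_size, 0])
    return np.array(data_X), np.array(data_Y)

# Illustrative stand-in for the scaled training data (values in [0, 1]).
train_OHLC = np.linspace(0.0, 1.0, 10).reshape(-1, 1)

# Call the local helper directly, not sklearn.preprocessing.new_dataset.
trainX, trainY = new_dataset(train_OHLC, 1)

# Reshape the windowed inputs (not the raw series) to (samples, 1, step_size).
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
```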

Correct me if I'm wrong

I think your predictions are delayed by one step, and if you zoom in on your graph you will see this.
The predictions themselves are just the data from the previous step with some lag.

I don't think this is a correct model,
or maybe it is just an oddity of the drawing...

Please correct me if I'm wrong.

Logic behind the code

Hello there,
I would like to know the logic behind the future prediction in this code snippet:
last_val = testPredict[-1]
last_val_scaled = last_val/last_val
next_val = model.predict(np.reshape(last_val_scaled, (1,1,1)))
print ("Last Day Value:", np.asscalar(last_val))
print ("Next Day Value:", np.asscalar(last_val*next_val))
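One way to reason about the snippet: the model was trained on min-max scaled values, so the input to model.predict should be on that same scale, and the output should be mapped back with the inverse transform. The sketch below shows that round trip under stated assumptions. `model_fn` is a hypothetical stand-in for model.predict, and the min-max helpers are hand-written with NumPy rather than taken from the script's MinMaxScaler. It is an illustration, not the author's method.

```python
import numpy as np

def minmax_transform(x, lo, hi):
    """Scale a price into [0, 1] using the training minimum and maximum."""
    return (x - lo) / (hi - lo)

def minmax_inverse(x_scaled, lo, hi):
    """Map a scaled value back to a price."""
    return x_scaled * (hi - lo) + lo

def predict_next(model_fn, last_price, lo, hi):
    """Scale the last price, run the model, and de-normalize its output."""
    scaled_in = minmax_transform(last_price, lo, hi)
    # Same (1, 1, 1) input shape the snippet uses: one sample, one step, one feature.
    scaled_out = model_fn(np.reshape(scaled_in, (1, 1, 1)))
    return float(minmax_inverse(np.ravel(scaled_out)[0], lo, hi))

def identity_model(x):
    # Hypothetical stand-in for model.predict: returns its input unchanged.
    return x

# Illustrative numbers: training range 100..200, last price 150.
next_price = predict_next(identity_model, 150.0, lo=100.0, hi=200.0)
```

In the script itself, the fitted `scaler` already holds the training range, so `scaler.transform` and `scaler.inverse_transform` would play the role of the two helpers.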