
Machine Learning_Predict Stock Price with R

@richie31 wrote:

Hello All,

I have studied one year of PNB stock price data from the Nifty index for patterns and seasonal components, and have run a regression analysis to predict the PNB stock price with a 95% confidence interval.

PNB's stock price has fallen from Rs 160 to Rs 80 in one year. I have not dug into the data to find the reason for such a sharp fall. Instead, I performed a regression analysis to see the relationships within the data.
I have considered two variables to predict the high price of the stock (PNB): each day's open price and the previous day's total traded quantity of the stock.
To select these two variables, I looked at the relative importance of each variable for the final outcome (the high price). Below is the bar graph of the relative importance of Total Traded Quantity and Open Price.

Note: This code was provided by Dr. Johnson in the year 2000 and was adapted from an SPSS program.

R script for relative importance:

library(relweights) # library() is lower case in R; assumes relweights() is available in your session
f <- read.csv("SP.csv")
fit_High.Price <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = f)
relweights(fit_High.Price, col = "Brown")

Output

                        Weights
Open.Price            95.185287
Total.Traded.Quantity  4.814713
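
For readers who do not have the relweights() function to hand, the calculation behind such weights can be sketched in base R. This is a minimal sketch in the style of Johnson's (2000) relative weights, not necessarily the exact code used above. The data frame f_demo is simulated (it is not the PNB data) and the helper name rel_weights_sketch is hypothetical.

```r
# Simulated stand-in for SP.csv (hypothetical values, NOT the PNB data)
set.seed(1)
n <- 200
f_demo <- data.frame(
  Open.Price = 120 + cumsum(rnorm(n)),
  Total.Traded.Quantity = round(runif(n, 2e6, 1e7))
)
f_demo$High.Price <- f_demo$Open.Price + abs(rnorm(n, mean = 1.5)) +
  2e-07 * f_demo$Total.Traded.Quantity

fit_demo <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = f_demo)

# Relative weights in the style of Johnson (2000); helper name is hypothetical
rel_weights_sketch <- function(fit) {
  R   <- cor(fit$model)             # response is the first column of fit$model
  p   <- ncol(R)
  rxx <- R[2:p, 2:p, drop = FALSE]  # correlations among the predictors
  rxy <- R[2:p, 1]                  # correlations of predictors with the response
  e   <- eigen(rxx)
  # Symmetric square root of the predictor correlation matrix
  lambda <- e$vectors %*% diag(sqrt(e$values), nrow = p - 1) %*% t(e$vectors)
  beta   <- solve(lambda) %*% rxy
  raw    <- (lambda^2) %*% (beta^2) # raw weights; they add up to R-squared
  w <- as.vector(100 * raw / sum(raw))
  names(w) <- colnames(rxx)
  list(weights = w, r.squared = sum(raw))
}

rw <- rel_weights_sketch(fit_demo)
print(round(rw$weights, 2))  # percentage share of R-squared per predictor
```

The weights are expressed as percentages of R-squared, which is why the two values in the output above add up to 100.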

It is evident from the picture that the open price and the high price of the stock on a given day are strongly correlated. We also include the total traded quantity of the stock, because it influences the stock price through demand and supply (how liquid the stock is in the market). Higher liquidity tends to go with a lower stock price, while lower liquidity invites speculation that pushes the stock price up.

Comparing the two graphs below, whenever the liquidity of the stock is low, the high price of the stock shoots up. This suggests a negative relationship between the high price of the stock and its liquidity.
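
A relationship read off two graphs can also be checked numerically. The following is a minimal sketch using cor.test() from base R. The data are simulated with a negative dependence built in on purpose, so the result only shows how the check works and says nothing about PNB; with the real data the columns of f would be used instead of f_demo.

```r
# Simulated stand-in data (hypothetical values, NOT the PNB data)
set.seed(2)
n <- 250
f_demo <- data.frame(Total.Traded.Quantity = round(runif(n, 2e6, 1e7)))
# A negative dependence is built in deliberately so the check has something to find
f_demo$High.Price <- 150 - 5e-06 * f_demo$Total.Traded.Quantity + rnorm(n, sd = 5)

# Pearson correlation with a test of the null hypothesis "correlation = 0"
ct <- cor.test(f_demo$High.Price, f_demo$Total.Traded.Quantity)
print(ct$estimate)  # the sign gives the direction of the relationship
print(ct$p.value)   # a small value means the correlation is unlikely to be zero
```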

The R script below produces these graphs.

Note: Please install the packages below first:

a) install.packages("zoo")
b) install.packages("dygraphs")
c) install.packages("lubridate")

R script for Total Traded Quantity
library(lubridate)
library(zoo)
library(dygraphs) # provides dygraph(), dyRangeSelector() and %>%
x <- mdy(f$Date) # f is the data frame; Date is parsed as month-day-year
e <- zoo(f$Total.Traded.Quantity, x)
dygraph(e, main = "Total Traded Quantity from Feb 2015 to Apr 2016") %>%
  dyRangeSelector(dateWindow = c("2015-02-19", "2016-04-25"))

R script for High Prices (PNB)
library(lubridate)
library(zoo)
library(dygraphs) # provides dygraph(), dyRangeSelector() and %>%
x <- mdy(f$Date) # f is the data frame; Date is parsed as month-day-year
d <- zoo(f$High.Price, x)
dygraph(d, main = "Punjab National Bank (High Price)") %>%
  dyRangeSelector(dateWindow = c("2015-02-19", "2016-04-25"))

Regression Analysis of PNB Stock Price

R Script for the lm method,

fit_High.Price<-lm(High.Price~Open.Price+Total.Traded.Quantity,data=f)
summary(fit_High.Price)

Call:
lm(formula = High.Price ~ Open.Price + Total.Traded.Quantity,
    data = f)

Residuals:
    Min      1Q  Median      3Q     Max
-7.8894 -1.0822 -0.3018  0.8450 11.1223

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)           -2.911e+00  6.815e-01  -4.272 2.63e-05 ***
Open.Price             1.027e+00  4.440e-03 231.372  < 2e-16 ***
Total.Traded.Quantity  2.304e-07  3.102e-08   7.427 1.23e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.978 on 291 degrees of freedom
Multiple R-squared:  0.9954, Adjusted R-squared:  0.9953
F-statistic: 3.123e+04 on 2 and 291 DF,  p-value: < 2.2e-16
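
To work with these numbers in code rather than reading them off the printout, the coefficient table and 95% confidence intervals can be pulled from the fitted object. A small sketch on simulated data follows. f_demo and fit_demo are hypothetical stand-ins; with the real data, fit_High.Price would be used instead.

```r
# Simulated stand-in data (hypothetical values, NOT the PNB data)
set.seed(3)
n <- 294
f_demo <- data.frame(
  Open.Price = 80 + 80 * runif(n),
  Total.Traded.Quantity = round(runif(n, 2e6, 1e7))
)
f_demo$High.Price <- -3 + 1.03 * f_demo$Open.Price +
  2e-07 * f_demo$Total.Traded.Quantity + rnorm(n, sd = 2)

fit_demo <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = f_demo)

# Matrix with columns Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(fit_demo))
# 95% confidence interval for each coefficient (one row per coefficient)
ci95 <- confint(fit_demo, level = 0.95)

print(coef_table)
print(ci95)
```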

Actual High Price vs Estimated High Price

library(ggplot2)
f <- read.csv("SP.csv")
# Fitted vs actual High Price; point size reflects the residual
ggplot(data = f, aes(High.Price, fitted(fit_High.Price))) +
  geom_point(aes(size = residuals(fit_High.Price)), color = "Black") +
  geom_smooth(aes(Open.Price, High.Price), color = "White")

Prediction of the Stock Price(PNB)

R-Script for the prediction,
newdata=data.frame(Open.Price=90.15,Total.Traded.Quantity=7516256)
predict(fit_High.Price,newdata,interval="confidence")

Prediction for 27th April, 2016

Fitted High Price   Lower limit   Upper limit
91.43               91.03         91.81
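
The fitted value can be reproduced approximately by hand from the coefficients printed in the regression summary. Because those coefficients are rounded to four significant figures, the result (about 91.40) differs slightly from the 91.43 returned by predict(). The variable names below are chosen for illustration only.

```r
# Rounded coefficients as printed by summary(fit_High.Price)
b0 <- -2.911      # (Intercept)
b1 <- 1.027       # Open.Price
b2 <- 2.304e-07   # Total.Traded.Quantity

# Inputs used for the 27th April, 2016 prediction
open_price <- 90.15
traded_qty <- 7516256

# High.Price = b0 + b1 * Open.Price + b2 * Total.Traded.Quantity
high_hat <- b0 + b1 * open_price + b2 * traded_qty
print(round(high_hat, 2))
```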

The predicted high price for PNB on 27th April, 2016 is Rs 91.43/-.
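
Note that interval="confidence" bounds the average high price for the given inputs. The range for a single day's high price is given by interval="prediction" and is wider. The difference can be sketched on simulated data; f_demo and fit_demo are hypothetical stand-ins and the numbers they produce are not PNB results.

```r
# Simulated stand-in data (hypothetical values, NOT the PNB data)
set.seed(4)
n <- 294
f_demo <- data.frame(
  Open.Price = 80 + 80 * runif(n),
  Total.Traded.Quantity = round(runif(n, 2e6, 1e7))
)
f_demo$High.Price <- -3 + 1.03 * f_demo$Open.Price +
  2e-07 * f_demo$Total.Traded.Quantity + rnorm(n, sd = 2)

fit_demo <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = f_demo)
newdata  <- data.frame(Open.Price = 90.15, Total.Traded.Quantity = 7516256)

# Interval for the mean High Price at these inputs
conf_int <- predict(fit_demo, newdata, interval = "confidence", level = 0.95)
# Interval for one new observation at these inputs (adds the residual variance)
pred_int <- predict(fit_demo, newdata, interval = "prediction", level = 0.95)

print(conf_int)
print(pred_int)
```

Both calls return the same fitted value; only the lower and upper limits differ.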

Machine Learning

Please install the caret package for this script:
library(caret)
library(ggplot2)
mod.High.Price.lm <- train(High.Price ~ Open.Price + Total.Traded.Quantity, data = f, method = "lm")
coef.icept <- coef(mod.High.Price.lm$finalModel)[1] # intercept
coef.slope <- coef(mod.High.Price.lm$finalModel)[2] # Open.Price coefficient

# Plot the data with the fitted intercept and Open.Price slope
# (the line does not include the Total.Traded.Quantity term)
ggplot(data = f, aes(x = Open.Price, y = High.Price)) +
  geom_point() +
  geom_abline(intercept = coef.icept, slope = coef.slope, color = "red")
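
With method = "lm", train() fits the same linear model as lm() and additionally estimates out-of-sample error by resampling (bootstrap by default). What resampling adds can be sketched with a hand-rolled 5-fold cross-validation in base R. This is not caret's implementation; f_demo, the fold count k and the variable names are assumptions for illustration, and the data are simulated.

```r
# Simulated stand-in data (hypothetical values, NOT the PNB data)
set.seed(5)
n <- 294
f_demo <- data.frame(
  Open.Price = 80 + 80 * runif(n),
  Total.Traded.Quantity = round(runif(n, 2e6, 1e7))
)
f_demo$High.Price <- -3 + 1.03 * f_demo$Open.Price +
  2e-07 * f_demo$Total.Traded.Quantity + rnorm(n, sd = 2)

k <- 5
fold <- sample(rep(seq_len(k), length.out = n))  # random fold label per row
rmse <- numeric(k)

for (i in seq_len(k)) {
  train_set <- f_demo[fold != i, ]  # fit on k - 1 folds
  test_set  <- f_demo[fold == i, ]  # evaluate on the held-out fold
  fit_i  <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = train_set)
  pred_i <- predict(fit_i, newdata = test_set)
  rmse[i] <- sqrt(mean((test_set$High.Price - pred_i)^2))
}

print(round(rmse, 3))  # held-out RMSE per fold
print(mean(rmse))      # average held-out RMSE
```

Because daily prices are ordered in time, random folds let the model see later days while predicting earlier ones; a split that respects time order may be a fairer test for this kind of data.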

End Note:

I have not included any seasonal component in the prediction of the stock price. Including one should reduce the residuals significantly and make the predicted price more accurate.
I request readers to comment and suggest a model that combines time series analysis with regression analysis.
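
One possible starting point, offered only as a sketch, is to add the previous day's high price as a lagged regressor to the same lm() model. This is a simple way of letting the regression use the time ordering of the data; it is not a seasonal model. The column name High.Lag1 and the data frame f_demo are hypothetical, and the data are simulated.

```r
# Simulated stand-in data (hypothetical values, NOT the PNB data)
set.seed(6)
n <- 294
f_demo <- data.frame(
  Open.Price = 120 + cumsum(rnorm(n)),
  Total.Traded.Quantity = round(runif(n, 2e6, 1e7))
)
f_demo$High.Price <- f_demo$Open.Price + abs(rnorm(n, mean = 1.5)) +
  2e-07 * f_demo$Total.Traded.Quantity

# Previous day's High Price; the first row has no previous day, so it is NA
f_demo$High.Lag1 <- c(NA, head(f_demo$High.Price, -1))

# lm() drops the row with the missing lag by default
fit_lag <- lm(High.Price ~ Open.Price + Total.Traded.Quantity + High.Lag1,
              data = f_demo)
print(coef(fit_lag))
```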

Thanks
Aritra Chatterjee

SP.csv (14.9 KB)
