@aryan_singh1993 wrote:
I am stuck on an SVM regression problem. Please help.
I have trained an SVR model using scikit-learn that predicts the future price of bitcoin from its closing price on previous dates. I have converted the date into a delta, in days, from the first available date using the following code:
    import numpy as np
    import pandas as pd

    btc['Date'] = pd.to_datetime(btc['Date'])
    btc['date_delta'] = (btc['Date'] - btc['Date'].min()) / np.timedelta64(1, 'D')

My dataframe’s head looks something like this:
    date_delta    Close
        1654.0  7144.38
        1653.0  7022.76

Then I split the data into training and test datasets as follows:
    msk = np.random.rand(len(btc_select)) < 0.8
    btc_train = btc_select[msk]
    btc_test = btc_select[~msk]

I then apply min-max scaling to the dataset before training the model as follows:
    from sklearn.preprocessing import MinMaxScaler

    scaler = MinMaxScaler()
    scaler.fit(btc_train)
    btc_train = scaler.transform(btc_train)
    btc_test = scaler.transform(btc_test)

My model is trained using the following function, and I find that the polynomial kernel gives the best result:
    import matplotlib.pyplot as plt
    from sklearn import metrics
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVR

    def predict_prices(dates_train, prices_train, dates_test, price_test):
        # SVR expects a 2-D feature array, so reshape the dates into one column.
        dates_train = np.reshape(dates_train, (len(dates_train), 1))
        dates_test = np.reshape(dates_test, (len(dates_test), 1))

        svr_lin = SVR(kernel='linear', C=1e3)
        svr_poly = SVR(kernel='poly', C=1e3, degree=8)
        svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.8)

        svr_lin.fit(dates_train, prices_train)
        svr_poly.fit(dates_train, prices_train)
        svr_rbf.fit(dates_train, prices_train)

        # Plot the fit of each kernel against the training data.
        plt.figure(figsize=(14, 10))
        plt.scatter(dates_train, prices_train, color='black', label='Data')
        plt.plot(dates_train, svr_rbf.predict(dates_train), color='red', label='RBF model')
        plt.plot(dates_train, svr_lin.predict(dates_train), color='green', label='Linear model')
        plt.plot(dates_train, svr_poly.predict(dates_train), color='blue', label='Polynomial model')
        plt.xlabel('Date')
        plt.ylabel('Price')
        plt.title('Support Vector Regression')
        plt.legend()
        plt.show()

        # score() returns R-squared on the test set.
        print('Lin Score:', svr_lin.score(dates_test, price_test))
        print('Poly Score:', svr_poly.score(dates_test, price_test))
        print('Rbf Score:', svr_rbf.score(dates_test, price_test))

        scores = cross_val_score(svr_poly, dates_train, prices_train, cv=6, scoring='neg_mean_squared_error')
        accuracy = metrics.r2_score(price_test, svr_poly.predict(dates_test))
        print('R-Squared Value for the Polynomial Kernel:', accuracy)
        print('Cross Validation Mean Squared Error for the Polynomial Kernel:', scores)
        return svr_poly

I got the following accuracy and cross-validation scores:
    Lin Score: 0.3290332147578777
    Poly Score: 0.8724266575682722
    Rbf Score: 0.836449334307112
    R-Squared Value for the Polynomial Kernel: 0.8724266575682722
    Cross Validation Mean Squared Error for the Polynomial Kernel: [-0.13853584 -0.00069995 -0.00043713 -0.00041959 -0.00341142 -0.00352207]

But when I try to predict the BTC price for a data point, by transforming the date_delta and then inverse-transforming the predicted output, the results are way off. I need help understanding what is going wrong.
    transform_inp = scaler.transform([[1654.0, 0.0]])
    transform_inp[0, 0]
    # Output: 1.000604960677556

    predicted_val = model.predict(np.array(transform_inp[0, 0]))
    predicted_val
    # Output: array([0.73674025])

Now doing the inverse transform I get the following:
    scaler.inverse_transform([[predicted_val[0], 0]])
    # Output: array([[1217.83164131, 68.43 ]])

The output is 1217 USD, which is way off from the actual price of 7144 USD. Can you please tell me what is wrong here?
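One thing that may be worth checking here is the column order passed to inverse_transform. MinMaxScaler stores a separate minimum and maximum for each column, so a value placed in column 0 is rescaled with the date_delta range rather than the Close range. The sketch below uses a small synthetic array, not the real data: the date range of 0 to 1653 and the minimum Close of 68.43 are inferred from the numbers reported above, while the maximum Close of 20000.0 is purely a hypothetical placeholder. The scaled prediction is hard-coded in place of the model output.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the training frame; columns are [date_delta, Close].
# The date range 0..1653 and the minimum Close 68.43 are inferred from the post;
# the maximum Close of 20000.0 is a made-up placeholder, not the real value.
train = np.array([
    [0.0, 68.43],
    [800.0, 20000.0],
    [1653.0, 7022.76],
])

scaler = MinMaxScaler().fit(train)

# One (min, max) pair is kept per column.
print("per-column minimum:", scaler.data_min_)
print("per-column maximum:", scaler.data_max_)

# Stand-in for the model output: a price in scaled units.
scaled_price = 0.73674025

# In column 0 the value is rescaled with the date_delta range.
as_date = scaler.inverse_transform([[scaled_price, 0.0]])[0, 0]

# In column 1 the value is rescaled with the Close range.
as_price = scaler.inverse_transform([[0.0, scaled_price]])[0, 1]

print("read back from column 0 (date_delta units):", as_date)
print("read back from column 1 (Close units):", as_price)
```

Under these assumptions the column 0 result comes out at roughly 1217.83, which is 0.73674025 multiplied by the 1653-day range and is very close to the figure reported above. The column 1 result depends on the placeholder maximum Close, so it only illustrates the mechanism and is not an estimate of the actual price.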