@AbhishekHP wrote:
Please consider the sample dataset below.
In simple terms:
A sensor has been defective, and has therefore recorded incorrect values, since 2000. We have 10 years of data containing both the measured and the actual values. P.S. We don't have data for every combination of application and sensor type in every month.
Now we want an algorithm that estimates the actual values from the measured ones.
We tried XGBoost and CatBoost by creating another column, diff = measured - actual,
and feeding it to the algorithm to identify the pattern. We are not sure which algorithm is appropriate. We suspect a neural network or a time series model (ARIMA) could work, but we are unsure
because we have only 10 years of data at a monthly level.

    library(tidyverse)

    # rows of "." stand in for the months omitted from this sample
    train_data <- data.frame(
      time = c(rep("01.2000",10), rep("02.2000",10), rep(".",3),
               rep("11.2010",10), rep("12.2010",10)),
      application = c(rep("factory",4), rep("residential",3), rep("research",3),
                      rep("factory",2), rep("residential",5), rep("research",3),
                      rep(".",3),
                      rep("factory",2), rep("residential",2), rep("research",6),
                      rep("factory",7), rep("residential",1), rep("research",2)),
      sensor = c(LETTERS[1:10], LETTERS[10:1], rep(".",3),
                 LETTERS[c(5:1,10:6)], LETTERS[c(3:9,2,1,10)]),
      measured = c(26.4,2000,1001,23.9,100000,0,1234,12098,34567,0,
                   123,676,12,0,100,0,0,98,1,190,
                   rep(".",3),
                   3454,0,101,9,1,0,14,1298,677,0,
                   264,20220,1851,3.9,1044,0,1764,0,34,0),
      actual = c(26.4,2010,1001,23.9,100100,237,1234,12098,34567,19583,
                 123,706,1112,156,100,650,109,98,10,190,
                 rep(".",3),
                 3454,10,101,19,10,40,44,1298,760,50,
                 264,20220,1851,39,1048,870,1765,40,35,1110)
    )

    # to forecast actual
    test_data <- data.frame(
      time = rep("01.2011",10),
      application = c(rep("factory",7), rep("residential",1), rep("research",2)),
      sensor = LETTERS[c(1,4,5,9,3,2,8,6,7,10)],
      measured = c(26.4,100000,0,0,
                   123,12,
                   3454,0,20220,1851)
    )

How can we predict/forecast the actual values for the 01.2011 data (test_data)?
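For reference, the diff step described above can be sketched in base R roughly as follows. This is only a minimal sketch, not the exact pipeline we ran: it uses just the 01.2000 rows from the sample, and the names toy, date, month, year and actual_check are hypothetical helpers introduced for illustration.

```r
# Illustrative subset: the ten 01.2000 rows from the sample data in the post
toy <- data.frame(
  time        = rep("01.2000", 10),
  application = c(rep("factory", 4), rep("residential", 3), rep("research", 3)),
  sensor      = LETTERS[1:10],
  measured    = c(26.4, 2000, 1001, 23.9, 100000, 0, 1234, 12098, 34567, 0),
  actual      = c(26.4, 2010, 1001, 23.9, 100100, 237, 1234, 12098, 34567, 19583)
)

# Parse "MM.YYYY" into a Date (first day of the month) so that month and year
# can be used as numeric features (an assumption about useful features)
toy$date  <- as.Date(paste0("01.", toy$time), format = "%d.%m.%Y")
toy$month <- as.integer(format(toy$date, "%m"))
toy$year  <- as.integer(format(toy$date, "%Y"))

# Target column from the post: diff = measured - actual
toy$diff <- toy$measured - toy$actual

# Given a diff (here the true one, later a predicted one), the actual value
# is recovered as measured - diff
toy$actual_check <- toy$measured - toy$diff

print(toy[, c("sensor", "measured", "actual", "diff")])
```

With a model that predicts diff for test_data, the same identity (actual = measured - diff) would give the forecast of actual.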