
Converting data frame into Time Series using R


@abishekpoddar wrote:

I have time series data in the following format:

                Time Ask Bid Trade Ask_Size Bid_Size Trade_Size
2016-11-01 09:00:12  NA 901    NA       NA      100         NA
2016-11-01 09:00:21  NA  NA   950       NA       NA          5
2016-11-01 09:00:21  NA 950    NA       NA        5         NA
2016-11-01 09:00:21 905  NA    NA       10       NA         NA
2016-11-01 09:00:24  NA 921    NA       NA      500         NA
2016-11-01 09:00:28  NA 879    NA       NA        2         NA

The structure of the data frame is:

 str(df)

'data.frame':   35797 obs. of  7 variables:
$ Time      : POSIXct, format: "2016-11-01 09:00:12" "2016-11-01 09:00:21" ...
$ Ask       : num  NA NA NA 905 NA NA 1040 NA NA 905 ...
$ Bid       : num  901 NA 950 NA 921 879 NA NA 950 NA ...
$ Trade     : num  NA 950 NA NA NA NA NA 950 NA NA ...
$ Ask_Size  : num  NA NA NA 10 NA NA 6 NA NA 10 ...
$ Bid_Size  : num  100 NA 5 NA 500 2 NA NA 5 NA ...
$ Trade_Size: num  NA 5 NA NA NA NA NA 5 NA NA ...

I am trying to convert it to a time series object with the following code:

library(zoo)
library(xts)
library(lubridate)

df_ts <- xts(x = df, order.by = df$Time)

but I am getting unexpected output:

                    Time                  Ask       Bid      Trade Ask_Size Bid_Size Trade_Size
2016-11-01 01:00:03 "2016-11-01 01:00:03" NA        "938.10" NA    NA       " 203"   NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" NA        "937.20" NA    NA       " 100"   NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" " 938.00" NA       NA    "  28"   NA       NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" NA        "938.10" NA    NA       " 203"   NA
2016-11-01 01:00:04 "2016-11-01 01:00:04" " 939.00" NA       NA    "  11"   NA       NA
2016-11-01 01:00:05 "2016-11-01 01:00:05" NA        "938.15" NA    NA       "  19"   NA

The "Time" values appear twice (once as the index and again as a column), and the output starts at 01:00 instead of 09:00, which is where the original data frame starts. The row order also no longer matches the original data frame. Please help.
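
A possible fix, offered as a minimal sketch and not a confirmed answer: passing the whole data frame as x makes xts convert it with as.matrix(), and because the POSIXct "Time" column is not numeric, every column is coerced to character, which would explain the quoted values and the repeated timestamp. Leaving "Time" out of x keeps the data numeric. The 01:00 start is harder to pin down from the output alone. xts always sorts rows by the index, so the original data frame may simply not be in time order, or the index may be printed in a different timezone from the one the data was recorded in. The sample data below is hypothetical (it copies the first rows shown above), and "UTC" is an assumption to be replaced with the real timezone of the data.

```r
library(xts)  # attaches zoo as well

# Hypothetical sample copied from the first rows of the question's data.
# tz = "UTC" is an assumption; use the timezone the data was recorded in.
df <- data.frame(
  Time = as.POSIXct(c("2016-11-01 09:00:12", "2016-11-01 09:00:21",
                      "2016-11-01 09:00:21", "2016-11-01 09:00:21"),
                    tz = "UTC"),
  Ask        = c(NA, NA, NA, 905),
  Bid        = c(901, NA, 950, NA),
  Trade      = c(NA, 950, NA, NA),
  Ask_Size   = c(NA, NA, NA, 10),
  Bid_Size   = c(100, NA, 5, NA),
  Trade_Size = c(NA, 5, NA, NA)
)

# Keep only the value columns in x so the underlying matrix stays numeric;
# the timestamps go into the index through order.by.
value_cols <- setdiff(names(df), "Time")
df_ts <- xts(x = df[, value_cols], order.by = df$Time, tzone = "UTC")

# Inspect the result: numeric columns, no duplicated Time column.
str(df_ts)
head(df_ts)
```

If the real data frame is not already sorted by time, comparing range(df$Time) with head(df$Time) should show whether the earliest timestamps really are around 01:00.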


