@mtare wrote:
Hi,
I have around 100 GB of log data in CSV format and I want to do exploratory analysis on it. Since pandas loads data into memory, I am looking for alternatives. I have tried GraphLab's SFrame on my 8 GB RAM machine, but it takes too long to process even a subset of the data. Other options I am considering are Spark DataFrames or an MPP database. Can you please suggest the best approach for handling this amount of data? Also, since the data set is large, which visualization libraries can be used to plot it?
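For context, here is a minimal sketch of the kind of out-of-core summary I have in mind: stream the CSV one row at a time and keep only a small aggregate in memory, so the full file never has to fit in RAM. It uses only the Python standard library. The column names (`timestamp`, `level`, `message`) and the helper `count_by_column` are made up for illustration and are not from my real data. As far as I know, pandas can do something similar with the `chunksize` argument of `read_csv`.

```python
import csv
import io
from collections import Counter


def count_by_column(lines, column):
    """Stream CSV rows and count how often each value of `column` occurs."""
    counts = Counter()
    reader = csv.DictReader(lines)
    for row in reader:  # only one row is held in memory at a time
        counts[row[column]] += 1
    return counts


# Tiny in-memory stand-in for a log file; the schema is hypothetical.
sample = io.StringIO(
    "timestamp,level,message\n"
    "2016-01-01T00:00:00,INFO,started\n"
    "2016-01-01T00:00:01,ERROR,disk full\n"
    "2016-01-01T00:00:02,INFO,retrying\n"
)

level_counts = count_by_column(sample, "level")
print(level_counts.most_common())
```

On a real file the same function could be called with an open file handle, for example `with open("logs.csv", newline="") as f: count_by_column(f, "level")`, where `logs.csv` is a placeholder name. The resulting counts are small enough to plot with any library, but this only covers simple aggregates, not the broader exploration I am asking about.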
Posts: 5
Participants: 2