Channel: Data Science, Analytics and Big Data discussions - Latest topics

Help required to choose processing architecture


@pathardepavan wrote:

I am trying to put together a solution for the use case below. We are an AWS Mobile Analytics customer with half a million users, generating more than 2 million events at peak hours in a day. This event data is exported to S3 on an hourly basis.

Now we want to process this data and build our own tables on an hourly basis, while scaling as the user count grows. We should be able to process at least 20-40 million events per hour. The data is stored as compressed files containing one JSON object per line.
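To make the target and the file format concrete, here is a minimal Python sketch of the per-file work involved. It assumes the export is gzip-compressed and that each event carries a type field; the field name `event_type` and both function names are placeholders chosen for illustration, not something confirmed by the export format.

```python
import gzip
import io
import json
from collections import Counter


def required_events_per_second(events_per_hour: int) -> float:
    """Convert an hourly event target into a sustained per-second rate."""
    return events_per_hour / 3600


def count_events_by_type(gz_bytes: bytes, type_field: str = "event_type") -> Counter:
    """Stream a gzip-compressed JSON-lines payload and count events per type.

    `type_field` is an assumed field name; adjust it to the real schema.
    """
    counts = Counter()
    # Wrap the bytes in a file object so gzip.open reads them as data,
    # and decode line by line instead of loading everything into memory.
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            event = json.loads(line)
            counts[event.get(type_field, "unknown")] += 1
    return counts
```

By this arithmetic, 20-40 million events per hour works out to roughly 5,600 to 11,100 events per second sustained, which is the rate any per-file worker pool would need to hold in aggregate.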

  1. What is the best data store for creating the tables?
  2. What is the best architecture for processing the data? Is the last solution proposed at http://theburningmonk.com/2016/04/aws-lambda-use-recursive-function-to-process-sqs-messages-part-1/, which uses recursive Lambda functions, a good fit? Please suggest.
  3. We need to store all our data in one platform that business applications can use. Redshift seems slow for us with 300 GB of data and 4 nodes (dc1.large).



