Kafka is a fast, scalable, durable, distributed messaging system. It can handle hundreds of megabytes of reads and writes per second from several thousand clients. Kafka offers a very robust solution for messaging abstractions, allowing for a highly scalable data sol...
Amazon Redshift is a petabyte-scale data warehouse service from AWS that is highly scalable and cost-effective. Recently, the AWS team has been adding new features and capabilities at a rapid pace. We have used Redshift in solutions where we wanted to support analytics workloads on a very ...
For our analytics solution, we have relied on Kinesis to handle massive amounts of real-time streaming data. One of our solutions requires a real-time dashboard on a very busy media site, which can have a few thousand concurrent users at any point in time. Kinesis is a major part of this puzzle, bec...
When it comes to real-time analytics, nothing comes close to the feature set provided by Storm. It supports continuous computation on incoming streams of data, making it possible to implement real-time analytics incrementally. Along with the ETL capabilities, the fact that it supp...
Apache Spark is an open-source cluster computing framework from the Apache Software Foundation. Spark's in-memory capabilities enable very fast performance in specific use cases. The most suitable use case for Apache Spark is one where the data can be loaded into the cluster's memory and then queried multiple ...