Saturday, February 28, 2015

eBay Pulsar

Pulsar is a popular word among bike riders in India. In the big data industry, the term has a different meaning.

This week, eBay announced Pulsar, an open-source, real-time analytics platform and stream processing framework. Pulsar can be used to collect and process user and business events in real time, providing key insights and enabling systems to react to user activities within seconds.

Pulsar uses a SQL-like event processing language to offer custom stream creation through data enrichment, mutation, and filtering. Pulsar scales to a million events per second with high availability. It can be easily integrated with metrics stores like Cassandra and Druid.
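
To make the three stream operations concrete, here is a minimal sketch in plain Python. It is not Pulsar's event processing language or API; the event fields (`type`, `ip`, `price`), the `GEO_BY_IP` lookup table, and all function names are assumptions made up for illustration. It only shows how filtering, enrichment, and mutation can be chained to create a custom stream.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, object]

# Hypothetical lookup table used for enrichment (not part of Pulsar).
GEO_BY_IP = {"10.0.0.1": "IN", "10.0.0.2": "US"}


def filter_events(events: Iterable[Event], event_type: str) -> Iterator[Event]:
    """Filtering: keep only events of the requested type."""
    return (e for e in events if e.get("type") == event_type)


def enrich(events: Iterable[Event]) -> Iterator[Event]:
    """Enrichment: attach a country code looked up from the IP address."""
    for e in events:
        yield {**e, "country": GEO_BY_IP.get(e.get("ip"), "unknown")}


def mutate(events: Iterable[Event]) -> Iterator[Event]:
    """Mutation: derive an integer price in cents from the string price."""
    for e in events:
        yield {**e, "price_cents": round(float(e["price"]) * 100)}


def build_stream(events: Iterable[Event], event_type: str = "view") -> Iterator[Event]:
    """Chain the three steps lazily, one event at a time."""
    return mutate(enrich(filter_events(events, event_type)))
```

Because each step is a generator, events flow through the chain one by one instead of being collected into batches first, which is the basic idea behind processing a stream rather than a stored data set.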

eBay's business has a few large-scale, real-time analytics use cases, such as:

  • Real-time reporting and dashboards
  • Business activity monitoring
  • Personalization
  • Marketing and advertising
  • Fraud and bot detection

A traditional batch-oriented platform does not fit these use cases, so eBay built Pulsar to meet the demand for collecting and processing vast numbers of events in near real time (within seconds), in order to derive actionable insights and generate signals for immediate action.

Pulsar Home Page:

Friday, February 20, 2015

Microsoft Cosmos

Microsoft is set to launch a new data-crunching framework, Cosmos, a potential competitor to Hadoop and, eventually, to Google’s homegrown Dataflow.

Microsoft Cosmos is used extensively within the company to aggregate data from every major service into a shared pool. These services include Azure, Skype, and search engine Bing.

It is similar to MapReduce, the heart of Hadoop, and it offers a structured query interface. In addition, Cosmos supports directed acyclic graphs (DAGs), a way of modeling how different kinds of information connect to one another.
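
As a rough illustration of what a DAG of processing steps looks like, here is a small Python sketch. It says nothing about how Cosmos itself is built; the stage names and the `execution_order` helper are hypothetical. Each stage lists the stages it depends on, and a topological sort yields an order in which every stage runs only after its inputs are ready.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stages: each key maps to the set of stages it depends on.
STAGES = {
    "extract_logs": set(),
    "extract_users": set(),
    "join": {"extract_logs", "extract_users"},
    "aggregate": {"join"},
}


def execution_order(dag):
    """Return one valid run order; raises graphlib.CycleError if the graph has a cycle."""
    return list(TopologicalSorter(dag).static_order())
```

Unlike a fixed map-then-reduce pair, a graph like this can have any number of stages and several inputs feeding one step.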

Prior to Cosmos, Microsoft’s homegrown alternative to Hadoop’s batch processing platform was developed until 2011 and was hailed as a potential Hadoop challenger.

Microsoft CEO Satya Nadella outlined the company’s path to delivering a platform for ambient intelligence at a past customer event, stressing a “data culture.” I covered it in an earlier blog post: