Monday, June 27, 2016


Apache Zeppelin is an open-source, web-based GUI for creating interactive and collaborative notebooks for data exploration using Spark. You can use Scala, Python, SQL (using Spark SQL), or HiveQL to manipulate data and quickly visualize results.

Zeppelin notebooks can be shared among several users, and visualizations can be published to external dashboards. Zeppelin uses the Spark settings on your cluster and can use Spark’s dynamic allocation of executors to let YARN estimate the optimal resource consumption.

To run our prediction analysis, we created notebooks that generate prediction percentages and are scheduled to run daily. As part of the analysis, we needed to connect to multiple data sources, such as MySQL and Vertica, for data ingestion and error rate generation. This enabled us to aggregate data across multiple dimensions, exposing underlying issues and anomalies at a glance.
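To give a feel for what "aggregating across multiple dimensions" means, here is a minimal sketch in plain Python. It is not our actual notebook code: the function name `error_rate_by` and the record fields (`region`, `source`, `is_error`) are made up for illustration, and in Zeppelin the same grouping would normally be done with Spark SQL.

```python
from collections import defaultdict


def error_rate_by(records, dimensions):
    """Return the error percentage for each combination of the given dimensions.

    `records` is a list of dicts; the `is_error` field is a hypothetical flag.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [error_count, total_count]
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        if rec["is_error"]:
            totals[key][0] += 1
        totals[key][1] += 1
    return {key: 100.0 * errors / total for key, (errors, total) in totals.items()}


# Hypothetical sample rows, standing in for data pulled from MySQL and Vertica.
records = [
    {"region": "us", "source": "mysql", "is_error": False},
    {"region": "us", "source": "mysql", "is_error": True},
    {"region": "eu", "source": "vertica", "is_error": False},
    {"region": "eu", "source": "vertica", "is_error": False},
]

rates = error_rate_by(records, ["region", "source"])
rates_by_region = error_rate_by(records, ["region"])
```

Changing the list of dimensions is all it takes to slice the same data a different way, which is the part that makes anomalies easy to spot.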

Using Zeppelin, we applied many A/B models by replaying our raw data in AWS S3 to generate different prediction reports, which in turn helped us move in the right direction and provide better forecasting.

Zeppelin helps us turn huge amounts of raw data, often spread across different data stores, into consumable information with useful insights.

Slide share reference is available at

Wednesday, June 15, 2016

AWS Lambda

What is AWS Lambda?
AWS Lambda is a compute service: you upload your code, and the service runs it on your behalf using AWS infrastructure.

AWS Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, code monitoring and logging.

All you need to do is supply your code in one of the languages that AWS Lambda supports (currently Node.js, Java, and Python).
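To make "supply your code" concrete, here is a minimal sketch of a Python handler. The `(event, context)` signature is the standard shape for Python functions on Lambda; the `name` field in the event and the `statusCode`/`body` response layout are assumptions for this example (that layout is what you would typically return behind API Gateway).

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls; `event` carries the request payload."""
    # "name" is a hypothetical input field, used here only for illustration.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "Hello, " + name}),
    }


# Local smoke test; on Lambda the service supplies both arguments.
result = lambda_handler({"name": "Zeppelin"}, None)
default_result = lambda_handler({}, None)
```

Because the handler is just a function, it can be exercised locally like this before it is uploaded.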

Best Use Cases
AWS Lambda executes your code only when needed and scales automatically, from a few requests per day to thousands per second.

It's an ideal compute platform for many application scenarios; the five best use cases are listed below:

Production Site is well known as the serverless start-up, entirely built around AWS, but leveraging only the Amazon API Gateway, Lambda functions, DynamoDB, S3 and CloudFront.

This is not only a really scalable solution with an almost infinite peak capacity, but also a very cheap solution (< $90K) as per their publication.

At the back of my mind, Martin Fowler's Infrastructure as Code keeps coming up. He said, "Infrastructure as code is the approach to defining computing and network infrastructure through source code that can then be treated just like any software system."
It was an awesome experience learning this emerging technology and its business value.

Saturday, June 4, 2016

I/O 2016

As I wrote in my weekly blog last month, Google I/O has just wrapped up with some exciting industry updates. Let's see what the new offerings are.

Google I/O 2016, Alphabet’s annual developer conference, has wrapped up for the year after running May 18-20 at the Shoreline Amphitheatre in Mountain View, California. This year, the announcements came thick and fast, but many were not quite ready for general demoing or product release, so we’ll have to wait a little longer to see them in action.

Here are a few major announcements from Google I/O 2016 that interested me.

Google Assistant
Assistant is a kind of revamped version of Google Now. It's a beefed-up voice assistant that can help you do much more than ever before.

It can now provide users with more natural two-way conversations and do much more than just schedule events and perform quick Google searches.

Google Home
Home is the central hub for connecting Google Assistant with your connected home, smart devices and more. It's a direct competitor to the Amazon Echo.

Like Echo, Google Home integrates a built-in Bluetooth speaker and microphone, all in a small, sleek package. Home is meant to be your hub for all things Google, right from your living room, kitchen, or wherever you place it.

Google Allo
Allo is a new phone number-based messaging app with three main focuses: self-expression, Assistant integration, and security and privacy.

Allo features end-to-end encryption, incognito chats, private notifications and expiring chats, and it will be the “first home for Google Assistant”.

Google Duo
Duo is a new cross-platform video calling app and a companion to Allo. It lets you start fast one-to-one video calls.

Duo also lets you see the person calling you, in a live video preview, before you even answer the call.

Daydream VR (Virtual Reality)
It's Google’s vision for an affordable mobile virtual reality platform. The first Daydream-ready devices will arrive later this year.

Android N will feature a VR Mode that puts chipsets into “performance mode”, adds head-tracking algorithms, supports sub-20 ms latency on mobile devices, and renders incoming messages and calls in 3D so they appear in stereo.

On the software front, Daydream provides a standard VR interface for mobile and adds a VR category to the Play Store.

Google Chrome OS
Google officially denied the rumors that Chrome OS would be folded into Android.

After tackling wearables, TVs, and autos, Google is in need of a real computer operating system. So, Chrome OS will live on in some form, even if it’s just functionality integrated into Android.

Android "N"
It's the code name for the upcoming release of the Android OS (operating system). It has a few key features, such as multi-window support, direct-reply notifications, Data Saver, picture-in-picture, and no device flashing. The release schedule for N will be:

Watch List
The complete set of Google I/O 2016 videos is available as a YouTube series. Enjoy it at

Stay tuned for more on all the developments from Google I/O.