Sunday, January 14, 2018
Monday, January 1, 2018
AWS Training and Certification recently released free digital training courses that will make it easier for you to build your cloud skills and learn about using AWS Big Data services. This training includes courses like Introduction to Amazon EMR and Introduction to Amazon Athena.
You can get free and unlimited access to more than 100 new digital training courses built by AWS experts at aws.training. It’s easy to access training related to big data. Just choose the Analytics category on our Find Training page to browse through the list of courses. You can also use the keyword filter to search for training for specific AWS offerings.
Reference link: https://www.aws.training/
Just getting started, or looking to learn about a new service? Check out the following digital training courses:
Introduction to Amazon EMR (15 minutes)
Covers the available tools that can be used with Amazon EMR and the process of creating a cluster. It includes a demonstration of how to create an EMR cluster.
Introduction to Amazon Athena (10 minutes)
Introduces the Amazon Athena service along with an overview of its operating environment. It covers the basic steps in implementing Athena and provides a brief demonstration.
Introduction to Amazon QuickSight (10 minutes)
Discusses the benefits of using Amazon QuickSight and how the service works. It also includes a demonstration so that you can see Amazon QuickSight in action.
Introduction to Amazon Redshift (10 minutes)
Walks you through Amazon Redshift and its core features and capabilities. It also includes a quick overview of relevant use cases and a short demonstration.
Introduction to AWS Lambda (10 minutes)
Discusses the rationale for using AWS Lambda, how the service works, and how you can get started using it.
Introduction to Amazon Kinesis Analytics (10 minutes)
Discusses how Amazon Kinesis Analytics collects, processes, and analyzes streaming data in real time. It discusses how to use and monitor the service and explores some use cases.
Introduction to Amazon Kinesis Streams (15 minutes)
Covers how Amazon Kinesis Streams is used to collect, process, and analyze real-time streaming data to create valuable insights.
Introduction to AWS IoT (10 minutes)
Describes how the AWS Internet of Things (IoT) communication architecture works, and the components that make up AWS IoT. It discusses how AWS IoT works with other AWS services and reviews a case study.
Introduction to AWS Data Pipeline (10 minutes)
Covers components like tasks, task runner, and pipeline. It also discusses what a pipeline definition is, and reviews the AWS services that are compatible with AWS Data Pipeline.
Wednesday, December 20, 2017
Interesting read on 5 Overlooked Opportunities in Agile Estimation
- Learn from Others
- Identify Alternatives
- Validate the Story
- Improve Teammates’ Estimating Ability
- Reinforce Scrum Values
"The reward for work well done is the opportunity to do more."
Merry Xmas and Happy New Year 2018.
Sunday, November 26, 2017
As indicated in early July 2017, Microsoft now officially supports Apache Spark in Azure HDInsight. Earlier reference: http://www.zdnet.com/article/spark-comes-to-azure-hdinsight/
Last week, Microsoft announced the Azure Databricks service, new Cosmos DB features, enterprise AI capabilities and more at its annual Connect(); event in New York.
Microsoft is getting the Apache Spark religion, introducing a new cloud service in preview, called Azure Databricks. This is noteworthy for a number of reasons.
First, the service was developed jointly by Microsoft and Databricks (the company whose founders are Spark's very creators), to deliver this Spark-based Big Data analytics service as a first-party Azure offering, and not a mere partner service on the Azure Marketplace.
Second, the service works independently of Databricks' own cloud service for Spark and of Azure HDInsight, Microsoft's own Big Data as a Service platform, on which Spark also runs.
Azure Databricks has nonetheless been designed from the ground up to take advantage of, and be fully optimized for, various Azure services, including blob storage, Data Lake Store, virtual networking, Azure Active Directory and Azure Container Service.
While Azure Databricks, like HDInsight, is still based on the creation of a dedicated cluster, with the number and type of nodes (servers) being determined by the customer, it nonetheless has built-in auto-scaling and auto-termination, to grow the cluster as necessary and shut it down once it's no longer needed.
Sunday, November 19, 2017
In the middle of last week, Elasticsearch 6 reached GA (General Availability), with technical upgrades such as:
- migration assistant
- index sorting.
Saturday, November 11, 2017
Kinesis Analytics now gives you the option to preprocess your data with AWS Lambda. This gives you a great deal of flexibility in defining what data gets analyzed by your Kinesis Analytics application. You can also define how that data is structured before it is queried by your SQL.
Kinesis Analytics continuously reads data from your Kinesis stream or Kinesis Firehose delivery stream. For each batch of records that it retrieves, the Lambda processor subsystem manages how each batch gets passed to your Lambda function. Your function receives a list of records as input. Within your function, you iterate through the list and apply your business logic to accomplish your preprocessing requirements (such as data transformation).
The input model to your preprocessing function varies slightly, depending on whether the data was received from a stream or a delivery stream.
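The preprocessing flow described above can be sketched as a small Lambda handler. This is a minimal illustration, not AWS's reference implementation: it assumes each record arrives with a `recordId` and a base64-encoded `data` field, that the payload is JSON, and that the function returns one entry per record with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The `transform` helper and the `ticker` field are hypothetical and stand in for your own business logic.

```python
import base64
import json


def transform(payload):
    """Hypothetical business logic: normalise a 'ticker' field to upper case."""
    payload["ticker"] = str(payload.get("ticker", "")).upper()
    return payload


def lambda_handler(event, context):
    """Preprocess a batch of records before the SQL code queries them."""
    output = []
    for record in event["records"]:
        try:
            # Record data is delivered base64-encoded; assume it holds JSON.
            payload = json.loads(base64.b64decode(record["data"]))
            transformed = transform(payload)
            encoded = base64.b64encode(
                json.dumps(transformed).encode("utf-8")
            ).decode("utf-8")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": encoded}
            )
        except (ValueError, KeyError, TypeError, AttributeError):
            # Hand the original data back and flag the record as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    # Every input record gets exactly one entry in the response.
    return {"records": output}
```

Returning every `recordId` exactly once lets the service match results to inputs; a record you want to filter out would be returned with `result` set to `Dropped` rather than omitted.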
Saturday, October 21, 2017
Last week brought some interesting industry news: Microsoft and Amazon announced a surprise partnership to build an AI platform for enterprises, named Gluon. Gluon is open source and makes it easier for developers to build AI and machine learning systems and related apps.
I see an interesting dimension to this technology partnership: it challenges Google's dominance in AI, which is built on TensorFlow.
Google TensorFlow
Google already has a head start with a tool called TensorFlow, which is free and open source and aimed at helping developers build machine learning apps. TensorFlow is immensely popular with developers.
In fact, it's the fifth most popular project (by stars) on GitHub out of the over 2 million hosted on that site where open source projects are shared. Quick introduction video is shown at
Amazon MXNet
Naturally, Amazon has a competitor to TensorFlow called MXNet. Deep learning on AWS with MXNet is shown at https://www.youtube.com/watch?v=emDxDLI9FRw
Microsoft CNTK
Microsoft has a competing tool to TensorFlow called CNTK (Cognitive Toolkit). Microsoft's open source deep-learning toolkit is shown at https://www.youtube.com/watch?v=Rb9K10JwR2g
Machine learning and AI are the next big things in cloud computing, with the potential to cause significant changes to the cloud business that Amazon and Microsoft have long dominated.
Microsoft and Amazon have been known to cooperate on other AI technologies.
In August, the two announced they were partnering to make their two voice assistants, Amazon Alexa and Microsoft Cortana, work better together.
Joint Venture Gluon
Microsoft and Amazon have joined forces to help spread artificial intelligence across apps. They released a new tool for developers called Gluon as a free and open source project, meaning anyone can use it or work on it and contribute to it for free.
Gluon's role is to add a layer that makes MXNet and CNTK easier to use, work with and program. Only the MXNet version has been released so far; the CNTK version of Gluon is promised to come soon. A short introduction is shown at https://www.youtube.com/watch?v=NxAYuw5QQ_8
Ease for AI Development
In any case, the competition is on to create more AI tools for developers and to make them easier to use. The demand for artificial intelligence in various industries is reflected in a three-year scorecard.