Yahoo! is enhancing its web properties and mobile applications to provide its users personalized experience based on interest profiles. To compute user interest, we process billions of events from our over 700 million users, and analyze 2.2 billion content every day. Since users' change interest over time, we need to update user profiles to reflect their current interests.
Enabling low-latency big-data processing is one of the primary design goals of Yahoo!’s next-generation big-data platform. While MapReduce is a key design pattern for batch processing, additional design patterns will be supported over time. Stream/micro-batch processing is one of design patterns applicable to many Yahoo! use cases.
Yahoo! big-data platform enables Hadoop applications and Storm applications to share data via shared storage such as HBase. Yahoo! engineering teams are developing technologies to enable Storm applications and Hadoop applications to be hosted on a single cluster.