Saturday, July 9, 2011

Azure Table


Windows Azure Table storage causes a lot of confusion and questions among developers because of their traditional RDBMS background. Most of their experience with data storage is with relational databases that have various tables, each containing a predefined set of columns, one or more of which are typically designated as identity keys.

By design, the Windows Azure Table service can store enormous amounts of data while enabling efficient access and persistence. A table doesn’t have a specified schema. It’s simply a structured container of rows (entities) that can either hold one particular type of entity or hold rows with varying structures in a single table, as shown in the attached image. Windows Azure Tables use keys that enable efficient querying, and that enable load balancing when the table service decides it is time to spread your table over multiple servers. How?

In Windows Azure Tables, the string PartitionKey and RowKey properties work together as an index for the table, so when defining them, you must consider how your data is queried. Each entity in a table must have a unique PartitionKey/RowKey combination, which acts as the primary key for the row. The PartitionKey is used to locate the rows a query asks for. It is also used to physically partition the table, which provides load balancing and scalability. Let's look at an illustration.
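As a rough sketch of how the two keys behave together, the toy class below models a table in plain Python. It is not the Azure SDK: the class `InMemoryTable`, its methods, and the sample employees are all made up for illustration. It only shows that entities need no fixed schema, that the PartitionKey/RowKey pair must be unique, and that rows sharing a PartitionKey can be fetched together.

```python
from collections import defaultdict


class InMemoryTable:
    """Hypothetical stand-in for a table: entities grouped by PartitionKey, then RowKey."""

    def __init__(self):
        # partition key -> {row key -> entity dict}
        self._partitions = defaultdict(dict)

    def insert(self, entity):
        pk, rk = entity["PartitionKey"], entity["RowKey"]
        # The (PartitionKey, RowKey) pair acts as the primary key, so reject duplicates.
        if rk in self._partitions[pk]:
            raise KeyError(f"duplicate key ({pk!r}, {rk!r})")
        self._partitions[pk][rk] = dict(entity)

    def get(self, partition_key, row_key):
        """Point lookup using the full key."""
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        """Return every entity in one partition, ordered by RowKey."""
        rows = self._partitions.get(partition_key, {})
        return [rows[rk] for rk in sorted(rows)]


table = InMemoryTable()
# Entities in the same table may carry different properties (no fixed schema).
table.insert({"PartitionKey": "Developer", "RowKey": "E002", "Name": "Asha", "Language": "C#"})
table.insert({"PartitionKey": "Developer", "RowKey": "E001", "Name": "Ravi"})
# The same RowKey is fine in another partition, because only the pair must be unique.
table.insert({"PartitionKey": "Manager", "RowKey": "E001", "Name": "Meena", "TeamSize": 8})

developers = table.query_partition("Developer")
```

Here `developers` holds only the two Developer rows, while the Manager row with the same RowKey stays untouched in its own partition.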

Let's consider a table named EmployeeMaster whose PartitionKeys correspond to job designation types, such as Director, Manager, Developer, etc. During a DeveloperSummit, the rows in the Developer partition might be very busy (in the hope of finding niche developers!). The service can load balance the EmployeeMaster table by moving the Developer partition to a different server, so that the many requests made to that partition are handled better. If you anticipate more activity on that partition than a single server can handle, you should consider creating more granular partitions such as DeveloperJunior and DeveloperSenior. This is because the unit of granularity for load balancing is the PartitionKey.
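To make the granularity point concrete, here is a small sketch that counts how many requests would land on each PartitionKey under a coarse key and under a more granular one. Everything in it is an assumption for illustration: the `Level` property, the helper functions, and the sample rows are not part of any Azure API, and where partitions are actually placed is decided by the service, not by this code.

```python
from collections import Counter


def coarse_key(employee):
    # One partition per designation: Director, Manager, Developer, ...
    return employee["Designation"]


def granular_key(employee):
    # Split only the busy Developer partition by seniority (hypothetical Level property).
    if employee["Designation"] == "Developer":
        return "Developer" + employee["Level"]
    return employee["Designation"]


def requests_per_partition(key_fn, rows):
    """Count how many of the requested rows fall under each PartitionKey."""
    return Counter(key_fn(row) for row in rows)


# Made-up rows, each standing for one request during the busy period.
requested = [
    {"Id": "E001", "Designation": "Developer", "Level": "Junior"},
    {"Id": "E002", "Designation": "Developer", "Level": "Senior"},
    {"Id": "E003", "Designation": "Developer", "Level": "Junior"},
    {"Id": "E004", "Designation": "Manager", "Level": "Senior"},
]

coarse = requests_per_partition(coarse_key, requested)
granular = requests_per_partition(granular_key, requested)
```

With the coarse key, all three developer requests fall on the single Developer partition. With the granular key, they are divided between DeveloperJunior and DeveloperSenior, which gives the service two separate units it could move to different servers.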

Isn't that cool for load balancing and scalability? Cloud storage developers just deal with data, data, data... Big Data!

2 comments:

  1. It is really nice to know the importance of the PartitionKey in a cloud environment. In what way is the cloud DB going to give more power than a traditional DB? While creating the partitions for junior and senior developers, should the site be brought offline, or can it be done while the DB is online?

  2. Partition key management is highly dynamic in a cloud environment. Users of cloud-based storage should assess the responsiveness, availability, and scalability of their hosting service. In terms of cost, it reduces the cost of owning and managing storage. That's the power of a cloud DB.
