Monday, February 27, 2012

Azure AppFabric Cache: Difference between Partitioned Cache and Replicated Cache

AppFabric Cache is not new to the distributed computing world. Microsoft released it under the code name "Velocity" for the Windows Server 2008 family a few years back, and Azure AppFabric Caching is more like an extension of it to the Azure platform.

What is AppFabric Cache?
Before getting into the different types of distributed cache, let us quickly go over the definition of a distributed cache.

Distributed caching is a form of caching that allows the cache to span multiple servers so that it can grow in size and in transactional capacity. Distributed caching has become feasible now for a number of reasons. First, memory has become very cheap, and you can stuff computers with many gigabytes at throwaway prices. Second, network cards have become very fast, with 1Gbit now standard everywhere and 10Gbit gaining traction. Finally, unlike a database server, which usually requires a high-end machine, distributed caching works well on lower cost machines (like those used for Web servers), which allows you to add more machines easily.

Distributed caching is scalable because of the architecture it employs. It distributes its work across multiple servers but still gives you a logical view of a single cache. For application data, a distributed cache keeps a copy of a subset of the data in the database. This is meant to be a temporary store, which might mean hours, days or weeks. In a lot of situations, the data being used in an application does not need to be stored permanently. In ASP.NET, for example, session data is temporary and needed for maybe a few minutes to a few hours at most.
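
The "temporary store" idea above can be sketched in a few lines. This is only a toy, single-process illustration and not the AppFabric API: the names TemporaryCache, load_with_cache and load_from_database are hypothetical, and the expiry time is an arbitrary value chosen for the example.

```python
import time


class TemporaryCache:
    """Toy stand-in for a cache that keeps entries only for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so expiry can be simulated
        self._entries = {}          # key -> (value, expiry time)

    def put(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl_seconds)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # The entry outlived its time-to-live, so drop it.
            del self._entries[key]
            return None
        return value


def load_with_cache(cache, key, load_from_database):
    """Return the cached value, falling back to the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_database(key)   # hypothetical slow lookup
        cache.put(key, value)
    return value


# Session-like data kept for 20 minutes (an assumed value).
session_cache = TemporaryCache(ttl_seconds=20 * 60)
user = load_with_cache(session_cache, "user:42", lambda key: {"id": key})
```

The database stays the permanent store; the cache only holds a copy of a subset of it until the entry expires.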

Difference between Partitioned Cache and Replicated Cache:
AppFabric Cache supports three caching techniques.
1>Partitioned Cache
2>Replicated Cache
3>Local Cache

I'm going to ignore Local Cache in this post, as it has been known to the computing world for ages.

To understand the difference between Partitioned Cache and Replicated Cache, let us imagine the Azure platform has a Cache Cluster with 10 Cache Hosts (servers).
Now suppose you have 100 CLR objects which need to be cached on this Cache Cluster.

Partitioned Caching will store 10 CLR objects on each of the 10 Cache Hosts. The application performs Get and Put operations against the cluster, and the cluster in turn picks the cached data from the host on which it is stored.

Advantage: A Partitioned Cache can store a huge amount of data because the data is distributed across the hosts. I.e., if you have 1 GB of data to be cached and the cluster has 10 hosts, then each host stores about 100 MB of data. Put operations are fast because each object is written to only one host.

Disadvantage: If any host server goes down, the data stored on that host is lost, so in general High Availability is not possible. Get operations can also be slower, since every request must go to the one host that holds the object.
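
A minimal sketch of the partitioned scenario above, assuming 10 hosts and 100 objects. This is not how AppFabric actually assigns objects to hosts: the PartitionedCache class is hypothetical, each host is just a dictionary, and the owner is picked with a simple modulo on an integer object id so that the split comes out at exactly 10 objects per host.

```python
class PartitionedCache:
    """Toy partitioned cache: every object lives on exactly one host."""

    def __init__(self, host_count):
        # Each dict stands in for the memory of one cache host.
        self.hosts = [{} for _ in range(host_count)]

    def _owner(self, object_id):
        # Assumed placement rule for illustration only.
        return object_id % len(self.hosts)

    def put(self, object_id, value):
        # A Put touches a single host.
        self.hosts[self._owner(object_id)][object_id] = value

    def get(self, object_id):
        # A Get must be routed to the one host that owns the object.
        return self.hosts[self._owner(object_id)].get(object_id)

    def fail_host(self, index):
        # Simulate a host going down: its share of the data is gone.
        self.hosts[index].clear()


partitioned = PartitionedCache(host_count=10)
for object_id in range(100):
    partitioned.put(object_id, f"object-{object_id}")

per_host = [len(host) for host in partitioned.hosts]
print(per_host)             # [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]

partitioned.fail_host(3)
print(partitioned.get(13))  # None, because object 13 lived on host 3
print(partitioned.get(14))  # object-14, because host 4 is still up
```

Losing one host loses one tenth of the cached objects, while the other nine hosts keep serving theirs.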

Replicated Caching will store all 100 objects on each of the 10 Cache Hosts in the above scenario. In general, each cache host holds a replica of all the cached data. I.e., if you have 1 GB of data to be cached and the cluster has 10 hosts, then each host holds the complete 1 GB of data. This option provides high availability, but it comes with a lot of memory cost. Put operations are going to be slow, as the data has to be synchronized across all the hosts.
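
The same scenario can be sketched for replication, again as a toy model rather than the AppFabric API. The ReplicatedCache class and its host_writes counter are hypothetical; the counter is only there to make the extra write work visible.

```python
class ReplicatedCache:
    """Toy replicated cache: every host holds a copy of every object."""

    def __init__(self, host_count):
        # Each dict stands in for the memory of one cache host.
        self.hosts = [{} for _ in range(host_count)]
        self.host_writes = 0   # counts individual per-host writes

    def put(self, key, value):
        # A Put has to reach every host to keep the replicas in sync.
        for host in self.hosts:
            host[key] = value
            self.host_writes += 1

    def get(self, key, host_index=0):
        # A Get can be answered by any single host.
        return self.hosts[host_index].get(key)

    def fail_host(self, index):
        # Simulate a host going down.
        self.hosts[index].clear()


replicated = ReplicatedCache(host_count=10)
for object_id in range(100):
    replicated.put(object_id, f"object-{object_id}")

copies_per_host = [len(host) for host in replicated.hosts]
print(copies_per_host[0])          # 100 objects on every host
print(replicated.host_writes)      # 1000 writes for 100 Puts

replicated.fail_host(3)
print(replicated.get(13, host_index=4))   # object-13, served by host 4
```

The trade-off shows up in the numbers: 100 Puts cost 1000 host writes, and by the same arithmetic 1 GB of cached data occupies 10 GB of memory across the 10 hosts, but any surviving host can still answer every Get.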
