NCache provides a rich set of clustering topologies to let you pick the one that suits your requirements best. Please note that NCache clustering is not the same as Windows Clustering. NCache forms its own cache-level cluster by using either UDP Multicasting or TCP protocols and can do so on Windows XP, 2000, and 2003 platforms (standard, professional, and server editions).
NCache offers many clustering topologies for object caching. Session clustering, however, is a specific application of clustered object caching and may therefore call for a different topology. For example, object caching usually has a master data source (a relational database or a mainframe) behind it, but sessions don't. Session data must therefore be duplicated across servers so that none of it is lost if a server goes down. NCache handles all of this. Below are the recommended topologies for session clustering with NCache.
Please note that these are only a subset of the clustering topologies that NCache provides for object caching. You can find out more about all of them by following the link below:
Object Caching Clustering Topologies
Generally speaking, if your web farm contains 2-6 web servers (6 is not a hard maximum but a general guideline), then your best option is to create a replicated session cache on your web servers. This means that your web servers are now nodes in the cache cluster. With this option, you'll always have sessions available local to each web server.
Additionally, if you configure your ASP.NET environment to have only one worker process per web server, you can even access your sessions InProc, which further speeds up session access because no inter-process communication is required. However, please note that this is only possible when you have a single ASP.NET worker process. Otherwise, you'll have to access sessions as OutProc on each web server. The diagram below shows what this looks like.

Generally speaking, if your web farm contains 6 or more web servers (6 is not a hard minimum but a general guideline), then your best option is to create a separate session caching tier and access it remotely from your web servers. If you don't have a lot of session data to cache, then you can even make some of your web servers cache the session data and have others access it remotely. Otherwise, you could have multiple dedicated servers caching sessions and then have your web servers access them remotely.
When setting up a separate caching tier, you can choose either a replicated cache or a partitioned cache with replicas. A partitioned cache is a good choice if you have so much session data for the entire web farm that it cannot fit on one cache server and the only option is to partition it across multiple nodes. Otherwise, you can keep a replicated session cache.
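The guidelines above can be summarized as a small decision rule. The following is only an illustrative sketch in Python, not part of NCache: the function name, its parameters, and the returned labels are hypothetical. Because the text treats 6 web servers as a soft boundary, the sketch arbitrarily counts a farm of exactly 6 as "small".

```python
def recommend_session_topology(web_servers, session_data_gb, cache_server_capacity_gb,
                               small_farm_limit=6):
    """Return a rule-of-thumb topology label (hypothetical helper, not an NCache API).

    small_farm_limit is a guideline, not a hard limit, so it is left adjustable.
    """
    if web_servers <= small_farm_limit:
        # Small farm: every web server is also a node of a replicated cache.
        return "replicated cache on the web servers"
    if session_data_gb > cache_server_capacity_gb:
        # Session data does not fit on one cache server, so it must be partitioned.
        return "separate tier: partitioned cache with replicas"
    # Larger farm whose session data still fits on a single cache server.
    return "separate tier: replicated cache"
```

For example, a farm of 12 web servers with 40 GB of session data and 16 GB cache servers would map to the partitioned option, while the same farm with 4 GB of session data would map to a replicated separate tier.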
So, how many servers should you use for caching sessions in a separate tier? The best way to answer this is to first determine how many web servers you have. If you have 10-20 web servers, you could probably get by with 3-5 caching servers in a replicated cache. For a partitioned cache, you would need as many servers as it takes to partition your session data so that each partition fits on one box, plus an equal number of additional servers to keep backup data. If you have enough servers, you can use them as backups of each other, or you can have dedicated backup servers. NCache calls these backup servers "replicas" since they are also active servers.
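As a rough worked example of the partitioned sizing rule, the sketch below counts the servers needed when dedicated backup servers are used. It is only an approximation under stated assumptions: the function name and parameters are hypothetical, and "usable capacity per server" is whatever amount of session data you decide one box can hold.

```python
import math


def partitioned_tier_size(session_data_gb, usable_gb_per_server):
    """Return (partitions, replicas, total_servers) for dedicated backups.

    Hypothetical helper: one replica server is assumed per partition,
    following the "equal number of additional servers" guideline.
    """
    if usable_gb_per_server <= 0:
        raise ValueError("usable_gb_per_server must be positive")
    # Enough partitions so that each slice of the session data fits on one box.
    partitions = max(1, math.ceil(session_data_gb / usable_gb_per_server))
    # An equal number of servers to hold the backup copies.
    replicas = partitions
    return partitions, replicas, partitions + replicas
```

With 20 GB of session data and 8 GB usable per server, this gives 3 partitions, 3 replicas, and 6 servers in total. If the servers back each other up instead of using dedicated backups, each box has to hold its own partition plus another server's backup, so the per-server capacity figure would need to be adjusted accordingly.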

NCache allows a remote client to connect to a specific caching server and then automatically connects it to another server if the first caching server goes down. This reconnection is totally transparent to the client, and NCache keeps track of it by itself. This ensures high availability for your application. The reason for this approach over load balancing is that NCache uses a keep-alive paradigm in which each client node keeps a connection open to a caching server instead of opening a new connection every time it needs to talk to the server. This gives faster performance.
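The keep-alive and failover behavior described above can be illustrated with a toy model. This is not the NCache client and does not use any NCache API: FakeCacheServer, FailoverClient, and ServerDown are hypothetical in-memory stand-ins, and the servers are assumed to hold identical (replicated) session data.

```python
class ServerDown(Exception):
    """Raised by the stand-in server when it is unreachable."""


class FakeCacheServer:
    """In-memory stand-in for one cache server (hypothetical, not NCache)."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.up = True

    def get(self, key):
        if not self.up:
            raise ServerDown(self.name)
        return self.data.get(key)


class FailoverClient:
    """Keeps one connection open and moves to the next server on failure."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.index = 0               # server the open connection points at
        self.connections_opened = 0  # counts connects, to show connection reuse
        self._connect()

    def _connect(self):
        # Try each server once, starting with the current one.
        for _ in range(len(self.servers)):
            if self.servers[self.index].up:
                self.connections_opened += 1
                return
            self.index = (self.index + 1) % len(self.servers)
        raise ServerDown("no cache server reachable")

    @property
    def connected_to(self):
        return self.servers[self.index].name

    def get(self, key):
        try:
            # Normal path: reuse the already open connection.
            return self.servers[self.index].get(key)
        except ServerDown:
            # Failover path: reconnect elsewhere and retry; the caller sees nothing.
            self.index = (self.index + 1) % len(self.servers)
            self._connect()
            return self.servers[self.index].get(key)
```

In this model, repeated reads reuse a single connection, and a read issued after the first server goes down still returns the session value, served from the second server, without the caller handling any error.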