Exaforge

Cloud, DevOps, Evangelism

Pushing Limits - Running View 5 on iomega PX6

Introduction

One of the tasks I undertook as part of VMworld 2011 prep was a demonstration of View 5 running on small, limited hardware.  We were asked to 'see what it can do'.  The ultimate goal was to run as many VMware View 5 guests as we could on a small set of hardware - something that could reasonably be acquired for a small remote office to use for virtual desktops.  We started with a budget of about $15,000 and wanted something that could fit in a small half rack you could stuff in a corner.

Hardware

For hardware, we used the following:

  • iomega PX6
    • 2x 256GB MLC SSD
    • 4x 1024GB SATA 7200 RPM drives
  • Cisco C200
    • 2x 6-core 2.93 GHz CPUs
    • 96 GB RAM
  • Small Cisco switch (3750)

If you were to buy this (or similar) equipment on the open market today, you could do it for right about $15,000 US if you shop around for a few deals (while still getting server-class hardware from Cisco, Dell, or similar).

Our initial goal was to hit 30 virtual machines.  We chose this goal to be in line with what we feel a real small office might do.  If you were building out a brand new branch office, you'd go off and purchase somewhere around 30 machines for your users at around $1,250 or so per machine, making your total outlay (excluding software licensing) somewhere around $38,000.  Our thought was that instead, you'd spend $15K on this solution, $15K on thin clients (laptop style or desktop style, as needed), and still have $8K left over for a decent backup solution or similar.
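For the numerically inclined, here's that back-of-the-envelope comparison as a trivial Python sketch (the per-seat and hardware figures are the rough estimates from above, not actual quotes):

    # Back-of-the-envelope budget comparison. Figures are the rough
    # estimates from the text above; adjust to your own quotes.
    TRADITIONAL_PC_COST = 1250   # per-desktop cost of a conventional PC
    USERS = 30                   # target branch-office headcount

    VDI_HARDWARE = 15_000        # px6 + C200 + switch
    THIN_CLIENTS = 15_000        # thin clients for 30 users

    traditional = TRADITIONAL_PC_COST * USERS
    vdi = VDI_HARDWARE + THIN_CLIENTS
    leftover = traditional - vdi

    print(f"Traditional PCs: ${traditional:,}")  # $37,500
    print(f"VDI approach:    ${vdi:,}")          # $30,000
    print(f"Left over:       ${leftover:,}")     # ~$8K for backup, spares, etc.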

So how did we set it up, and more importantly, validate our architecture & design?

Software

We started with a base of VMware vSphere 5 installed locally to the C200's disks.  This was totally unnecessary of course, as ESXi could easily run from USB, but we had trouble getting the C200 to boot from USB, so this was the path of least resistance.

On top of that we installed 2 Windows 2008 R2 virtual machines that had been joined to an existing Active Directory environment.  One of them had VMware vCenter 5 installed and the other had the VMware View 5 connection broker.  Lastly, we also installed View Composer on the vCenter server (it's required to be co-resident).
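As a quick sanity check that the management stack is wired together, a short pyVmomi script can connect to vCenter and list the registered VMs.  This is a modern-day sketch - the hostname and credentials below are placeholders, not our lab's:

    # Minimal pyVmomi sketch: connect to vCenter and list registered VMs.
    # 'vc01.lab.local' and the credentials are illustrative placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    # Lab only: skip cert verification; use real certs anywhere that matters.
    ctx = ssl._create_unverified_context()
    si = SmartConnect(host="vc01.lab.local", user="administrator",
                      pwd="password", sslContext=ctx)

    content = si.RetrieveContent()
    vms = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in vms.view:
        print(vm.name, vm.runtime.powerState)

    Disconnect(si)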

We then built a single 'golden image' virtual machine to serve as the template for the virtual desktops our users would access.  This was a Windows XP SP3 virtual machine with 1GB RAM, with Office 2010 and VMware Tools installed.

Array Configuration

The px6 array was very easy to configure over its web interface.  The disks were configured into 2 pools as follows:

  • ReplicasClones:  This pool was a RAID1 set of both SSDs in the system, and as its name suggests, it was designed to hold the high-IO aspects of the View environment (the replica and linked-clone disks).
  • UserData:  This pool was a RAID5 set of the 4 remaining 1TB SATA drives, giving a usable capacity of around 2.8TB after RAID and base2/base10 conversions (see the quick capacity check below).  Its intended purpose was to hold the users' persistent data disks.
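That ~2.8TB figure falls straight out of the arithmetic: RAID5 gives up one drive's worth of capacity to parity, and the base-10 'TB' that drive vendors market shrinks once expressed in base-2.  A quick check in Python:

    # Quick check of the UserData pool's usable capacity.
    DRIVES = 4
    DRIVE_TB = 1        # marketed size: 1 TB = 10**12 bytes
    RAID5_PARITY = 1    # RAID5 sacrifices one drive's worth of capacity

    raw_bytes = (DRIVES - RAID5_PARITY) * DRIVE_TB * 10**12
    usable_tib = raw_bytes / 2**40  # convert base-10 TB to base-2 TiB

    # ~2.73 TiB, i.e. roughly the 2.8TB above (filesystem overhead aside)
    print(f"{usable_tib:.2f} TiB usable")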

Host Configuration

ESXi was configured with defaults for the most part.  Using the host's 4-port NIC, we connected 2 of the ports on the array to 2 ports on the host with short patch cables.  This was done to maximize the bandwidth and paths between the host and the px6 array.  The replica datastore was mounted via NFS over vmnic2, while the desktop-ssd (linked clones) datastore was mounted via NFS over vmnic3.  As a result, these 2 high-IO datastores have the lowest-latency, contention-free paths to the storage.  For external access, all management traffic and virtual machine traffic is routed through vmnic0 to the outside world.  A sketch of scripting those NFS mounts follows.
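If you'd rather script those NFS mounts than click through the vSphere Client, something like this pyVmomi sketch would do it.  Note that the IPs, export paths, and datastore names are illustrative placeholders, and it assumes the connected service instance from the earlier snippet:

    # Hedged sketch: mount the two NFS datastores via pyVmomi.
    # Assumes 'si' is the connected service instance from the earlier
    # snippet; addresses and export paths below are illustrative only.
    from pyVmomi import vim

    content = si.RetrieveContent()
    hosts = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    host = hosts.view[0]  # single-host lab, so just grab the first host

    def mount_nfs(remote_host, remote_path, local_name):
        spec = vim.host.NasVolume.Specification(
            remoteHost=remote_host,  # px6 interface this datastore rides on
            remotePath=remote_path,  # NFS export on the array
            localPath=local_name,    # datastore name as seen by ESXi
            accessMode="readWrite",
        )
        host.configManager.datastoreSystem.CreateNasDatastore(spec)

    # One datastore per dedicated array port, mirroring the cabling above.
    mount_nfs("10.0.2.10", "/nfs/replicas", "replicas-ssd")
    mount_nfs("10.0.3.10", "/nfs/clones", "desktop-ssd")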

VMware View 5 Configuration

After getting View 5 installed (which is very easy, by the way - just Next, Next, Next, Finish), we began to configure a single pool using Linked Clones and persistent disks:

We let it process and deploy the desktops for a bit (about 20 minutes or so), and voila, we have 50 desktops ready to go!

Very cool stuff!

Performance Testing

As cool as that easy deployment is, we need to test more than just booting and running 50 idle desktops.  While a previous boss of mine always said that his "favorite network is one with no users", I don't think I could swing $30K for a 0-user system.

As a result, I needed to find a way to generate some real-world load on the environment.  There are a few reasonable choices out there, but the easiest (and cheapest) one I found was the LoginVSI tool from Login Consultants.  This tool uses a clever combination of AD group policies, shared folders, batch files and keystroke generators to simulate real-world user behavior in Excel, Word, Outlook and Internet Explorer.  Even better, for testing with a 'Medium' workload and up to 50 users, it's free!

The tool constantly monitors the response time of user actions as the number of active virtual machines increases, in order to determine the maximum number of sessions that can be running before users experience poor performance.  It ends up outputting a graph like this:

That blue line is the average response time, and you really don't want it above ~2500ms.
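Conceptually, the saturation point LoginVSI reports (its 'VSImax') is just the session count at which that average crosses the threshold.  A toy Python illustration with invented numbers:

    # Toy illustration of the VSImax idea: find the session count at which
    # the average response time crosses the ~2500ms threshold.  The samples
    # below are invented for illustration, not actual LoginVSI output.
    THRESHOLD_MS = 2500

    # (active_sessions, avg_response_ms) pairs, hypothetical
    samples = [(5, 600), (15, 750), (25, 900), (35, 1100), (45, 1150), (50, 1180)]

    vsimax = next((n for n, ms in samples if ms > THRESHOLD_MS), None)
    if vsimax is None:
        print(f"Never crossed {THRESHOLD_MS}ms - capacity exceeds "
              f"{samples[-1][0]} sessions")
    else:
        print(f"Response time degraded at {vsimax} sessions")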

As the tests run, it's fascinating to watch them go.  You get to see the system open RDP, log in, run the batch file, and loop through the testing procedure:

Results

So where did we end up?  First let me demonstrate the host's view of the performance:

Looking at the summary view on the top right, you can see we consumed 74GB of the memory on the system and over 22GHz of CPU, so we are definitely starting to push the limits of what this host can do.  Let's take a look at the core CPU & storage metrics:

Clearly the CPU is pretty healthy, peaking at around 60%.  Now the most interesting part - the storage performance:

This is really fantastic performance.  We see that the overall average response times (both read and write) for the disks are less than 2.8ms, well below the ~10-15ms that we'd call 'good' in the storage world.  Even better, the read latencies were less than half a millisecond.  Clearly, this array has the horsepower to handle this workload with aplomb.

So where did we get?  What does LoginVSI say our max is?  Well, here's the graph:

We see here that user response times are always sub-1.2 seconds, and average less than 1s overall.  What does this mean for capacity?  It means this system with this design can easily handle 50 users.  I bet it could handle at least 75, but I simply didn't have the testing resources to go any larger (more hosts, a non-free LoginVSI license).

If you'd like to see a ScreenFlow of this whole process with captions, you could check out this YouTube video:

[yframe url='http://www.youtube.com/watch?v=W7uZ2krVruM']

Design Considerations

I can hear people screaming already.  'This isn't viable in the real world - single controller, no backups, blah blah blah.'  Yes, I know.  I would not implement this in a business-critical environment.  This was simply intended as a demonstration of what View 5 can do with absolutely minimal resources.  Please, for the love of all things production, do not use this design for anything that matters to your business.

As usual, questions, comments, flames below :)