Archive for July, 2010

Week Two of Cisco UCS Implementation Completed

Progress has been made!!

The first few days of the week involved a number of calls back to TAC, the UCS business unit, and various other Cisco resources without much progress. Then on Thursday I pressed the magic button and all of a sudden our fabric interconnects came alive in Fabric Manager (the MDS control software). What did I do? I turned on SNMP. No one noticed that it was turned off (the default state). Pretty sad, actually, given the number of people involved in troubleshooting this.

(This paragraph is subject to change pending confirmation from Cisco.) Here's the basic gist of what was going on. We are running an older version of MDS firmware, and the version of Fabric Manager that ships with it is not really "UCS aware". It needs a way to communicate with the fabric interconnects in order to see all the WWNs, and the workaround is to use SNMP. I created an SNMP user in UCS and our storage admin created the same username/password in Fabric Manager. Of course, having the accounts created does nothing if the protocol they need to use is not active. Duh.
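If you want to rule out this kind of problem quickly, a simple check is to poll the fabric interconnect's SNMP agent from the outside before blaming Fabric Manager. Below is a rough Python sketch using the third-party pysnmp library (classic synchronous API); the management IP, username, and passphrases are placeholders, not values from our environment.

```python
# Quick check that the fabric interconnect's SNMP agent is actually answering
# and that the SNMPv3 user works. Requires the pysnmp package; the address
# and credentials below are placeholders.
from pysnmp.hlapi import (
    getCmd, SnmpEngine, UsmUserData, UdpTransportTarget,
    ContextData, ObjectType, ObjectIdentity,
    usmHMACSHAAuthProtocol, usmDESPrivProtocol,
)

FI_ADDRESS = "10.0.0.10"  # fabric interconnect management IP (placeholder)

error_indication, error_status, error_index, var_binds = next(
    getCmd(
        SnmpEngine(),
        # SNMPv3 user matching the one defined in UCS Manager (placeholder creds)
        UsmUserData("snmpadmin", "authpassphrase", "privpassphrase",
                    authProtocol=usmHMACSHAAuthProtocol,
                    privProtocol=usmDESPrivProtocol),
        UdpTransportTarget((FI_ADDRESS, 161), timeout=2, retries=1),
        ContextData(),
        # sysDescr.0 is enough to prove the agent is up and the user is valid
        ObjectType(ObjectIdentity("SNMPv2-MIB", "sysDescr", 0)),
    )
)

if error_indication:
    print(f"No SNMP response: {error_indication}")  # agent disabled or unreachable
elif error_status:
    print(f"SNMP error: {error_status.prettyPrint()}")
else:
    for name, value in var_binds:
        print(f"{name.prettyPrint()} = {value.prettyPrint()}")
```

A timeout here with known-good credentials is a strong hint that the service itself is still disabled, which is exactly the state we were stuck in.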

The screenshot below shows the button I am talking about.  The reason no one noticed that SNMP was turned off was because I was able to add traps and users without any warnings about SNMP not being active.  Also, take a look at the HTTP and HTTPS services listed above SNMP.  They are enabled by default.  Easy to miss.

[Screenshot: the SNMP enable setting in UCS Manager, with the HTTP and HTTPS services listed above it]

With storage now presented, we were able to complete some basic testing. I must say that UCS is pretty resilient if you have cabled all your equipment wisely. We pulled power plugs, fibre uplinks to Ethernet, fibre to storage, etc., and only a few times did we lose a ping (a singular PING!). All our data transfers kept transferring, pings kept pinging, and RDP sessions stayed RDP'ing.
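If you want to repeat this kind of pull-the-cable exercise and actually count the drops, a small ping logger does the job. The sketch below is just an illustration: the target address is a placeholder and the "-c"/"-W" flags are the Linux ping options (Windows uses -n and -w instead).

```python
# Rough sketch of a continuous ping logger to run while yanking cables during
# failover tests. It counts dropped replies; stop it with Ctrl+C for a summary.
import subprocess
import time
from datetime import datetime

TARGET = "10.0.0.50"  # a blade or VM behind the UCS fabric (placeholder)

sent = 0
lost = 0
try:
    while True:
        sent += 1
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "1", TARGET],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            lost += 1
            print(f"{datetime.now():%H:%M:%S} lost a ping ({lost}/{sent})")
        time.sleep(1)
except KeyboardInterrupt:
    print(f"Done: {lost} of {sent} pings lost")
```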

We did learn something interesting with regard to the Palo card and VMware. If you are using the basic Menlo card (a standard CNA), then failover works as expected. Palo is different. Suffice it to say that for every vNIC you think you need, add another one; in other words, you will need two vNICs per vSwitch. When creating vNICs, be sure to balance them across both fabrics and note down the MAC addresses. Then, when you are creating your vSwitches (or DVS) in VMware, assign two vNICs to each switch, using one from fabric A and one from fabric B. This is what provides the failover capability. I can't provide all the details because I don't know them, but one of the UCS developers explained to me that this is a difference in the UCS hardware (Menlo vs. Palo).
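Since it is easy to lose track of which MAC landed on which fabric once you are in vCenter, here is a tiny sanity-check sketch of the "one vNIC per fabric per vSwitch" rule. The vNIC names, MAC addresses, and vSwitch layout are made-up examples; substitute the values recorded from your own service profile.

```python
# Check that each planned vSwitch gets one vNIC pinned to fabric A and one
# pinned to fabric B before mapping vmnics in VMware. All values below are
# illustrative placeholders.
PLANNED_VNICS = {
    # vNIC name: (fabric, MAC address as shown in UCS Manager)
    "eth0": ("A", "00:25:B5:00:0A:01"),
    "eth1": ("B", "00:25:B5:00:0B:01"),
    "eth2": ("A", "00:25:B5:00:0A:02"),
    "eth3": ("B", "00:25:B5:00:0B:02"),
}

VSWITCH_LAYOUT = {
    # vSwitch (or DVS) name: the two vNICs intended as its uplinks
    "vSwitch0": ["eth0", "eth1"],
    "vSwitch1": ["eth2", "eth3"],
}

for vswitch, vnics in VSWITCH_LAYOUT.items():
    fabrics = sorted(PLANNED_VNICS[name][0] for name in vnics)
    macs = ", ".join(PLANNED_VNICS[name][1] for name in vnics)
    if fabrics == ["A", "B"]:
        print(f"{vswitch}: OK - one uplink per fabric (MACs {macs})")
    else:
        print(f"{vswitch}: WARNING - uplinks land on fabrics {fabrics}, "
              "so a single fabric failure could take out both vNICs")
```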

Next up: testing, testing, and more testing, with some VLANing thrown in to help us connect to two disjoint L2 networks.


Week One of Cisco UCS Implementation Complete

July 5, 2010

The first week of Cisco UCS implementation has passed.  I wish I could say we were 100% successful, but I can’t.  We’ve encountered two sticking points which are requiring some rethinking on our part.

The first problem we have run into revolves around our SAN. The firmware on our MDS switches is a bit out of date, and we've encountered a display bug in the graphical SAN management tool (Fabric Manager): it won't show our UCS components as "zoneable" addresses. This means that all SAN configuration relating to UCS has to be done via the command line. Why don't we just update our SAN switch firmware? That would also entail updating the firmware on our storage arrays, and it is not something we are prepared to do right now. It might end up happening sooner rather than later if doing everything via the command line proves too cumbersome.

The second problem involves connecting to two separate L2 networks. This has been discussed on various blogs such as BradHedlund.com and the Unified Computing Blog. Suffice it to say that we have proven that UCS was not designed to connect directly to two different L2 networks at the same time. While there is a forthcoming firmware update that will address this, it does not help us now. I should clarify that this is not a bug and that UCS is working as designed. My guess is that either Cisco's engineers did not think customers would want to connect to two L2 networks, or that it was simply a future roadmap feature. Either way, we are working on methods to get around the problem.

For those who didn't click the links to the other blogs, here's a short synopsis: UCS treats all uplink ports equally. It doesn't know about the different networks, so it assumes any VLAN can be reached out of any uplink port, and ARPs, broadcasts, and the rest of normal L2 behaviour follow from that. If you want a better description, please go read the links in the previous paragraph.

But the entire week was not wasted, and we managed to accomplish quite a bit. Once we get past the two hurdles mentioned above, we should be able to commence our testing. It's actually quite a bit of work to get this far. Here's how it pans out:

  1. Completed setup of policies
  2. Completed setup of Service Profile Templates
  3. Successfully deployed a number of different server types based on Service Profiles and Server Pool Policy Qualifications
  4. Configured our VM infrastructure to support Palo
  5. Configured UCS to support our VM infrastructure
  6. Successfully integrated UCS into our Windows Deployment system

Just getting past numbers 1 and 2 was a feat. There are so many policies you can set that it is very easy to go overboard and create or modify far too many. The more you create, the more you have to manage, and we are trying to follow the K.I.S.S. principle as much as possible. We started out with too many policies, but eventually came to our senses and whittled the number down.

One odd item to note: when you create a vNIC template, a corresponding port profile is created under the VM tab of UCS Manager. Deleting the vNIC template does not delete the corresponding port profile, so you will have to delete it manually. Consistency would be nice here.

And finally, now that we have a complete rack of UCS, I can show you just how "clean" the system looks.

Before

The cabling on a typical rack

After

A full rack of UCS - notice the clean cabling


Let’s hope week number two gets us into testing mode…..
