
Archive for August, 2010

My Thoughts on Our Cisco UCS Sales Experience

August 31, 2010

This is a topic that, when I think about it, has me jumping around from subtopic to subtopic.  To make things easier on myself, I am going to write a bunch of disjointed paragraphs and tie them together in the end.

Disjoint #1

I’ve never worked on Cisco gear in the past.  Everywhere I worked where I had access to network/server equipment, Cisco was not a technology provider.  I don’t know why, other than I’ve heard Cisco had the priciest gear on the market.  I’ve also heard/read that while Cisco is #1 in the networking gear market, their products are not necessarily #1 in performance, capacity, etc.  Throw in the perception of the 800lb gorilla and you get a lot of negative commentary out there.

Disjoint #2

When I was 19, I started my career in the technology field as a bench tech for a local consumer electronics store.  The owner (Ralph) was a man wise beyond his years. He saw something in me and decided to take me under his wing, but because I was 19, I did not understand/appreciate the opportunity that he was bestowing upon me.

While I learned some of the various technical aspects of running a small business, I did not do so well on the human side of it.  I was a brash, cocky 19-year-old who thought he could take over the world.  However, there is one thing Ralph said that I remember very well: “If no one has any problems, how will they ever find out what wonderful customer service we have?”

It’s not that he wanted people to have problems with the equipment they purchased.  He knew that by selling thousands of answering machines, telephones, TVs, computers, etc., there would be some issues at some point, and he felt that he should do his best to make amends for it.

Ralph truly believed in customer service and would go out of his way to ensure that all customers left feeling like they had been taken care of extremely well.  If there were a poster child for exemplary customer service, it would be Ralph.

Disjoint #3

A number of vendors with broad product lines have somehow decided that the SMB market does not need robust, highly available (maybe even fault tolerant) equipment.  Somehow, company size and revenue have become equated with technical needs.  Perceptions of affordability have also played into this, meaning, if you can’t afford it, then you don’t need it.

Why do I bring this up?  Way back in one of my earlier posts, I mentioned that we had a major piece of equipment fail and received poor customer service from the vendor.  The vendor sales rep kept saying that we bought the wrong equipment.  We didn’t buy the wrong equipment; we bought what we could afford.  In hindsight, it wasn’t the equipment that failed us, but the company behind it.

.

Tying all this together…

When we first started looking at UCS, some folks here had trepidations about doing business with Cisco.  There were preconceived notions of pricing and support.  Cisco was also perceived to have a reputation of abandoning a market where they could not be number one in sales.

I must also admit that there are technical zealots in my organization who only believe in technical specifications.  These folks try to avoid products that don’t “read” the best on paper or have the best results in every performance test.

However, my team diligently worked to overcome these objections one by one and we couldn’t have done it without the exceptional Cisco sales team assigned to us.

In the early part of the sales process, we pretty much only dealt with the Product Sales Specialist (PSS) and her System Engineer (SE).  The rest of the account team entered the picture a month or so later.

These two (PSS and SE) had the patience of Job.   The sales team took copious amounts of time meeting with us to explain how UCS was different from the other blade systems out there and how it could fit into our environment and enable us to achieve our strategic goals.  All questions were answered thoroughly in a timely manner.  Not once did I ever get the feeling that they (Cisco) felt they were wasting their time.

When the infamous HP-sponsored Tolly report (and other competing vendor FUD) came out, Cisco sales took the time to allay our concerns.   As we read and talked about other competing products, not once did they engage in any negative marketing.  Cisco took the high road and stuck to it.

We had phone calls with multiple reference accounts.  We had phone calls with product managers.  We had phone calls with the Unified Computing business unit leaders.  We had phone calls with…you get the idea.  Cisco put in a great amount of effort to show us their commitment to being in the server business.

On top of all this, there was no overt pressure to close the sale.  Yes, the sales team asked if they could have the sale.  That’s what they are supposed to do.  But they didn’t act like car salesmen by offering a limited-duration, once-in-a-lifetime deal.  Instead, they offered a competitive price with no strings attached. (Disjoint #1)

Needless to say, we bought into UCS and have transitioned to the post-sales team.  This means we now interact more with our overall account rep and a generic SE rather than the PSS and her SE.  I call our new SE generic because he is not tied to a particular product but represents the entire Cisco product line.  He is quite knowledgeable and very helpful in teaching us the ways of navigating Cisco sales and support.

So has everything gone perfectly?  No.  We’ve had a few defective parts.  If you have read any of my other posts, you know that we have had some integration issues.  We’ve also found a few areas of the management system that could use a bit more polish.  So in light of all this, do I regret going with UCS?  Not at all.  I still think it is the best blade system out there and I truly think the UCS architecture is the right way to go.

But with defective parts, integration issues, etc…“Why do I still like Cisco?” you ask.  For starters, I don’t expect everything to be perfect.  That’s just life in the IT field.

Second, go re-read Disjoint #2.  Cisco must have hired Ralph at some point in time because their support has been phenomenal.  Not only do the pre- and post-sales teams check in to see how we are doing, but any time we run into an issue they ask what Cisco can do to help.  It’s not that they just ask to see if they can help; they actually follow through if we say “yes”.  They are treating us as if we are their most important customer.

Finally, to tie in Disjoint #3, any time we run into something where other vendors would say we purchased the wrong equipment, Cisco owns the issue and asks how they can improve what we already have purchased.  It’s not about “buy this” or “buy that”.  It’s “How can we make it right?”, “What can we do to improve the product/process/experience?”, and “What could we have done differently?”  These are all questions a quality organization asks itself and its customers.

I don’t know what else I can write about my Cisco sales experience other than to say that it has become my gold standard.  If other vendors read this post, they now know what standard they have to live up to.

To other UCS customers: What was your sales experience like?

.


Some Photos of the Cisco C-210 M2

August 25, 2010

Yesterday I posted some thoughts on the C-210.  Here are a few photos to help visualize what I was referring to.

This first photo is of the C-210 internals.  Notice the size of the fans and the wasted space between the fans and the motherboard.  Click on the photos for larger images.

Photo of Cisco C-210 M2 server insides

.

This next photo attempts to show the depth differences between an HP DL380 and a C-210.  It also shows how long the entire server/cable arm combo is.  It’s a bit hard to tell from the photo, but the HP cable arm is just a hair longer than the cable mgmt tray on the right.  The C-210’s cable arm sticks out past the cable tray by a few inches.

C-Series and HP cable arms in use.

This last photo is a closeup (sort of) of the C-210 cable arm.  Ignore the purple cables.

Larger image of C-210 cable arm in use

Categories: cisco

A Major Milestone Has Been Reached!!

August 24, 2010

We did it, and we did it early.  We completed the move of our existing VMware infrastructure onto the Cisco UCS platform.  At the same time, we also moved from ESX 3.5 to vSphere.  All-in-all, everything is pretty much working.  The only outstanding issue we haven’t resolved yet involves Microsoft NLB and our Exchange CAS/HUB/OWA servers.  NLB just doesn’t want to play nice and we don’t know if the issue is related more to vSphere, UCS, or something else entirely.

Next up: SQL Server clusters, P2Vs, and other bare metal workloads.

SQL Server migrations have already started and are going well.  We have a few more clusters to build and that should be that for SQL.

P2Vs present a small challenge.  A minor annoyance that we will have to live with is an issue with VMware Converter.  Specifically, we’ve run into a problem with resizing disks during the P2V process.  The process fails about 2% into the conversion with an “Unknown Error”.  It seems a number of people have also run into this problem and the workaround provided by VMware in KB1004588 (and others) is to P2V as-is and then run the guest through Converter again to resize the disks.  This is going to cause us some scheduling headaches, but we’ll get through it.   Without knowing the cause, I can’t narrow it down to being vSphere or UCS related.  All I can say is that it does not happen when I P2V to my ESX 3.5 hosts.  Alas, they are HP servers.

.

We’ve gone all-in with Cisco and purchased a number of the C-Series servers, recently deploying a few C-210 M2 servers to get our feet wet.  Interesting design choices to say the least.  I will say that they are not bad, but they are not great either.  My gold standard is the HP DL380 server line and, compared to the DL380, the C-210 needs a bit more work.  For starters, the default drive controller is SATA, not SAS.  I’m sorry, but I have a hard time feeling comfortable with SATA drives deployed in servers.  SAS drives typically come with a 3yr warranty; SATA drives typically have a 1yr warranty.  For some drive manufacturers, this stems from the fact that their SAS drives are designed for 24/7/365 use, but their SATA drives are not.

Hot plug fans?  Nope.  These guys are hard-wired, and big.  Overall length of the server is a bit of a stretch too, literally.  We use the extended width/depth HP server cabinets and these servers just fit.  I think the length issue stems from the size of the fans (they are big and deep) and some dead space in the case.  The cable arm also sticks out a bit more than I expected.  With a few design modifications, the C-210 M2 could shrink three, maybe four inches in length.

I’ll post some updates as we get more experience with the C-Series.

Our Current UCS/vSphere Migration Status

August 17, 2010

We’ve migrated most of our virtual servers over to UCS and vSphere.  I’d say we are about 85% done, with this phase being completed by Aug 29.  It’s not that it’s taking 10+ days to actually do the rest of the migrations.  It’s more of a scheduling issue.  From my perspective, I have three more downtimes to go.  Not much at all.

The whole process of migrating from ESX to vSphere and updating all the virtual servers has been interesting to say the least.  We haven’t encountered any major problems; just some small items related to the VMtools/VMhardware version (4 to 7) upgrades.   For example, our basic VMTools upgrade process is to right-click on a guest in the VIC and click on the appropriate items to perform an automatic upgrade.  When it works, the guest installs VMTools, reboots,  and comes back up without admin intervention.  For some reason, this would not work for our MS Terminal Servers unless we were logged into the target terminal server.

Here’s another example, this time involving Windows Server 2008: the automatic upgrade process wouldn’t work either.  Instead, we had to log in, launch VMTools from the System Tray, and select upgrade.  The only operating system that went perfectly was Windows Server 2003 with no fancy extras (terminal services, etc).  Luckily, that’s the o/s most of our virtual workloads are running.  I am going to hazard a guess and say that some of these oddities are related to our various security settings, GPOs, and the like.
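
For anyone who would rather script the tools upgrade than right-click through the VIC, the same operation is exposed by the vSphere API.  Below is a minimal sketch using the pyVmomi Python bindings; the vCenter hostname, credentials, and guest name are placeholders (this is an illustration rather than our actual process, and it assumes the guest is powered on with an older VMTools already installed).

    # Rough sketch only: trigger the same automatic VMTools upgrade the VIC
    # performs, via the vSphere API (pyVmomi bindings).  Hostname, credentials,
    # and the guest name are placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()   # lab use only: skips cert checks
    si = SmartConnect(host="vcenter.example.local", user="admin",
                      pwd="password", sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.VirtualMachine], True)
        # Find the guest by display name (placeholder name).
        vm = next(v for v in view.view if v.name == "my-terminal-server")

        # Kick off the automatic tools upgrade; the guest reboots on its own
        # when the installer finishes.
        task = vm.UpgradeTools_Task()
        print("Tools upgrade started, task:", task.info.key)
    finally:
        Disconnect(si)

Of course, in our case the Terminal Servers and Windows Server 2008 guests still needed a console login, so a script like this would only get us partway there.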

All-in-all, the vm migration has gone very smoothly.  I must say that I am happy with the quality of the VMware hypervisor, Virtual Center, and other basic components.  There has been plenty of opportunity for something to go extremely wrong, but so far, nada. (knock on wood)

So what’s next?  We are preparing to migrate our SQL servers onto bare metal blades.  In reality, we are building new servers from scratch and installing SQL Server.  The implementation of UCS has given us the opportunity to update our SQL servers to Windows Server 2008 and SQL Server 2008.  Other planned moves include some Oracle app servers (on RedHat) as well as domain controllers, file share clusters, and maybe some tape backup servers.  This should take us into September.

Once we finish with the blades, we’ll start deploying the Cisco C-series rackmount servers.  We still have a number of instances where we have to go rackmount.   Servers in this category typically need multiple NICs, telephony boards, or other specialized expansion boards.

.

Upgrade Follies

August 12, 2010

It’s amazing how many misconfigured, or perceived misconfigured, items can show up when doing maintenance and/or upgrades.  In the past three weeks, we have found at least four production items fitting this description that no one noticed because things appeared to be working.  Here’s a sampling:

During our migration from our legacy vm host hardware to UCS, we broke a website that was hardware load-balanced across two different servers.  Traffic should have been directed to Server A, then Server B, then Server C.  After the migration, traffic was only going to Server C, which just hosts a page that says the site is down.  It’s a “maintenance” server, meaning that whenever we take a public-facing page down, the traffic gets directed to Server C so that people can see a nice screen that says, “Sorry, down for maintenance…”

Everything looked right in the load balancer configuration.  While delving deeper, we noticed that Server A was configured to be the primary node for a few other websites.  An application analyst whose app was affected chimed in and said that the configuration was incorrect.  Website 1 traffic was to go first to Server A, then B.  Website 2 traffic was supposed to go in the opposite order.  All our application documentation agreed with the analyst.  Of course, he wrote the documentation so it better agree with him 🙂  Here is the disconnect: we track all our changes in a Change Management system and no one ever put the desired configuration change into the system.  As far as our network team is concerned, the load balancer is configured properly.  Now this isn’t really a folly since our production system/network matched what our change management and CMDB systems were telling us.  This is actually GOODNESS.  If we ever had to recover due to a disaster, we would reference our CMDB and change management systems, so they had better be in agreement.

Here’s another example: We did a mail server upgrade about six months ago and everything worked as far as we could tell.  What we didn’t know was that some things were not working, but no one noticed because mail was getting through.  When we did notice something not correct (a remote monitoring system) and fixed the cause, it led us to another item, and so on and so on.  Now, not everything was broken at the same time.  In a few cases, the fix of one item actually broke something else.  What’s funny is that if we hadn’t corrected the monitoring issue, everything would have still worked.  It was a fix that caused all the other problems.  In other words, one misconfiguration proved to be a correct configuration for other misconfigured items.  In this case, multiple wrongs did make a right.  Go figure.

My manager has a saying for this: “If you are going to miss, miss by enough”.

.

I’ve also noticed that I sometimes don’t understand concepts when I think I do.  As part of our migration to UCS, we are also upgrading from ESX3.5 to vSphere.   Since I am new to vSphere, I did pretty much what every SysAdmin does: click all the buttons/links.  One of those buttons is the “Advanced Runtime Info” link that is part of the VMware HA portion of the main Virtual Center screen.

This link brings up info on slot sizes and usage.  You would think the numbers would add up, but clearly they don’t.

How does 268 - 12 = 122?  I’m either obviously math challenged or I really need to go back and re-read the concept of Slots.
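
One plausible reading (and this is only my guess, not something the screen explains) is that “Available slots” also subtracts the slots HA holds back as failover capacity, not just the used ones.  If I assume two equally sized hosts and a tolerance of one host failure, the arithmetic works out:

    # Assumed model of the HA slot math, not the exact vCenter algorithm:
    #   available = total - used - reserved_for_failover
    total_slots = 268                # "Total slots in cluster"
    used_slots = 12                  # "Used slots"
    hosts = 2                        # assumption: two equally sized hosts
    host_failures_to_tolerate = 1    # assumption: HA tolerates one host failure

    slots_per_host = total_slots // hosts                               # 134
    reserved_for_failover = slots_per_host * host_failures_to_tolerate  # 134

    available_slots = total_slots - used_slots - reserved_for_failover
    print(available_slots)           # 122, which matches the screen

If that model is right, the screen is just quiet about the failover reservation.  I still plan to go back and re-read the slot documentation to confirm.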

.

Let the Migrations Begin!!

August 7, 2010

It’s been a few weeks since I last posted an update on our Cisco UCS implementation.  We’ve mostly been in a holding pattern until now.  Yes, we finally got the network integration component figured out.  Unfortunately, we had to dedicate some additional L2 switches to accommodate our desired end-goal.  If you look back a few posts, I covered the issues with connecting UCS to two disjointed L2 networks.  We followed the recommended workaround and it seems to be working.  It took us a bit to get here since my shop did not use VLANs, which turned out to be part of the workaround.

So now we have been in test mode for a bit over a week with no additional problems found.  Now it’s time for real workloads.  We migrated a few development systems over on Wednesday to test out our migration process.  Up until then, it was a paper exercise.  It worked, but required more time than we thought for VMtools and VM hardware version upgrades.  The real fun starts today when we migrate a few production workloads.  If all goes well, I’ll be very busy over the next 45 days as we move all our VMware and a number of bare metal installs to UCS.

Since we chose to migrate by moving one LUN at a time from the old hosts to the new hosts, and also upgrade to vSphere, our basic VM migration process goes like this:

  1. Power off guests that are to be migrated.  These guests should be on the same LUN.
  2. Present the LUN to the new VM hosts and do an HBA rescan on the new hosts.
  3. In Virtual Center, click on a guest to be migrated.  Click on the Migrate link and select Host.  The migration should take seconds (see the sketch after this list).
  4. Repeat for all other guests on this LUN.
  5. Unpresent the LUN from the old hosts.
  6. Power up guests
  7. Upgrade VM tools (now that we are on vSphere hosts) and reboot.
  8. Power the guests down.
  9. Upgrade VM hardware.
  10. Power up the guests and let them Plug-n-Play the new hardware and reboot when needed.
  11. Test
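
As mentioned in step 3, the same cold migration can be driven through the vSphere API instead of the VIC.  Here is a minimal sketch of that one step using the pyVmomi Python bindings; the vCenter name, credentials, guest name, and destination host are all placeholders, and it assumes the guest is already powered off and its LUN is presented to the new host (steps 1 and 2).

    # Sketch of step 3: cold-migrate a powered-off guest to a new host.
    # The datastore stays put because the same LUN is now visible to the
    # destination host.  All names and credentials below are placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim

    def find_by_name(content, vimtype, name):
        """Return the first inventory object of the given type with this name."""
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vimtype], True)
        return next(obj for obj in view.view if obj.name == name)

    ctx = ssl._create_unverified_context()   # lab use only: skips cert checks
    si = SmartConnect(host="vcenter.example.local", user="admin",
                      pwd="password", sslContext=ctx)
    try:
        content = si.RetrieveContent()
        vm = find_by_name(content, vim.VirtualMachine, "guest-to-migrate")
        new_host = find_by_name(content, vim.HostSystem, "ucs-esx01.example.local")

        # Host-only relocation; a resource pool would also be needed in the
        # spec if the new host lived in a different cluster.
        spec = vim.vm.RelocateSpec(host=new_host)
        WaitForTask(vm.RelocateVM_Task(spec))
        print("Moved", vm.name, "to", new_host.name)
    finally:
        Disconnect(si)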

We chose to do steps 6 through 10 using no more than four guests at a time.  It’s easier to keep track of things this way and the process seems to be working so far.

We are lucky to be on ESX 3.5.  If we had started out on ESX4, the LUN migration method would require extra steps due to the process of LUN removal from the old hosts.  To properly remove a LUN from ESX4, you will need to follow a number of convoluted steps as noted in this VMware KB.  With ESX 3.5, you can just unpresent and do an HBA rescan.
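
To illustrate that simpler path, here is a minimal pyVmomi sketch of the rescan half: once the LUN has been unpresented on the array, rescan the HBAs (and the VMFS volume list) on the old hosts.  The host naming convention and credentials are placeholders.

    # Sketch only: rescan HBAs and VMFS volumes on the old hosts after a LUN
    # has been unpresented on the array.  Names and credentials are placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()   # lab use only: skips cert checks
    si = SmartConnect(host="vcenter.example.local", user="admin",
                      pwd="password", sslContext=ctx)
    try:
        content = si.RetrieveContent()
        view = content.viewManager.CreateContainerView(
            content.rootFolder, [vim.HostSystem], True)
        for host in view.view:
            if host.name.startswith("oldesx"):      # placeholder naming scheme
                storage = host.configManager.storageSystem
                storage.RescanAllHba()               # pick up the LUN removal
                storage.RescanVmfs()                 # refresh the VMFS list
                print("Rescanned", host.name)
    finally:
        Disconnect(si)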

I don’t know the technical reason for all these extra steps to remove a LUN in vSphere, but it sure seems like a step backwards from a customer perspective.  Maybe VMware will change it in the next version.

Categories: UCS, VMware