For the past few days, I’ve been troubleshooting a problem in UCS that I, admittedly, caused. The problem involves an error code/message that I received when trying to move a service profile from one blade to another. The error code is F0327.
According to the UCS error code reference guide, it translates as:
Service profile [name] configuration failed due to [configQualifier]
The named configuration qualifier is not available. This fault typically occurs because Cisco UCS Manager cannot successfully deploy the service profile due to a lack of resources that meet the named qualifier. For example, the following issues can cause this fault to occur:
- The service profile is configured for a server adapter with vHBAs, and the adapter on the server does not support vHBAs.
- The local disk configuration policy in the service profile specifies the No Local Storage mode, but the server contains local disks.
If you see this fault, take the following actions:
Step 1 Check the state of the server and ensure that it is in either the discovered or unassociated state.
Step 2 If the server is associated or undiscovered, do one of the following:
- Discover the server.
- Disassociate the server from the current service profile.
- Select another server to associate with the service profile.
Step 3 Review each policy in the service profile and verify that the selected server meets the requirements in the policy.
Step 4 If the server does not meet the requirements of the service profile, do one of the following:
- Modify the service profile to match the server.
- Select another server that does meet the requirements to associate with the service profile.
Step 5 If you can verify that the server meets the requirements of the service profile, execute the show tech-support command and contact Cisco Technical Support.
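As an aside, the server and association state that Step 1 asks about can also be checked from the UCS Manager CLI instead of the GUI. This is a rough sketch from memory, assuming standard UCSM CLI syntax; your prompt and org hierarchy may differ:

```
UCS-A# show server status
UCS-A# scope org /
UCS-A /org # show service-profile status
```

The status output shows whether each profile is associated, unassociated, or stuck in a config-failure state, which is what matters for the F0327 scenario below.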
While the guide helpfully gave me lots of things to try, none of them fixed the problem. It took me a while, but I figured out how to reproduce the error, a possible cause, and a workaround.
Here’s how to reproduce the error:
- Create a service profile without assigning any HBAs. Shut down the server when the association process has completed.
- After the profile is associated, assign an HBA or two.
- You should receive this dialog box:
You will then see this in the general tab of the service profile in question:
Now here is where the error can be induced:
- Don’t power on. Keep in mind that the previous dialog box said that changes wouldn’t be applied until the blade was rebooted (powered on).
- Now disassociate the profile and associate it with another blade. The “error” is carried over to the new blade and the config process (association process) does not run.
Powering up the newly associated blade does not correct the issue. What has happened is that the disassociation/association process that should occur above does not take place because the service profile is stuck in an error state. There are two ways to avoid or fix this:
- Reboot after adding the HBA. This completes the re-configuration, allowing the disassociation/association processes to run normally. This is also the proper procedure. Or:
- Go to the Storage tab of the affected service profile and click on “Change World Wide Node Name”. This forces the re-configuration to take place.
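For reference, the profile move that triggers the problem (the disassociate/associate in the steps above) looks roughly like this from the UCS Manager CLI. This is a hedged sketch assuming standard UCSM CLI syntax; the profile name ESX-Profile and the chassis/slot 1/4 are made-up placeholders:

```
UCS-A# scope org /
UCS-A /org # scope service-profile ESX-Profile
UCS-A /org/service-profile # disassociate
UCS-A /org/service-profile # commit-buffer
UCS-A /org/service-profile # associate server 1/4
UCS-A /org/service-profile # commit-buffer
```

Note that if the profile is already in the F0327 error state, the associate step simply carries the error over to the new blade, which is exactly the behavior described above; the reboot or the “Change World Wide Node Name” button is still needed first.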
I’ve opened a ticket with TAC asking for a few documentation updates. The first is to document the correct method for applying HBAs and to note that this error message appears if that method isn’t followed.
The second is to add a sixth option to the error code guide: press the “Change World Wide Node Name” button.
I am going to go out on a limb and say that they probably didn’t count on people like me doing things that they shouldn’t be doing or in an improper manner when they wrote the manuals. :)
We did it, and we did it early. We completed the move of our existing VMware infrastructure onto the Cisco UCS platform. At the same time, we also moved from ESX 3.5 to vSphere. All in all, everything is pretty much working. The only outstanding issue we haven’t resolved yet involves Microsoft NLB and our Exchange CAS/HUB/OWA servers. NLB just doesn’t want to play nice, and we don’t know whether the issue lies with vSphere, UCS, or something else entirely.
Next up: SQL Server clusters, P2Vs, and other bare metal workloads.
SQL Server migrations have already started and are going well. We have a few more clusters to build and that should be that for SQL.
P2Vs present a small challenge. A minor annoyance we will have to live with is an issue with VMware Converter: resizing disks during the P2V process fails about 2% into the conversion with an “Unknown Error”. A number of people have run into this problem, and the workaround provided by VMware in KB1004588 (and others) is to P2V as-is and then run the guest through Converter again to resize the disks. This will cause us some scheduling headaches, but we’ll get through it. Without knowing the cause, I can’t narrow it down to being vSphere- or UCS-related. All I can say is that it does not happen when I P2V to my ESX 3.5 hosts. Alas, they are HP servers.
We’ve gone all-in with Cisco and purchased a number of the C-Series servers, recently deploying a few C-210 M2 servers to get our feet wet. Interesting design choices, to say the least. I will say that they are not bad, but they are not great either. My gold standard is the HP DL380 server line, and compared to the DL380, the C-210 needs a bit more work. For starters, the default drive controller is SATA, not SAS. I’m sorry, but I have a hard time feeling comfortable with SATA drives deployed in servers. SAS drives typically come with a 3-year warranty; SATA drives typically have a 1-year warranty. For some drive manufacturers, this stems from the fact that their SAS drives are designed for 24/7/365 use, but their SATA drives are not.
Hot-plug fans? Nope. These guys are hard-wired, and big. The overall length of the server is a bit of a stretch too, literally. We use the extended width/depth HP server cabinets, and these servers just fit. I think the length issue stems from the size of the fans (they are big and deep) and some dead space in the case. The cable arm also sticks out a bit more than I expected. With a few design modifications, the C-210 M2 could shrink three, maybe four inches in length.
I’ll post some updates as we get more experience with the C-Series.