vSphere Client Installation fails with “Failed to install hcmon”

February 3rd, 2018

This was a rather annoying error. It seems to stem from having to support multiple versions of vSphere and thus having many versions of the client installed alongside VMware Remote Console (VMRC).

There are a couple of articles from VMware on how to try and resolve this:

  • kb2130850 – Uninstall all vSphere clients and start over
  • kb2006486 – Uninstall hcmon and rename hcmon.sys
  • kb2053281 – Install .NET 3.5.1

None of these helped me, however. It was worth digging around in vminst.log, which you can find in: C:\Users\ACCOUNTNAME\AppData\Local\Temp

What was interesting in the log was the following snippet:

2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| Begin Logging
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| -- VMUninstallHcmon(): Uninstalling hcmon service
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| Util_GetKeyValueString(HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\VMware USB\InstallPath) = "C:\Program Files (x86)\Common Files\VMware\USB\"
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| Uninstalling HCMON in "C:\Program Files (x86)\Common Files\VMware\USB\"
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| -- CallVNLUninstallDriver()
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| Attempting to call function VNL_UninstallHcmon in C:\Program Files (x86)\Common Files\VMware\USB\vnetlib.dll
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| Loading vnetlib from "C:\Program Files (x86)\Common Files\VMware\USB\vnetlib.dll" [508a0000]
2018-01-17T07:14:33.342+02:00| inst-build-6966790| I1: VNLUninstallLegacyInf: driverId:hcmon cmd:uninstall hcmoninf args:5;Win7
2018-01-17T07:14:33.342+02:00| inst-build-6966790| I2: GetVnetParameter: vmnet: " " 'InstallPath'
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E2: GetVnetParameter: vnetlib path doesn't exist so can't open
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E2: GetSpecificProductInstallPath: could not find an InstallPath key for product
2018-01-17T07:14:33.342+02:00| inst-build-6966790| I2: VNLWorkstationInstalled: didn't find install path for VMware Workstation
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E2: GetSpecificProductInstallPath: could not find product registry key
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E2: GetSpecificProductInstallPath: could not find an InstallPath key for product
2018-01-17T07:14:33.342+02:00| inst-build-6966790| I2: VNLPlayerInstalled: didn't find install path for player
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E1: VNL_GetProductInstallPath: could not find a InstallPath key anywhere
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E1: VNLSpawn64BitVnetlibTask: failed to get generic product install path
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E1: VNLUninstallLegacyInf: Failed to handle 64-bit properly
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E2: VNLDeleteSystemFile: failed to delete 0x00000002 'C:\WINDOWS\system32\drivers\hcmon.sys'
2018-01-17T07:14:33.342+02:00| inst-build-6966790| E2: VNLDeleteSystemFile: failed #2 to delete 0x00000002 'C:\WINDOWS\system32\drivers\hcmon.sys'
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| ERROR: Failed calling VNL_UninstallHcmon
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| Freeing library: 1351221248
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| ERROR: Failed to uninstall hcmon
2018-01-17 07:14:33| USBDeviceInstUtil-build-5395284| End Logging

and

2018-01-17 07:14:56| USBDeviceInstUtil-build-5395284| Begin Logging
2018-01-17 07:14:56| USBDeviceInstUtil-build-5395284| -- VMInstallHcmon(): Installing hcmon service
2018-01-17 07:14:56| USBDeviceInstUtil-build-5395284| Getting Property CustomActionData = C:\Program Files (x86)\Common Files\VMware\USB\;5
2018-01-17 07:14:56| USBDeviceInstUtil-build-5395284| Installing HCMON in "C:\Program Files (x86)\Common Files\VMware\USB\"
2018-01-17 07:14:56| USBDeviceInstUtil-build-5395284| Attempting to call function VNL_InstallHcmon in "C:\Program Files (x86)\Common Files\VMware\USB\"
2018-01-17 07:14:57| USBDeviceInstUtil-build-5395284| Loading vnetlib from "C:\Program Files (x86)\Common Files\VMware\USB\vnetlib.dll" [508a0000]
2018-01-17T07:14:57.010+02:00| inst-build-6966790| I1: VNLInstallLegacyInf: driverId:hcmon cmd:install hcmoninf args:5;Win7
2018-01-17T07:14:57.010+02:00| inst-build-6966790| I2: GetVnetParameter: vmnet: " " 'InstallPath'
2018-01-17T07:14:57.010+02:00| inst-build-6966790| E2: GetVnetParameter: vnetlib path doesn't exist so can't open
2018-01-17T07:14:57.010+02:00| inst-build-6966790| E2: GetSpecificProductInstallPath: could not find an InstallPath key for product
2018-01-17T07:14:57.010+02:00| inst-build-6966790| I2: VNLWorkstationInstalled: didn't find install path for VMware Workstation
2018-01-17T07:14:57.010+02:00| inst-build-6966790| E2: GetSpecificProductInstallPath: could not find product registry key
2018-01-17T07:14:57.010+02:00| inst-build-6966790| E2: GetSpecificProductInstallPath: could not find an InstallPath key for product
2018-01-17T07:14:57.010+02:00| inst-build-6966790| I2: VNLPlayerInstalled: didn't find install path for player
2018-01-17T07:14:57.010+02:00| inst-build-6966790| E1: VNL_GetProductInstallPath: could not find a InstallPath key anywhere
2018-01-17T07:14:57.010+02:00| inst-build-6966790| E1: VNLSpawn64BitVnetlibTask: failed to get generic product install path
2018-01-17T07:14:57.010+02:00| inst-build-6966790| E1: VNLInstallLegacyInf: Failed to handle 64-bit properly
2018-01-17 07:14:57| USBDeviceInstUtil-build-5395284| ERROR: Failed calling VNL_InstallHcmon()
2018-01-17 07:14:57| USBDeviceInstUtil-build-5395284| Freeing library: 1351221248
2018-01-17 07:14:57| USBDeviceInstUtil-build-5395284| ERROR: Failed to install hcmon
2018-01-17 07:21:56| USBDeviceInstUtil-build-5395284| End Logging

I had already tried the KB article hack of manually uninstalling hcmon and renaming the .sys file, without success, but that attempt together with the log entries above led me to dig around in the Common Files folder.

So I took a chance and renamed "C:\Program Files (x86)\Common Files\VMware\USB" to "C:\Program Files (x86)\Common Files\VMware\USB.bad", re-ran the vSphere client install, and it worked!
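For reference, the rename can be done from an elevated command prompt (close any running installers first); this is simply the fix above expressed as a command:

ren "C:\Program Files (x86)\Common Files\VMware\USB" USB.bad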

Raspbian WiFi with RADIUS and TLS 1.2

January 10th, 2018

After our recent enforcement of TLS 1.2 for our WiFi setup, our Raspberry Pi 3 dropped off the network. I spent a fair amount of time running wpa_supplicant in debug mode along with Wireshark on the RADIUS server, and it became clear that the Pi was insisting on TLS 1.0, which was the cause of the failure.
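If you want to do the same kind of digging, running wpa_supplicant by hand in debug mode shows the EAP/TLS negotiation as it happens. A minimal sketch, assuming the standard Raspbian config path and that your wireless interface is wlan0 (stop any already-running instance first):

sudo wpa_supplicant -i wlan0 -c /etc/wpa_supplicant/wpa_supplicant.conf -d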

Once you get into the wonderful world of wpa_supplicant.conf variables and examples, you’ll find there are a number of ways to do things and everyone’s mileage seems to vary. In my case I knew what I wanted: the Pi needed to be told to use TLS 1.2. I discovered some phase2 settings that allow you to disable the various TLS versions, so my thinking was to simply disable 1.0 and 1.1. I ended up with the following additional line in wpa_supplicant.conf:

phase2="auth=MSCHAPV2 tls_disable_tlsv1_0=1 tls_disable_tlsv1_1=1"

This unfortunately did not solve my problem. Even though I could see from the debug output that the flags were being set, the Pi still negotiated TLS 1.0.

Some more digging around on google suggested that TLS 1.2 became the default in wpa_supplicant 2.4. As I was still running Wheezy, my version was 2.3. After an upgrade to Stretch I was able to install wpa_supplicant 2.4, and the Pi could once again connect to WiFi. It also turns out you don’t really need all the phase1 and phase2 settings (with Windows RADIUS at least). So for reference, my working wpa_supplicant.conf looks as follows:

ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1

network={
    ssid="MyAccessPointName"
    proto=RSN
    key_mgmt=WPA-EAP
    pairwise=CCMP
    auth_alg=OPEN
    eap=PEAP
    identity="domain\username"
    password="yoursupersecurepassword"
}
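If you’re unsure which wpa_supplicant version you’re on, it will tell you directly; anything older than 2.4 will, as above, default to TLS 1.0 for EAP:

wpa_supplicant -v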


Windows 2012 R2 RADIUS Authentication TLS Troubleshooting

December 23rd, 2017

I’ve steadily been working on improving the security of our internal systems. One of the recommendations is to disable SSL 3.0 and TLS 1.0 on your Windows servers, along with weak ciphers. As with all things IT, there are always unexpected repercussions when making changes 🙂 Shortly after pushing out a GPO update to disable SSL 3.0 and TLS 1.0, users could no longer connect to the WiFi network using RADIUS authentication. The error was: “Can’t connect because the sign-in requirements for your device and the network aren’t compatible. Contact your IT support person”. So I reverted the changes so I could dig into it a bit more.

We use a Fortigate with FortiAPs, and I assumed (incorrectly) that the RADIUS setup on the Fortigate was to blame. I thought the Fortigate somehow brokered the RADIUS authentication, but it only acts as a “facilitator”. Fortinet support were nice enough to point out that my problem was more likely between my RADIUS server and the supplicant, and sent me the following summary of the authentication steps:

Yes, indeed Fortigate plays a part in the WPA2 AES authentication with EAP-PEAP MSchapv2, to summarize.

There are three parts to this authentication.
One: Supplicant – which can be windows/linux/mac/ios/android device that connect to wifi.
Two: Authenticator (Radius client) – which converts the EAPOL packets to RADIUS packets (RADIUS Request/Radius Challenge/Radius Reject etc) and send to Radius server.
-which can be the networking device such as FortiAP + Fortigate OR Cisco AP + Cisco Controller etc.
Three: Authentication server (radius server) – which processes the Radius requests sent by the Authenticator on behalf of client.

Once the outer tunnel is formed, the SSL handshake happens directly between the supplicant and the radius server. (in your case/as per the configuration)
So, in the case of SSL handshake, it depends on the Supplicant SSL version/ cipher suite capability and the radius server SSL Version/cipher suite capability.

With some testing I figured out that disabling TLS 1.0 was specifically what broke RADIUS auth. Some reading suggested that Windows 10 clients prefer TLS 1.2, so given where I was making changes, my issue was seemingly on Windows 2012 R2. Yet the technical articles suggested it was supposed to support TLS 1.2, so why was it failing?

After a bit more googling I found a fantastic article by Jim Vajda over at Frame by Frame, which demonstrates how to use Wireshark to verify cipher suites. At first I tried this on the client side, but the capture didn’t yield any results for RADIUS. I then ran the capture on the RADIUS server and discovered the problem: in the Client Hello I could see the Windows clients offering TLS 1.2 as preferred, yet my RADIUS server was responding with TLS 1.0 as preferred, and the two would settle on what the server offered. When I disabled TLS 1.0 completely I would then get an Access-Reject error.

Back on google, a more focused search revealed that you can set the preferred TLS version for EAP in the registry. As a final hurdle, I got stymied by a copy-and-paste “feature”: if you create the registry entry and paste “0xC00” into the field, it will happily accept it, but if you type it in manually you quickly discover that the “x” is invalid and all you really need is “C00”. With EAP now forced to 1.2, TLS 1.0 disabled and services restarted, the server finally responded with TLS 1.2 as preferred, and my Windows 10 clients could once again connect.
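For reference, here is that registry change expressed as a reg command, which sidesteps the regedit copy-and-paste trap entirely. The key path below is the one I believe Microsoft documents for forcing the EAP TLS version; verify it against the current KB article for your OS before applying:

reg add "HKLM\SYSTEM\CurrentControlSet\Services\RasMan\PPP\EAP\13" /v TlsVersion /t REG_DWORD /d 0xC00

Unlike the regedit dialog, reg add handles the 0x prefix correctly. Restart the relevant services (or reboot) afterwards for the change to take effect.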

Here are some samples from Wireshark, captured on the Windows RADIUS server. You can simply use radius as a filter in Wireshark to find the correct packets.
Client Hello with TLS 1.2

Server Hello with TLS 1.0

Rejection Error (When disabling TLS 1.0 completely. Which results in the error seen on the Windows 10 clients)

Server Hello with TLS 1.2 (after applying EAP registry change)

Anyway, I hope you enjoyed this fun bit of troubleshooting and that it’s of some use to you.

SUSE SLES 11 for SAP HANA install fails with rebootException

June 10th, 2017

This document applies to the custom SLES image built for easier SAP HANA deployments, located here: www.suse.com/slesb1hana. At the time of writing the link pointed to sles11_sp4_b1.x86_64-0.0.10.preload.iso. My first problem with using this image is that there is no checksum to validate your download. Here is my md5sum: 8b5e4a223b85a7b144b55b86676938e3
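To compare your own download against it, assuming the same filename:

md5sum sles11_sp4_b1.x86_64-0.0.10.preload.iso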

I had to do a fresh load on a Dell R620 using iDRAC. As you will soon discover with SAP HANA, the hardware requirements are pretty strict and often don’t make sense (well, to me at least). This particular installation had 8 x 300GB SAS drives, which the SAP documentation allocates as 6 x 300GB in RAID5, with the remaining two disks as individual RAID0 volumes that are then combined into a software RAID0 for logs (why not just use the hardware RAID you spent $$$ on and mirror them?). Anyway, the tip for creating multiple RAID volumes on this platform is to use the Ctrl+R option at boot rather than the wizard in the Lifecycle Controller.

So I used iDRAC to mount the ISO as a virtual CD/DVD-ROM and set the server to boot from there. The installation is pretty simple: you boot, select a disk, it writes out the image file and asks a few post-install questions. In my case, however, I was presented with the target disks, but regardless of which key I pressed the installation would crash to the console:

The keywords here are:

  • System Installation cancelled
  • rebootException: reboot in 120sec…

If you go digging in part 2 of “How to install SUSE Linux Enterprise Server for SAP Business One products on SAP HANA” you will find a small section under troubleshooting that mentions a similar error. The “solution” isn’t really a solution; it just prevents the auto reboot so you can troubleshoot.

I fed back to the SAP consultant, who was patiently waiting for the OS load, that I was having this issue, and we initially thought the ISO might be corrupt (hey, wouldn’t an md5sum be handy right about now?). We each downloaded a second copy and all three matched, so that wasn’t it. He then thought about it some more and mentioned that someone had once powered the server off before getting it to work. So I used the cold boot function in iDRAC, and that didn’t work either. I dug around a bit more for clues with this power thing in the back of my head, and without really thinking it would make any difference whatsoever, I used iDRAC to power the server down and then power it up. Would you believe it, this time round I could select a disk and the installation continued successfully!
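If you’d rather script that power cycle than click through the iDRAC web UI, remote racadm can do the same thing. A sketch, with a placeholder iDRAC address and credentials:

racadm -r 192.168.0.120 -u root -p yourpassword serveraction powerdown
racadm -r 192.168.0.120 -u root -p yourpassword serveraction powerup

Note that in my case the cold boot was not enough; it had to be a discrete power down followed by a power up.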

HP ML110 G6 CPU & RAM Upgrades

January 22nd, 2016

I recently upgraded some internals in my HP ML110 G6 and thought I’d post some useful details on RAM and CPU combinations. The whole thing started with a refresh of my ZFS NAS: I wanted to move from 4x2TB RAIDZ1 to 6x4TB RAIDZ2. The first thing needed was additional RAM. The recommendation seems to be a minimum of 8GB for ZFS, with 1GB of RAM per 1TB of usable space suggested. I did have 1 x 2GB and 1 x 4GB HP ECC modules spare, and found a nice site suggesting that 4GB and 8GB Kingston ECC modules would also work and that you can install up to 32GB (HP’s specs say max 16GB).

My supplier could no longer get 4GB modules, so I ordered 1 x KVR1333D3E9S/8G. Unfortunately, when I installed it my system would only see 2GB and then freeze. I checked my CPU specs (Pentium G6950): it supports a max of 16GB, but it seems only as 4 x 4GB modules. I then managed to locate a Xeon X3430, which could successfully see all 8GB of RAM. I then added the two spare HP modules as well, for a funny total of 14GB. As this is a home NAS I’m not too worried about perfectly balanced RAM.

Some comments I saw said you needed an add-on graphics card for the Xeon processor, but this is not the case. I simply removed the Pentium and added the Xeon (with some new thermal grease, of course) and it all worked. The PSU also seems to handle the 6 x 4TB Seagate NAS drives without any issues. NAS4Free runs off a USB stick plugged directly into the motherboard.

Here is a download link for the most recent BIOS 2011.08.26 (SP54622). Requires a Dropbox account.

Edit 2017/04/25: BIOS link updated.

OpenStack: Unable to access the floating IP

September 15th, 2015

As part of my foray into OpenStack I had allocated floating IPs, but had never actually tested that I could access services on them until recently. I spent quite a bit of time delving into the router config, looking at iptables rules and tracing packets with tcpdump, all in vain. Before you get in deep and dirty, first check your default security group rules. That turned out to be my problem, and it was really easy to fix. I was using OpenStack Kilo on CentOS 7 with Neutron networking; selinux and iptables were enabled.

The default security group does not allow ingress traffic to pass. You can change that in the dashboard: Compute > Access & Security > Security Groups > select default > Manage Rules. Here you can add ICMP and other inbound rules like SSH and HTTP.

This CLI example allows ICMP and SSH:

neutron security-group-rule-create --protocol icmp --direction ingress --remote-ip-prefix 0.0.0.0/0 default

neutron security-group-rule-create --protocol tcp --port-range-min 22 --port-range-max 22 --direction ingress --remote-ip-prefix 0.0.0.0/0 default

If this isn’t your problem then you can start checking your router config and iptables. Two really good guides I used were:

OpenStack: Fix “Missing” External IPs in Neutron and The Quantum L3 router and floating IPs (references Quantum, but still applies to Neutron)

This also provided a nice overview of floating IPs, but uses Nova Networking: Configuring Floating IP addresses for Networking in OpenStack Public and Private Clouds


CentOS on VMware: vmxnet3 Failed to activate dev: error 1

August 26th, 2015

I’ve been experimenting with vSphere 6 and vRealize Automation in my lab environment and hit an interesting problem when deploying CentOS 6 & 7 VMs. I had created a network in NSX for the tenant, which created two distributed port groups on my distributed switch: vxw-dsv-XX-virtualwire-1-sid-XXXX-name and vxw-vmknicPg-dvs-XX. For some reason I could only see the vxw-vmknicPg-dvs group in vRealize, so I made a poor assumption and assigned it to the tenant.

I deployed the VMs, and when networking started in CentOS I got the following error:

vmxnet3 0000:03:00.0 ens160: intr type 3, mode 0, 2 vectors allocated

vmxnet3 0000:03:00.0 ens160: Failed to activate dev: error 1

I could reproduce this error by running:

rmmod vmxnet3

modprobe vmxnet3
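The failure also shows up in the kernel log, which is a quick way to confirm you are hitting the same thing:

dmesg | grep vmxnet3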

I tried upgrading VMware Tools and switched to open-vm-tools, but this made no difference. Eventually I manually changed the attached port group in vCenter, the error went away and the driver loaded correctly. I then went back into the tenant reservations, where the vxw-dsv-XX-virtualwire port group now appeared, and I could assign it.

glance: Invalid OpenStack Identity credentials

May 20th, 2015

I’ve recently been experimenting with OpenStack Juno on CentOS 7 and hit an annoying problem where the glance image-create and image-list commands would both fail with “Invalid OpenStack Identity credentials”. All my other services were fine; keystone was happy and returned all the correct information. I worked through countless posts online describing the same problem; most were caused by issues with keystone, database setup or the auth_uri and identity_uri formatting. I checked my config files over and over, and they were all correct.

I then turned up the verbosity and debug settings and got the following in the logs:

DEBUG keystoneclient.session [-] REQ: curl -i -X GET http://controller:35357/ -H "Accept: application/json" -H "User-Agent: python-keystoneclient" _http_log_request /usr/lib/python2.7/site-packages/keystoneclient/session.py:155
INFO urllib3.connectionpool [-] Starting new HTTP connection (2): controller
WARNING keystonemiddleware.auth_token [-] Retrying on HTTP connection exception: Unable to establish connection to http://controller:35357/

So I ran the curl command from the CLI and got:

HTTP/1.1 300 Multiple Choices
Vary: X-Auth-Token
Content-Type: application/json
Content-Length: 757
Date: Wed, 20 May 2015 10:34:15 GMT

{"versions": {"values": [{"status": "stable", "updated": "2013-03-06T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}, {"base": "application/xml", "type": "application/vnd.openstack.identity-v3+xml"}], "id": "v3.0", "links": [{"href": "http://controller:35357/v3/", "rel": "self"}]}, {"status": "stable", "updated": "2014-04-17T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v2.0+json"}, {"base": "application/xml", "type": "application/vnd.openstack.identity-v2.0+xml"}], "id": "v2.0", "links": [{"href": "http://controller:35357/v2.0/", "rel": "self"}, {"href": "http://docs.openstack.org/", "type": "text/html", "rel": "describedby"}]}]}}

Trying the URL in a browser also worked, so to me it looked like the service was running correctly.

So I thought about it and double-checked that the firewall was disabled (it was). I then disabled selinux completely, despite knowing that OpenStack is supposed to work in harmony with it. After a reboot many OpenStack services didn’t start, so I tried permissive rather than enforcing, and glance image-list started working! I then checked the manual again and found I had missed a crucial step, which was:

yum install openstack-selinux

The strange thing is that not installing it only seems to affect glance and none of the other services.
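If you suspect the same issue, it’s quick to confirm whether SELinux is the culprit before changing anything else; a sketch:

getenforce
setenforce 0

The first command prints Enforcing, Permissive or Disabled; the second switches to permissive until the next reboot. To make the mode permanent, edit the SELINUX= line in /etc/selinux/config. With openstack-selinux installed you should be able to return to enforcing.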

Hope this helps someone else in the future 🙂

OSX Remote Desktop Client Fails With ‘404 not found’

July 18th, 2014

I recently had a strange issue with the OSX version of the Remote Desktop client connecting to an RDWeb server over the internet. The client kept reporting “The gateway failed to connect with the message 404 not found”. Digging into the logs under the About box, I found the following:

[2014-Jul-18 11:23:53] RDP (0): —– BEGIN ACTIVE CONNECTION —–
[2014-Jul-18 11:23:53] RDP (0): client version: 8.0.24875
[2014-Jul-18 11:23:53] RDP (0): Protocol state changed to: ProtocolConnectingNetwork(1)
[2014-Jul-18 11:23:53] RDP (0): correlation id: ec36d635-6a34-ba40-9e8a-76ac60330000
[2014-Jul-18 11:23:53] RDP (0): Resolved 'www.contoso.com' to 'x.x.x.x' using NameResolveMethod_DNS(1)
[2014-Jul-18 11:23:53] RDP (0): Resolved 'www.contoso.com' to 'x.x.x.x' using NameResolveMethod_DNS(1)
[2014-Jul-18 11:23:54] RDP (0): HTTP RPC_IN_DATA connection redirected from https://www.contoso.com:443/rpc/rpcproxy.dll?localhost:3388 to https://www.contoso.com/RDWeb/rpc/rpcproxy.dll
[2014-Jul-18 11:23:54] RDP (0): HTTP RPC_OUT_DATA connection redirected from https://www.contoso.com:443/rpc/rpcproxy.dll?localhost:3388 to https://www.contoso.com/RDWeb/rpc/rpcproxy.dll
[2014-Jul-18 11:23:54] RDP (0): Resolved 'www.contoso.com' to 'x.x.x.x' using NameResolveMethod_DNS(1)
[2014-Jul-18 11:23:54] RDP (0): Resolved 'www.contoso.com' to 'x.x.x.x' using NameResolveMethod_DNS(1)
[2014-Jul-18 11:23:54] RDP (0): Exception caught: Exception in file '../../librdp/private/httpendpoint.cpp' at line 217
User Message : The gateway failed to connect with the message: 404 Not Found
[2014-Jul-18 11:23:54] RDP (0): Exception caught: Exception in file '../../librdp/private/httpendpoint.cpp' at line 217
User Message : The gateway failed to connect with the message: 404 Not Found
[2014-Jul-18 11:23:54] RDP (0): Protocol state changed to: ProtocolDisconnecting(7)
[2014-Jul-18 11:23:54] RDP (0): Protocol state changed to: ProtocolDisconnected(8)
[2014-Jul-18 11:23:54] RDP (0): —— END ACTIVE CONNECTION ——

Which was equally useless.

Some digging around on google landed me on an MSDN blog post. Reading through the comments on page 2 (you know, sometimes there is useful stuff there!), someone mentioned the same error and that redirection in IIS was to blame. Sure enough, I had set up a redirect from the Default Web Site to point at the RDWeb location to make things easier for the users. Disabling the redirect fixed the problem. With some additional testing I found that re-enabling the redirect and selecting “Only redirect requests to content in this directory (not subdirectories)” allowed me to retain the redirect and still let the OSX RDP client work.

IIS Redirect Settings
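For reference, that checkbox can also be set from the command line with appcmd; a sketch, assuming the redirect lives on the Default Web Site and points at /RDWeb as mine did:

%windir%\system32\inetsrv\appcmd.exe set config "Default Web Site" /section:system.webServer/httpRedirect /enabled:true /destination:"/RDWeb" /childOnly:true

childOnly:true corresponds to “Only redirect requests to content in this directory (not subdirectories)”.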


Dell VRTX with Nvidia Quadro K2000 and RemoteFX

April 2nd, 2014

I recently had to set up a HyperV environment for VDI that had to support RemoteFX for a client, and I really struggled to get clarity on supported graphics cards. One of the first things I discovered was that the list of certified graphics cards for Windows Server 2012 R2 is very short, and the list of recommended RemoteFX cards is even shorter. AMD is listed, but they only have drivers up to Windows 2008 R2; AMD tech support confirmed this in February 2014: “The driver for Server 2012 R2 is not available at this time”. The Windows Server Catalog confirms this, yet you will find details around the internet showing Windows 2012 working with AMD cards, which adds to the confusion.

I didn’t have the luxury of loan cards or a proven working configuration from our local suppliers, so faced with a limited set of cards I fell back on the requirements for RemoteFX, which are well documented, and then worked through what the suppliers did have in stock:

  • A SLAT-capable processor
  • A DirectX 11-capable GPU with a WDDM 1.2 compatible driver.

With this in hand I managed to find Nvidia Quadro K2000 and K4000 cards that a local supplier had in stock. They were not on the Windows Server Catalog, but the driver download details on the Nvidia site listed them as supported for Windows Server 2012 R2, and both support DirectX 11.

These were to be installed into a Dell VRTX, which allows you to map PCIe slots through to individual blades. One of the problems with adding a video card to a server is the auxiliary power requirement; fortunately the K2000 doesn’t require additional power. I did subsequently discover that each of the full-height PCIe slots does have an auxiliary power connector, but it requires a Dell cable (Part No. CPL-X5DNV). Coincidentally, the factory-installed supported card from Dell is an AMD FirePro 7000. I am unsure if the power cable can be ordered separately.

For my initial testing I used Windows Server 2012 R2 and the 332.50 WHQL drivers, and thankfully the card was detected and supported for RemoteFX. I then tried the same thing on HyperV Server 2012 R2 and hit an interesting bug when installing the Nvidia drivers. Running setup, the installer starts, extracts the files and presents the EULA. When you accept the EULA it disappears into the background. Launching Task Manager, I saw the installer still running, so I ended the process, opened a command prompt, navigated to the extracted Nvidia driver folder and ran: “setup -s -k”. This does a silent install and reboots on completion. This did the trick and got the drivers installed correctly.
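Pieced together, the workaround looks roughly like this; the extraction path is the Nvidia installer’s default from memory and may differ on your system, so treat it as a placeholder:

cd C:\NVIDIA\DisplayDriver\332.50\Win8_WinVista_Win7_64\International
setup.exe -s -k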

Here is a screen shot of my working Nvidia Quadro K2000 card in Windows Server 2012 R2.

Windows 2012 R2 with K2000

A nice step-by-step guide for installing HyperV with RemoteFX can be found here.

Edit: After pushing this into production and ramping up users, I quickly discovered that RemoteFX has a huge appetite for VRAM. A card with 2GB of VRAM could only service about 9-10 users. Of course, once this problem reared its ugly head, a quick google revealed the VRAM requirements are based on screen resolution. So, caution to prospective RemoteFX users. Another consideration I subsequently realised is that in a fail-over state, the machines on the failed node that relied on RemoteFX would not be able to start if there was no spare VRAM headroom on the other nodes in the cluster.

Edit: Updated part no. for the Dell cable. Thanks Greg!