Thursday, August 26, 2010

Improve CRM Performance with Kerberos Auth Persistence for IIS

[UPDATE:  Thanks to Carsten Groth, who discovered the IIS 7.0 way to do things!  Also, my problem isn’t fixed.  So, this article is simply about performance improvements.]

I recently became engaged with Microsoft Support trying to fix a problem that occurred whenever the Application Pool for the CRM site recycled its worker process(es).  I have a split-server deployment of CRM, and the affected machine was the Platform server.  When it recycled its workers, many System Jobs and Workflows would become stuck in the “Waiting” state with an error message reporting an exception of either HTTP Status 400 (Bad Request) or HTTP Status 401 (Unauthorized).

This would occur for all jobs that ran in the 4 or 5 hours after the recycle, but the situation would always resolve itself eventually.  Every morning, I would have to resume a large number of jobs.  Also, I found that if I forced an application pool recycle, or restarted IIS, the problem would resurface immediately.

The technicians over at Microsoft reviewed my configurations and logs, and recommended applying KB 917557 to IIS on my web server.  The server runs Windows 2003 and IIS 6.0, and the logs showed that each request to the CRM website would first encounter an HTTP Status 401 response, which would force the client to submit authentication on a second request, and that one would result in an HTTP Status 200 (OK).  Thinking that this could be causing my problem, they asked me to enable Kerberos Auth Persistence as described in that KB article.

One thing to note about enabling Kerberos Auth Persistence is that it effectively cuts the number of requests being processed by IIS in half, because requests on an already authenticated connection no longer each incur the initial 401 round trip.  When I inquired about security considerations regarding this setting, I was told that there were none, as the feature is inherently bound to HTTP Keep-Alive sessions.  For IIS 7.0, as pointed out by Carsten Groth, see KB 954873.
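To check whether a server shows the 401-then-200 pattern, and whether the setting changes it, one option is to tally status codes in the IIS logs before and after. The following is only a rough Python sketch, not something from Microsoft Support: it assumes the logs are in W3C extended format with a `#Fields:` header that includes `sc-status`, and the function names and sample lines are made up for illustration.

```python
from collections import Counter


def count_statuses(lines):
    """Tally the sc-status column of W3C extended log lines.

    Assumes a '#Fields:' directive appears before the entries it describes.
    """
    status_index = None
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The directive lists column names in order, separated by spaces.
            fields = line[len("#Fields:"):].split()
            status_index = fields.index("sc-status") if "sc-status" in fields else None
            continue
        if line.startswith("#") or status_index is None:
            # Skip other directives, and entries we cannot interpret yet.
            continue
        parts = line.split()
        if len(parts) > status_index:
            counts[parts[status_index]] += 1
    return counts


def share_of_401(counts):
    """Fraction of logged requests that were answered with a 401."""
    total = sum(counts.values())
    return counts["401"] / total if total else 0.0


# Hypothetical sample data: every request is challenged once, then succeeds.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status",
    "2010-08-26 12:00:00 POST /CrmService.asmx 401",
    "2010-08-26 12:00:00 POST /CrmService.asmx 200",
    "2010-08-26 12:00:05 GET /default.aspx 401",
    "2010-08-26 12:00:05 GET /default.aspx 200",
]

counts = count_statuses(sample)
print(dict(counts), share_of_401(counts))
```

Since `count_statuses` only iterates over lines, an open log file can be passed in directly. A 401 share near one half would match the pattern described above, and it should drop noticeably once auth persistence is in effect.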


  1. This is an excellent tip, thank you so much! We have been struggling a lot with CRM performance over WAN connections with high latency. We had noticed the 401 issue in the IIS logs, and we've been manually disabling NTLM authentication for static content such as images and script files, as well as creating custom 401 error pages with reduced size. If only I had known that you can achieve even better results by simply adding EnableKerbAuthPersist, that would have saved a lot of time...

  2. Thanks for the article, David. 401s have been an issue at one of our client sites as well. Off to the lab for testing.

  3. The 401 problem has apparently resurfaced just today for no apparent reason. So, while the problem I'm having may not be solved yet, the article above should still positively influence IIS and UI performance.

  4. Tested it on IIS 7.0/w2k8 with authPersistNonNTLM and this is working too. Thanks for this article setting my focus to test it on an IIS 7.0 system

  5. Carsten, I read a little German (very, very little), but I found your article with instructions on IIS 7 very informative. I will update this post with the link to the KB you referenced in yours.

  6. Excellent thanks Dave.

    Just to return the favour, for your original issue, I'd use Debug Diag (link below) to attach to the app pools (w3wp) & try to determine what's causing them to shut down.


  7. I know the processes are getting recycled according to our pool recycling schedule. Odd thing is, before UR10, I had never encountered this problem. It's as if the Platform server isn't "initialized". That's the best way to explain it. I can reproduce the situation at will, by either restarting IIS or recycling the CRM application pool. However, to fix it, all I need to do is browse to the CRM site on that server... and voila, workflows and system jobs start processing again. Odd issue.

  8. Additional updates on my particular problem: once IIS resets, the "Network Service" account loses authorization to call the CrmService.asmx file. It's not until somebody else connects to this service, on the Platform Server, that "Network Service" again becomes authorized. Very strange.

