SharePoint 2010 Search: The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel.

Last night we stumbled intoo an issue at one of our clients.The clients supplier had recently performed a lot maintenance activities in a maintenancewindow on their SharePoint 2010 environment. One of them was replacing the existing SSL certificates with a new wildcard certificate. The SharePoint farm exists of two WFE’s, one APP and a SQL server. The regular three-tier-farm you see alot.

Usually this change activity should not be a problem however, the supplier was somewhat to enthasiastic with replacing the SSL certificates. They unconsciously replaced the self-signed SharePoint Services SSL certificate on the SharePoint Web Services site in IIS with the new wildcard SSL certificate. Without knowing this history we started troubleshooting SharePoint Search which was causing issues. When browsing to the Search Service Application Administration component it showed this error message on the Administration Page:

The underlying connection was closed: Could not establish trust relationship for the SSL/TLS secure channel.

Users were not getting back any search hits and when browsing to the Content Sources in Central Administration, SharePoint threw a Correlation ID.

After going through the logging and finding the error related to the correlation ID we noticed this error in the ULS logging:

An operation failed because the following certificate has validation errors:\n\nSubject Name: CN=*contoso.com, OU=IT, O=”Contoso”, L=Paris, S=Paris, C=FR\nIssuer Name: CN=Thawte SSL CA – G2, O=”Thawte, Inc.”, C=US\nThumbprint: AB12CD34XX123456ABCDEFG123XXX\n\nErrors:\n\n SSL policy errors have been encountered.  Error code ‘0x2’

This error pointed us to SharePoint Search trying to communicate over HTTPS to it’s back-end web-services with the use of the clients wildcard SSL certificate and failing. After troubleshooting further we noticed the wilcard certificate was indeed binded on the HTTPS listener on port 32844. Now you would think, that’s an easy fix to do. Just remove the certificate on the binding and there you go. Well, when configuring a SSL certificate on a binding in IIS you are required to select a SSL Certificate located in the servers Personal Certificate store. And that’s not the correct store for SharePoint web-services.

The rootcause:

SharePoint uses it’s own “SharePoint Services” self-signed certificate to securely communicate over HTTPS with it’s web-services and other farm memberservers. You can find this specific certificate looking in the Local Computer Certificate Store using the MMC snapin. I discovered a folder called SharePoint which had three certificates in it, all issued by the Sharepoint Root Authority:

  • SharePoint Security Token Service
  • SharePoint Security Token Service Encryption
  • SharePoint Services

The solution:  

Now to undo the faulty configuration and reconfigure the correct “SharePoint Services” SSL certificate on the SharePoint Web Services IIS site – which can’t be done with the IIS Manager Console –  you can either use good old netsh command-line or use PowerShell. Now the last method is the method I personally prefer so that’s what you’re getting.. =)

$iisBinding = “SharePoint Web Services”
$webservice = Get-WebBinding -name $iisBinding -protocol “https” -port 32844
if ($webservice)
{
$pfx = get-childitem cert:\\localmachine\sharepoint | where { $_.subject -contains “^CN=SharePoint Services” }
if (!$pfx)
{
throw “No certificate found with the name “SharePoint Services” in the SharePoint certificate store in MMC. The script has stopped”
}

[void]$webservice.AddSslCertificate($pfx.ThumbPrint, “SharePoint”)
}
else
{
throw “The certificate cannot be assigned to the designated IIS Binding. Check servers eventviewer for further information.”
}

if (!(Get-WebBinding -name $iisBinding -protocol “http” -port 32843))
{
New-WebBinding -name $iisBinding -ip “*” -port 32843 -protocol “http”
}

This script looks for an existing IIS site or binding with the port 32844 (which SharePoint uses OOTB). Once found it checks whether it can find an existing valid SSL certificated in the computer which contains the name “SharePoint Services”. If both are found it adds the correct certificate to the designated SharePoint Web Services binding. An action which can’t be done via the ISS Management Console.

This recovers the Search Service Application to it’s original healthy state. In our specific scenario we had to run a “Full Crawl” to on all the Content Sources since the index became inconsistent too. I always use this small PowerShell script to initiate the Full Crawls:

$searchapp = Get-SPEnterpriseSearchServiceApplication “Search Service Application” 
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $searchapp | where-object { $_.Type -contains “Sharepoint”} | foreach-object { $_.StartFullCrawl() }

If this still doesn’t fix your issue you’re being really unlucky. You will still be getting this error message prompted when initiating searchresults:

“Property doesn’t exist or is used in a manner inconsistent with schema settings” and not receiving any People results.

But don’t worry, there’s still one option to be checked. We found out that after this issue is dependant on the actual webparts used by search (including People Search). For this the solution in the end was to edit the People Core Result Web Part on the search results page and expand the “Display properties” and make sure “Use Location Visualization” is checked.

ULS log settings. What setting should I use?

I get this question surprisingly often: “what is the best setting for my SharePoint Farm diagnostic logging (ULS logs)?
Unfortunately; there is no best answer here because it really depends on how your farm is configured and, most importantly, how it is managed and runs.

So there isn’t an answer then?” Well.. there is, kinda, because there are guidelines! And knowing how to use these guidelines got me triggered to write up some advice.
The other day a customer called us and told us that their SharePoint environment did not work anymore. Their hosting company rebooted the server and it worked again, but could not explain the issue so they asked if we could help out. After logging in it didn’t take long to see what happened: the guidelines were not followed.

  • The ULS logs were on the system drive
  • The ULS logs were configured with Verbose logging

What we did here was move the ULS log file directory away from the system drive and configure the ULS logs to their default setting.

Basically this last one sentence should be your best practice in most SharePoint environments.
You should never have your ULS log files on your system drive. Yes, ULS is designed for knowing when a disk space issue is imminent and reduce logging when this happens, but issues can still occur in certain cases!
Furthermore never use Verbose when you are not actively troubleshooting one or more issues on a environment. If you do use it, remember to set the level back to what you need when you are done with troubleshooting.

So, back to the guidelines.
I’ve created an overview of when to use what setting. Use it the right way and you are well on your way to a properly managed SharePoint environment!

The default setting.
Use this in 90% of the SharePoint environments:

  • SharePoint 2007: STSADM -o SetLoggingLevel –Default
  • SharePoint 2010: (start the SharePoint Management Shell) Clear-SPLogLevel
  • SharePoint 2010: (start the SharePoint Management Shell) Clear-SPLogLevel

The-very-good-managed-SharePoint-environment setting (yeah, I just made that up).
Use this when you need almost no logging since you have very few to no issues most of the time:

  • SharePoint 2007: Central Admin, select “All” categories, and “Error”, “Error”.
  • SharePoint 2010: (SharePoint Management Shell) Set-SPLogLevel -TraceSeverity Unexpected -EventSeverity Error
  • SharePoint 2013: (SharePoint Management Shell) Set-SPLogLevel -TraceSeverity Unexpected -EventSeverity Error

The troubleshoot setting.
Use this when you need to troubleshoot an issue (remember to set the level back to what you need when you are done with troubleshooting)

  • SharePoint 2007: Open Central Admin > Operations > Diagnostic Logging. Then set ‘select a category’ to ‘All’ categories, set ‘Least critical event to report to the event log’ box value to ‘Warning’. Set ‘Least critical event to report to the trace log’ box value to ‘Verbose’.
  • SharePoint 2010: (SharePoint Management Shell) Set-SPLogLevel -TraceSeverity Verbose -EventSeverity Verbose
  • SharePoint 2013: (SharePoint Management Shell) Set-SPLogLevel -TraceSeverity Verbose -EventSeverity Verbose

There is more to this, of course, but if you obey and maintain these guidelines you can’t go wrong.

Remember that a good performing SharePoint environment starts with who manages it. So I’ll leave you with what I tell my customers who manage their SharePoint environment themselves:

don’t prepare your environment for SharePoint but make sure you are prepared for SharePoint.

Make sure you properly know how it works and users will love their environment!

 

GetSafeOrdinal FATAL ERROR Could not find column LastModifiedTime in PartitionProperties

Migrating from one SharePoint version to another can be harder then it seems at first glance. Thinking through your migration strategy and prerequisites is the first and foremost important step, but no migration is without issues, and no issue is without an error you nor Google has seen before.

In our case this issue happened when migrating from SharePoint 2010 to SharePoint 2013 and upgrading the User Profile Service Application. This caused the error:

GetSafeOrdinal: FATAL ERROR: Could not find column ‘LastModifiedTime’ in PartitionProperties. Please check if the Content and Federated Services farm are of compatible versions.

After rerouting our steps we found out that our customer had ordered and installed a SharePoint .iso in their native language and not in English, where the old SharePoint environment was in English but with a language pack installed in their native language. So where the environment seems Dutch on both versions when using Central Administration, basically these are actually two different languages. Migrating the UPS Sync database from one environment to another caused this error, but recreating the UPS Sync database (and keeping the migrated and upgraded UPS Social and UPS Profile databases) solved this error. All other databases could be migrated and upgraded but this particular one caused a little headache, but nothing is unsolvable – right? 🙂