Data encoding in the SharePoint Profile Service

I was goofing around with a PowerShell script that would compare all of the SharePoint Profile Properties for a single user profile with their mapped counterparts in AD. If you’ve used FIM and SharePoint 2010 long enough, you just get to the point where you don’t trust it…at all…ever.

Sounds pretty simple, right? Connect to the profile service, get the profile properties, fire up Get-ADUser and compare. Bang, done. Or maybe not…

I started noticing that for some fields, the compare was coming back $false when it should have been $true. I figured there were control characters in the fields, either in AD or in SharePoint, so I stripped out everything I could think of. No luck. I counted the number of characters in each property value, and they were the same. Hmmm…

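Then I noticed that all of the failures had characters like “&” in them. So what, I mean, & is & right? Wrong. At first it looks like an encoding mismatch, but it can’t be ASCII versus UTF-8, because “&” is the exact same byte in both. What appears to be happening is that when data is written to the SharePoint profile, the “&” gets swapped for a different character that only looks like it, most likely the fullwidth ampersand “＆” (U+FF06), while AD keeps the plain one (U+0026). Same length, same look, different code point, so the compare fails. Great, awesome, perfect. Thanks SharePoint builders…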
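If you want to see what is actually different between two values that look identical, dumping the code points is a quick way to do it. This is only a sketch: the two sample strings are made up, the fullwidth ampersand (U+FF06) in the SharePoint value is an assumption about what is stored, and Get-CodePoint is a hypothetical helper name, not a built-in cmdlet.

```powershell
# Hypothetical sample values. In a real script these would come from
# Get-ADUser and from the SharePoint user profile.
$adValue = 'Research & Development'
$spValue = "Research $([char]0xFF06) Development"   # assumed: fullwidth ampersand

# Hypothetical helper: one 'U+XXXX' string per UTF-16 code unit
function Get-CodePoint {
    param([string]$Text)
    foreach ($c in $Text.ToCharArray()) {
        'U+{0:X4}' -f [int]$c
    }
}

$adPoints = @(Get-CodePoint $adValue)
$spPoints = @(Get-CodePoint $spValue)

# Both strings have the same length, so walk them side by side
# and report every position where the code points differ.
$diff = for ($i = 0; $i -lt $adPoints.Count; $i++) {
    if ($adPoints[$i] -ne $spPoints[$i]) {
        [pscustomobject]@{
            Index      = $i
            AD         = $adPoints[$i]
            SharePoint = $spPoints[$i]
        }
    }
}

$diff
```

With these sample values the only row reported is the ampersand position, with U+0026 on the AD side and U+FF06 on the SharePoint side.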

Fear not, there’s actually an easy way to fix this. Before you compare the SharePoint profile property with what’s in the AD property, just run the ol’ .Normalize(“FormKD”) method on the SharePoint property. FormKD is the Unicode compatibility decomposition form, which folds look-alike variants of a character back to the plain one, so that should take care of it. For example, the “Title” field in SharePoint (assuming $spProfile is an object that contains the user profile from SharePoint):

($spProfile['Title'].Value).Normalize("FormKD")
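To put the normalize step into the compare itself, something like the following could work. It is a minimal sketch, not the script from this post: Test-ProfileValueMatch is a hypothetical function name, the sample strings are made up, and the fullwidth ampersand in the SharePoint value is an assumption. It normalizes both sides rather than just the SharePoint one, so that accented characters decompose the same way on each side.

```powershell
# Hypothetical helper: compare a SharePoint profile value with an AD value
# after running both through Unicode compatibility decomposition (FormKD).
function Test-ProfileValueMatch {
    param(
        [string]$SharePointValue,
        [string]$AdValue
    )

    # A [string] parameter turns $null into '', so Normalize() is safe to call
    $form   = [System.Text.NormalizationForm]::FormKD
    $spNorm = $SharePointValue.Normalize($form)
    $adNorm = $AdValue.Normalize($form)

    # Ordinal compares code points exactly (and is case-sensitive)
    [string]::Equals($spNorm, $adNorm, [System.StringComparison]::Ordinal)
}

# Made-up values; the SharePoint one carries a fullwidth ampersand (U+FF06)
$sp = "Research $([char]0xFF06) Development"
$ad = 'Research & Development'

$rawMatch        = [string]::Equals($sp, $ad, [System.StringComparison]::Ordinal)
$normalizedMatch = Test-ProfileValueMatch -SharePointValue $sp -AdValue $ad

"Raw match:        $rawMatch"
"Normalized match: $normalizedMatch"
```

If the compare should ignore case, swapping Ordinal for OrdinalIgnoreCase is the only change needed.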

And you’re done…