Friday, August 23, 2013

SharePoint Configuration Cache

I recently had a big issue with the SharePoint Configuration Cache.

It started as a simple attempt to reset the cache. I ran the following PowerShell script (taken from http://woutersdemos.codeplex.com/releases/view/85474):

Add-PSSnapin Microsoft.SharePoint.PowerShell
$Servers = Get-SPServer | ? {$_.Role -ne "Invalid"} | Select -ExpandProperty Address
Write-Host "This script will reset the SharePoint config cache on all farm servers:"
$Servers | Foreach-Object { Write-Host $_ }
Write-Host "Press enter to start."
Read-Host
Invoke-Command -ComputerName $Servers -ScriptBlock {
    try {
        Write-Host "$env:COMPUTERNAME - Stopping timer service"
        Stop-Service SPTimerV4
        $ConfigDbId = [Guid](Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\Secure\ConfigDB' -Name Id).Id
        $CacheFolder = Join-Path -Path ([Environment]::GetFolderPath("CommonApplicationData")) -ChildPath "Microsoft\SharePoint\Config\$ConfigDbId"
        Write-Host "$env:COMPUTERNAME - Clearing cache folder $CacheFolder"
        Get-ChildItem "$CacheFolder\*" -Filter *.xml | Remove-Item
        Write-Host "$env:COMPUTERNAME - Resetting cache ini file"
        $CacheIni = Get-Item "$CacheFolder\Cache.ini"
        Set-Content -Path $CacheIni -Value "1"
    }
    finally {
        Write-Host "$env:COMPUTERNAME - Starting timer service"
        Start-Service SPTimerV4
    }
}

This script seemed to work, and I went about my way. However, that night, we noticed our intranet was running VERY slowly. Many pages were not being served at all.

I determined that there were no Timer Jobs running at all, even though the service was running. I thought it would go away, and I would address it in the morning.
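This symptom (service running, no jobs running) can be checked by comparing the Timer service status with the last run time of the timer jobs. The following is only a minimal sketch: Get-StaleJob is a hypothetical helper name, and the commented usage assumes the SharePoint snap-in is loaded and that the objects returned by Get-SPTimerJob expose a LastRunTime property.

```powershell
# Hypothetical helper: return the jobs whose LastRunTime is older than $Cutoff.
function Get-StaleJob {
    param(
        [object[]]$Jobs,
        [Parameter(Mandatory = $true)][datetime]$Cutoff
    )
    $Jobs | Where-Object { $_.LastRunTime -lt $Cutoff }
}

# Usage on a SharePoint server (assumes the snap-in is loaded):
#   Get-Service SPTimerV4 | Select-Object Name, Status
#   Get-StaleJob -Jobs (Get-SPTimerJob) -Cutoff (Get-Date).AddHours(-1) |
#       Sort-Object LastRunTime | Select-Object Name, LastRunTime
```

Jobs scheduled daily or weekly will naturally show up with a one-hour cutoff. The telling sign would be jobs scheduled every few minutes appearing in the list while the service reports Running.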

This morning, the problem was much more apparent. The help desk was getting calls about how slow the intranet was.

I found the following steps from http://blogs.msdn.com/b/jamesway/archive/2011/05/23/sharepoint-2010-clearing-the-configuration-cache.aspx:

  1. Stop the Timer service. To do this, follow these steps:
    1. Click Start, point to Administrative Tools, and then click Services.
    2. Right-click SharePoint 2010 Timer, and then click Stop.
    3. Close the Services console.
  2. On the computer that is running Microsoft SharePoint Server 2010 and on which the Central Administration site is hosted, click Start, click Run, type explorer, and then press ENTER.
  3. In Windows Explorer, locate and then double-click the following folder:
  4. %SystemDrive%\ProgramData\Microsoft\SharePoint\Config\GUID
  5. Notes
    1. The %SystemDrive% system variable specifies the letter of the drive on which Windows is installed. By default, Windows is installed on drive C.
    2. The GUID placeholder specifies the GUID folder. There may be more than one of these.
    3. The ProgramData folder may be hidden. To view the hidden folder, follow these steps:
      1. On the Tools menu, click Folder Options.
      2. Click the View tab.
      3. In the Advanced settings list, click Show hidden files and folders under Hidden files and folders, and then click OK.
      4. You can also simply type this directly in the path if you do not want to show hidden files and folders.
  6. Back up the Cache.ini file. (Make a copy of it. DO NOT DELETE THIS FILE, only the XML files in the next step.)
  7. Delete all the XML configuration files in the GUID folder (DO NOT DELETE THE FOLDER). Do this so that you can verify that the GUID folder's content is replaced by new XML configuration files when the cache is rebuilt.
    Note: When you empty the configuration cache in the GUID folder, make sure that you do NOT delete the GUID folder and the Cache.ini file that is located in the GUID folder.
  8. Double-click the Cache.ini file.
  9. On the Edit menu, click Select All.
  10. On the Edit menu, click Delete.
  11. Type 1, and then click Save on the File menu. (Basically, when you are done, the only text in the Cache.ini file should be the number 1.)
  12. On the File menu, click Exit.
  13. Start the Timer service. To do this, follow these steps:
    1. Click Start, point to Administrative Tools, and then click Services.
    2. Right-click SharePoint 2010 Timer, and then click Start.
    3. Close the Services console.
  14. Note: The file system cache is re-created after you perform this procedure. Make sure that you perform this procedure on all servers in the server farm.
  15. Make sure that the Cache.ini file in the GUID folder now contains its previous value. For example, make sure that the value of the Cache.ini file is not 1.
  16. Check in the GUID folder to make sure that the xml files are repopulating. This may take a bit of time.
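The last two checks (steps 15 and 16) can also be scripted. This is a minimal sketch under stated assumptions: Get-ConfigCacheStatus is a hypothetical helper name, and "rebuilt" is taken to mean exactly what the steps say, namely that Cache.ini no longer contains 1 and that XML files exist in the folder again.

```powershell
# Hypothetical helper: report the state of one configuration cache folder.
function Get-ConfigCacheStatus {
    param([Parameter(Mandatory = $true)][string]$CacheFolder)

    $iniPath = Join-Path -Path $CacheFolder -ChildPath 'Cache.ini'
    # First line of Cache.ini, trimmed; an empty file yields an empty string.
    $iniValue = "$(Get-Content -Path $iniPath | Select-Object -First 1)".Trim()
    $xmlCount = @(Get-ChildItem -Path $CacheFolder -Filter *.xml).Count

    New-Object PSObject -Property @{
        CacheIniValue = $iniValue
        XmlFileCount  = $xmlCount
        # Step 15: value is no longer 1. Step 16: XML files are repopulating.
        Rebuilt       = ($iniValue -ne '' -and $iniValue -ne '1' -and $xmlCount -gt 0)
    }
}

# Usage, building the folder path the same way the script earlier in this post does:
#   $id = [Guid](Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\Secure\ConfigDB' -Name Id).Id
#   $folder = Join-Path ([Environment]::GetFolderPath("CommonApplicationData")) "Microsoft\SharePoint\Config\$id"
#   Get-ConfigCacheStatus -CacheFolder $folder
```

Since the rebuild may take a bit of time, a Rebuilt value of False right after starting the Timer service is not conclusive. It is worth checking again after a few minutes.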

Step 15 did not hold for me: my Cache.ini file still said 1. So instead of relying on the PowerShell script, I did it manually. However, one file would not let me delete it. It kept saying "Access Denied." That shouldn't have been possible, because I was a local administrator. It turned out that the file was locked by another process (http://www.codeovereasy.com/2012/12/locked-file-in-sharepoint-configuration-cache/).
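A lock like this can be detected from PowerShell by trying to open the file with exclusive access. This is a minimal sketch: Test-FileLocked is a hypothetical helper name, and it only reports that some process holds the file open, not which one.

```powershell
# Hypothetical helper: $true if the file cannot be opened exclusively.
function Test-FileLocked {
    param([Parameter(Mandatory = $true)][string]$Path)

    # Resolve to a full path first; .NET does not follow the PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path -ErrorAction Stop).ProviderPath
    try {
        $stream = [System.IO.File]::Open(
            $fullPath,
            [System.IO.FileMode]::Open,
            [System.IO.FileAccess]::ReadWrite,
            [System.IO.FileShare]::None)
        $stream.Close()
        return $false
    }
    catch {
        # A sharing violation surfaces as an IOException; rethrow anything else
        # (for example a genuine permissions problem).
        if ($_.Exception.GetBaseException() -is [System.IO.IOException]) {
            return $true
        }
        throw
    }
}

# Usage: list the XML files in a cache folder that something still holds open.
#   Get-ChildItem -Path $CacheFolder -Filter *.xml |
#       Where-Object { Test-FileLocked -Path $_.FullName }
```

Run this only while the Timer service is stopped, since the check briefly takes an exclusive handle on each file it tests.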

I downloaded LockHunter and forcefully deleted the file.
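Remove-Item reports a file it cannot delete as a non-terminating error by default, so a script keeps going and still resets Cache.ini even when a file is left behind. The delete step can instead be written to collect the failures. This is a minimal sketch, not the script from the link above: Clear-ConfigCacheXml is a hypothetical helper name, and skipping the Cache.ini reset when a file survives is a design choice made for illustration.

```powershell
# Hypothetical helper: delete the XML cache files and return the paths that
# could not be removed. Cache.ini is reset to 1 only if every file was deleted.
# Intended to run while the Timer service is stopped.
function Clear-ConfigCacheXml {
    param([Parameter(Mandatory = $true)][string]$CacheFolder)

    $failed = @()
    foreach ($file in @(Get-ChildItem -Path $CacheFolder -Filter *.xml)) {
        try {
            # -ErrorAction Stop turns a failed delete into a catchable error.
            Remove-Item -LiteralPath $file.FullName -Force -ErrorAction Stop
        }
        catch {
            $failed += $file.FullName
        }
    }

    if ($failed.Count -eq 0) {
        Set-Content -Path (Join-Path -Path $CacheFolder -ChildPath 'Cache.ini') -Value '1'
    }
    $failed
}

# Usage:
#   $failed = @(Clear-ConfigCacheXml -CacheFolder $CacheFolder)
#   if ($failed.Count -gt 0) { Write-Warning "Could not delete: $($failed -join ', ')" }
```

With this shape, a locked file shows up as a warning at the time of the reset instead of as a slow intranet later.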

I restarted the Timer service. Now the Cache.ini file is updated, the XML cache files are being populated, timer jobs are running and scheduled, and the intranet is no longer unresponsive.