Thursday, December 27, 2007

Update on chkdsk wiping out permissions on folders

After spending considerable time on the phone with Microsoft and an Easy Assist session on the server in question, this is what I found out. There are two hotfixes available: one for W2k3 servers with SP1 and one for W2k3 servers with SP2. You must apply the appropriate hotfix and reboot the server BEFORE you run chkdsk on any volumes. There is no easy way to replace the permissions on the user folders once the damage is done. Option 1 is the GUI routine, where you right-click the user folder, re-add the users or global groups, and then choose to push the permissions to all files and folders. Option 2 is to use a script such as xcacls.vbs, together with icacls.exe and associated commands, to push the permissions out to each file and folder. If you only need to apply global changes to a group of folders, you can go to the top-level folder, make your changes there, and then turn on inheritance for all subfolders and files beneath it. That will not work when individual folders need their own security permissions, which is a much more granular job.
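For the scripted route, here is a rough sketch of what generating the per-folder commands could look like. It is not the script Microsoft gave me, just an illustration under some loud assumptions: each user folder is named after the user's account, the domain is a placeholder called EXAMPLE, the share root is E:\Users, and every user gets Modify rights on their own folder. The function name build_icacls_commands is made up for this example. It only prints the icacls command lines so you can review them before running anything.

```python
from pathlib import PureWindowsPath


def build_icacls_commands(share_root, usernames, domain="EXAMPLE", rights="M"):
    """Build one icacls command per user folder (dry run, nothing is executed).

    Assumptions (all hypothetical): folder name == account name,
    'EXAMPLE' stands in for the real domain, and 'M' (Modify) is the
    right each user should have on their own folder.
    """
    commands = []
    for name in usernames:
        folder = PureWindowsPath(share_root) / name
        # (OI)(CI) makes the entry inherit to files and subfolders.
        grant = f"{domain}\\{name}:(OI)(CI){rights}"
        # /T walks everything under the folder, /C keeps going on errors.
        commands.append(["icacls", str(folder), "/grant", grant, "/T", "/C"])
    return commands


if __name__ == "__main__":
    # Hypothetical share root and account names.
    for cmd in build_icacls_commands(r"E:\Users", ["jsmith", "mjones"]):
        print(" ".join(cmd))
```

Reviewing the printed list first is deliberate. A wrong grant pushed down a 1.5 TB tree is just a second disaster on top of the first one.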

I dodged a bullet with regard to replacing the security permissions on my file server today. Since the office is closed and no one is accessing the server or making changes to data, I was able to revert the E volume back to 12/26/07 at 5 am via the Volume Shadow Copy. It took some time, but once it was complete and I had rebooted the server, I spot-checked several folders and everything was back to where it was before I ran chkdsk and accidentally stripped out the permissions. If this had been a normal work day, with all the users changing their data or adding to their user directories, I would have been forced to go the xcacls.vbs route or the manual point-and-click GUI routine to replace the permissions. I cannot begin to tell you how much fun that is.

chkdsk disaster on W2k3 NTFS partition

In October of this year I ran chkdsk on a volume on one of my W2k3 servers. I ran this utility because we had evidence of file or folder corruption on that particular user share. I was rewarded for my actions by chkdsk removing ALL the security permissions on every single folder. Needless to say, I was a huge hit with the rest of the server support team, who wound up helping me replace all the security permissions on these folders so that folks could access their data. It took us four days to get this accomplished.

We did our research, and according to the Microsoft Knowledge Base articles this was something that happens to Windows 2000 servers. Nowhere did we see that Windows 2003 servers with SP1 were supposed to be susceptible to this type of problem. We did find an obscure article that hinted that SP2 would update an NTFS-related file and keep this from happening.

Fast forward to December, and this same server is now updated with SP2 and all of the Microsoft updates that are available for Windows 2003 Standard Edition. We are in the midst of the Christmas holidays, and everyone other than us unlucky contractors is out doing things other than work. I figured this was a wonderful opportunity to let chkdsk do its magic and fix whatever data corruption might be on the user share of this server. I ran chkdsk with the /f switch so that it dismounted the volume, and off to the races it went. The volume is close to 1.5 TB in size, so it took several hours. Phases 1 and 2 went along just fine, with no errors reported to the screen. Phase 3 checks the indexes, and that is when I got the dreaded “replacing invalid security id with default security id for file (numbers)”. Needless to say, I was not really excited about the prospect of having to manually replace the permissions on all these folders. I called my boss, told her the problem, and she authorized me to call Microsoft and get paid support.

I am hoping that they can identify the problem for us, so that I can document it and make sure it never happens again on our hundreds of other servers, and that they can provide me with a script from AD that will allow me to replace the permissions with the correct ones from the command line. I will let you know what I find out when the dust settles and things are fixed.
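If you capture the chkdsk output to a text file, a few lines of script will tell you how many files got hit by that Phase 3 message. This is only a sketch and rests on assumptions: that the output was saved somewhere (the file name chkdsk_output.txt is hypothetical), and that each hit appears in the log with the same wording I saw on screen, followed by a file number. The function name affected_file_numbers is made up for this example.

```python
import re

# Matches the Phase 3 message quoted above and captures the file number.
PATTERN = re.compile(
    r"replacing invalid security id with default security id for file (\d+)",
    re.IGNORECASE,
)


def affected_file_numbers(log_text):
    """Return the file numbers chkdsk reported as reset to the default security id."""
    return [int(match.group(1)) for match in PATTERN.finditer(log_text)]


if __name__ == "__main__":
    # Hypothetical file holding the saved chkdsk output.
    with open("chkdsk_output.txt", encoding="utf-8", errors="replace") as handle:
        numbers = affected_file_numbers(handle.read())
    print(f"{len(numbers)} files had their security id replaced")
```

The count at least tells you whether you are looking at a handful of folders or the whole share before you decide how to put the permissions back.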