Yesterday I was supposed to be spending the day thinking about SharePoint at an event Combined Knowledge were running (Vijay posted the details here)
I did spend the day thinking about SharePoint, but for very different reasons
First a little bit of background
We run two SharePoint sites internally that pretty much run our business
There is the SBS “companyweb” SharePoint site (WSS v2) – we’ve been using this from the beginning and we have a ton of information in here. Contract details, company calendar, contacts, customer network information, etc (you get the picture)
Not that long ago I did the side-by-side install to get WSS v3 up and running
The idea long term is to move everything over to the WSS v3 site but we’re doing it a bit at a time with the main function of that site currently being our helpdesk system
So back to the story!
I’d blogged a couple of times about problems I was having with emails and workflows, so when I saw the details of an “infrastructure update” on the Microsoft download site I thought this might be the answer I’d been looking for
So I eagerly downloaded the update and this is where I made a fatal error
I’ll hold my hands up and say that recently I haven’t been treating our internal systems with the same attention we would give one of our clients’ systems. We keep drumming into our clients that their systems run their business and that they need to look after them properly, so I’m really disappointed in myself
So I installed the update and it failed
The “friendly error message” was MOST unhelpful.
“Configuration of SharePoint Products and Technologies failed”
I was then informed that nothing would be rolled back and that I should correct the problem and re-run the update
This is where I first failed. Instead of taking my time and trying to figure out what the problem was, I did a couple of searches, found solutions that seemed to fit some error messages I’d found in the logs, and tried those
It made it even worse. I couldn’t get to the WSS v3 site or the v2 site (I still don’t understand why that was the case)
So at this point you’d think: OK, go back to the backup you took before you started.
Second failure. I hadn’t taken one. I’d just jumped in at the deep end on this one. Very careless of me
However, the overnight backup had taken a full copy of the v2 site so it wasn’t too long before I was able to get that up and running
My main panic was over as so much data was in there. Since a lot of the WSS v3 stuff is still work in progress most of the data was available somewhere else. If the worst came to the worst I’d have to start over and build it from scratch
Then I realised my next failing.
I’d been getting some notifications from the backups recently telling me “backup completed with exceptions”. Basically it couldn’t back up some files, so it just skipped over them
I’d had a quick look and added it to my “to-do” list.
This is when I wished I’d treated the problem the same as I would on a client’s system and given it immediate attention. The files it skipped just happened to be the WSS v3 SQL database files…..argh!
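For what it’s worth, even a very simple check would have flagged this straight away. The following is only a minimal sketch in Python: the function name, the example paths, and the idea that you can get a plain list of skipped files out of the backup report are all assumptions for illustration, not something the SBS backup gives you out of the box. It just looks for SQL Server data and log files (.mdf and .ldf) among whatever was skipped.

```python
from fnmatch import fnmatch

# Hypothetical patterns for files we cannot afford to lose.
# .mdf/.ldf are SQL Server data and log files; adjust to suit.
CRITICAL_PATTERNS = ["*.mdf", "*.ldf"]


def critical_skips(skipped_files, patterns=CRITICAL_PATTERNS):
    """Return the skipped files whose names match a critical pattern."""
    return [
        path
        for path in skipped_files
        if any(fnmatch(path.lower(), pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    # Made-up list standing in for the "skipped files" section of a backup report.
    skipped = [r"C:\Temp\scratch.tmp", r"D:\Data\WSS_Content.mdf"]
    for path in critical_skips(skipped):
        print("URGENT: backup skipped", path)
```

The point is less the code than the habit: a “completed with exceptions” message should be sorted into “harmless” or “fix today” at once, rather than parked on a to-do list.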
This was when I got lucky. Before I’d started the update I had SharePoint Designer open, as I’d been working on some workflows, and even though I wasn’t expecting anything to go wrong I took a backup from there, just in case
The difference from the WSS v2 site, though, was that with v2 it didn’t matter that the site was down. The restore fixed that!
To restore my SharePoint Designer backup I needed a working SharePoint site!
Since I’d been so careless up till now I decided to get back to doing things right
I fired up a virtual machine and configured a SharePoint installation from scratch, then connected to it using SharePoint Designer and verified my backup would restore OK
Once I was happy with this it was just a matter of removing SharePoint and reloading it back onto the SBS where I was then able to create a blank site and restore my backup file
It may sound so simple but it took up the whole of my day and I did my final restore at 1am
My workflows are now broken and all the alerts have gone but it could have been a lot worse
So another lesson learnt. I’ve added our internal systems onto our help desk system so they will now be treated in the same way as any other system we look after. I won’t jump in head first “just because it’s our system”; I’ll treat it no differently from any other server we look after
The next question I asked myself is why did I get into this situation?
Impatience, I guess. Things have been very busy lately and there were a ton of other things I wanted to get on with instead of testing a patch in a controlled environment before putting it on our own server. My attitude to the running of our own network was very wrong here
As with any mistakes I make I’ve certainly learnt from this one
I was a bit dubious about posting this, but I’m treating it as my punishment (even though I feel like I’ve been punished twice, as I missed the SharePoint event as well! :-) )