You may or may not have heard about the massive mistake McAfee made last week.
On Wednesday 21st April they released a virus definition file (5958 – April 21st) that incorrectly identified svchost.exe as a threat and deleted it on systems running Windows XP SP3.
Svchost is used for launching services (full description here) and any individual instance can run a group of services. This means it’s a pretty critical process!
Unfortunately for us, a large chunk of our client base is running McAfee anti-virus software; the others run Trend Micro.
We knew something wasn’t quite right when we received several calls all around the same time with similar symptoms. However, while the symptoms were similar, they weren’t identical, so initially we didn’t quite know what was going on. Unfortunately, the one thing they did have in common was a loss of network connectivity, which meant we couldn’t fully diagnose the issue.
Later that day McAfee issued a notice, an updated definition file and details of how to fix the issue.
Basically we had to:
Boot into safe mode
Add an EXTRA.DAT to the C:\Program Files\Common Files\McAfee\Engine folder (or just run the 5959 Super DAT, which is quicker)
Recover a copy of svchost.exe from the service pack cache, C:\WINDOWS\ServicePackFiles\i386\, or if it is not present there, from C:\WINDOWS\system32\dllcache\
Restart the computer
McAfee released an automated tool for this the following day (it’s in this KB article).
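For anyone who would rather script the file-recovery step than copy by hand, here is a minimal sketch in Python. It is not McAfee's tool and not taken from their KB article. The function name `restore_from_cache` is made up for illustration, and the sketch assumes the restored file belongs in C:\WINDOWS\system32, which is where svchost.exe normally lives. It simply tries the two cache folders from the list above, in order, and copies the first copy it finds.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional

# Cache folders named in the steps above, in the order they should be tried.
CACHE_DIRS = [
    Path(r"C:\WINDOWS\ServicePackFiles\i386"),
    Path(r"C:\WINDOWS\system32\dllcache"),
]

# Assumption: the restored file belongs in system32.
TARGET_DIR = Path(r"C:\WINDOWS\system32")


def restore_from_cache(
    filename: str,
    cache_dirs: Iterable[Path],
    target_dir: Path,
) -> Optional[Path]:
    """Copy `filename` from the first cache folder that has it into `target_dir`.

    Returns the path of the restored file, or None if no cache folder has a copy.
    """
    for cache_dir in cache_dirs:
        candidate = Path(cache_dir) / filename
        if candidate.is_file():
            destination = Path(target_dir) / filename
            shutil.copy2(candidate, destination)  # copy2 also keeps timestamps
            return destination
    return None


# Example call (run from safe mode, with the fixed DAT already in place):
# restore_from_cache("svchost.exe", CACHE_DIRS, TARGET_DIR)
```

A `None` result means neither cache folder had a copy, in which case the file has to come from somewhere else, such as another XP SP3 machine.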
A simple enough fix, but as I said earlier, every PC we’d seen with this issue had no network connectivity.
This meant we potentially had to physically visit every single PC we look after.
I say potentially because this only impacts PCs running Windows XP SP3, and we do have some clients running Vista or Windows 7. But most of our clients still currently run Windows XP. Also, VirusScan 8.7 systems were harder hit. Some of the PCs were still running 8.5.
Still, for some people it would be every PC they own.
Now regardless of the size of your company ask yourself some questions.
How long would it take you to spend 5-10 minutes on every PC you look after?
Did you factor travel time into that?
Who do you make a priority when everyone is offline?
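To put rough numbers on those questions, here is a back-of-the-envelope sketch in Python. The site sizes and the 45 minutes of travel per site are invented purely for illustration; only the 5-10 minutes per PC comes from above.

```python
from typing import List


def total_visit_hours(
    pcs_per_site: List[int],
    minutes_per_pc: float,
    travel_minutes_per_site: float,
) -> float:
    """Total hours to fix every PC by hand, with one trip per site."""
    total_minutes = 0.0
    for pcs in pcs_per_site:
        total_minutes += pcs * minutes_per_pc + travel_minutes_per_site
    return total_minutes / 60


# Hypothetical: three sites with 10, 25 and 5 PCs, 10 minutes per PC,
# 45 minutes of travel per site.
hours = total_visit_hours([10, 25, 5], minutes_per_pc=10, travel_minutes_per_site=45)
print(f"{hours:.1f} hours")
```

Even with only 40 PCs across three sites, that comes to just under nine hours for one person, before anyone has decided who gets visited first.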
Fortunately we got a little lucky
We configure the McAfee products to fetch updates from the global McAfee update site every hour. Any servers on site will then check for and get updated every hour.
PCs check every 2-3 hours but we also put a random delay on this. The main reason is that on larger sites we don’t want lots of PCs all generating network traffic at the same time. By putting in the random offset, the checks are staggered through the day. This, combined with the fact that McAfee actually got the updated DAT out the same day, meant that lots of PCs never actually received the faulty update.
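The effect of that stagger can be illustrated with a small simulation. This is a simplified model, not how McAfee's scheduler actually works: it assumes each PC is at a random point in its check cycle when the faulty DAT appears, adds a random delay on top, and counts how many PCs check in before the fix replaces it. The function name, the number of PCs, the jitter and the length of the faulty window are all hypothetical.

```python
import random


def pcs_exposed(
    num_pcs: int,
    check_interval_hours: float,
    max_jitter_hours: float,
    bad_window_hours: float,
    seed: int = 0,
) -> int:
    """Count PCs whose next update check lands while the faulty DAT is live."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    exposed = 0
    for _ in range(num_pcs):
        # Time until this PC next checks: a random point in its cycle
        # plus the random delay added to spread the load.
        next_check = rng.uniform(0, check_interval_hours) + rng.uniform(0, max_jitter_hours)
        if next_check < bad_window_hours:
            exposed += 1
    return exposed


# Hypothetical figures: 1000 PCs checking every 3 hours with up to 1 hour
# of random delay, and a faulty DAT that is live for 1 hour.
print(pcs_exposed(1000, check_interval_hours=3, max_jitter_hours=1, bad_window_hours=1))
```

The shorter the faulty window is relative to the check interval plus the delay, the fewer PCs pick up the bad file, which is roughly the luck described above.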
That said, we still had a LOT of work to do.
We visited as many sites as we physically could over a two-day period, and at some other sites that had tech-savvy people on hand we managed to talk them through the fix on the phone.
I also had to cancel other appointments, which I hate doing, and some other promises I made were a little strained.
I’m sure we’ll still be dealing with issues at the start of next week.
Obviously for our contract customers this was all at our expense.
I can’t even begin to think what this will cost McAfee as customers start to move away at their next renewal period.
McAfee have an FAQ here as well as a couple of blog post apologies.
As you can imagine there has been a lot of commentary on this and other vendors are jumping in to take advantage.
http://www.pcmag.com/article2/0,2817,2363018,00.asp
http://blogs.zdnet.com/Bott/?p=2031
Especially since it turned out this was down to poor quality testing.
As the IT world always seems to throw up odd coincidences, on Friday I got an email inviting me to the McAfee stand at the InfoSec exhibition next week. I imagine that stand is either going to be very busy... or very empty.
This scenario is truly a management nightmare: an automated update that renders a PC unusable and that can only be repaired by hand. On top of this we’re going to have our own PR exercise to sort out.
All our end-users see is a broken PC. It’s our responsibility to keep them up and running, and while we still fixed the problem, they’ll still be asking US questions as to why it
For new installations we moved away from McAfee long ago (there are other McAfee posts on this blog)
Our existing customers have been using McAfee for a variety of reasons but when the renewals come up we’ll be making a concerted effort to get them away.