When Things That Are Supposed to Protect You Try to Kill You

This past week I was presented with a very unusual issue. A call came in about a production issue on a tier one application. Unfortunately, it was on a system running SQL Server 2000 SP4. With a rather small toolset available for support, I had to rely on Perfmon, sysprocesses, and the usual old-school tools.

What I found was that page life expectancy was higher than usual, CPU was lower than usual, memory was normal, and disk metrics were way lower than normal, but thread count was through the roof, with sessions waiting on PAGEIOLATCH.
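For anyone who wants to spot this pattern quickly, here is a minimal sketch in Python that tallies wait types from rows shaped like the output of `SELECT spid, lastwaittype, waittime FROM master..sysprocesses`. The function name `summarize_waits`, the sample rows, and the 50% threshold are all assumptions made for illustration; none of them come from the actual incident, and treating `waittime > 0` as "currently waiting" is a simplification.

```python
from collections import Counter


def summarize_waits(rows, prefix="PAGEIOLATCH", threshold=0.5):
    """Tally lastwaittype values and flag when one wait family dominates.

    rows: iterable of dicts with 'lastwaittype' and 'waittime' keys
          (hypothetical shape, mirroring sysprocesses columns).
    """
    # lastwaittype is a fixed-width column, so strip the trailing padding.
    # Simplification: only sessions with waittime > 0 count as waiting.
    waits = [
        r["lastwaittype"].strip().upper()
        for r in rows
        if r["waittime"] > 0
    ]
    counts = Counter(waits)
    matching = sum(n for wait, n in counts.items() if wait.startswith(prefix))
    share = matching / len(waits) if waits else 0.0
    return counts, share, share >= threshold


# Hypothetical sample rows, not data from the incident.
sample_rows = [
    {"spid": 51, "lastwaittype": "PAGEIOLATCH_SH   ", "waittime": 1200},
    {"spid": 52, "lastwaittype": "PAGEIOLATCH_EX   ", "waittime": 900},
    {"spid": 53, "lastwaittype": "PAGEIOLATCH_SH   ", "waittime": 1500},
    {"spid": 54, "lastwaittype": "CXPACKET         ", "waittime": 30},
    {"spid": 55, "lastwaittype": "NETWORKIO        ", "waittime": 0},
]

counts, share, suspicious = summarize_waits(sample_rows)
print(counts.most_common(3), round(share, 2), suspicious)
```

A high share of PAGEIOLATCH waits alongside low disk activity is the odd combination described above: sessions are waiting on I/O that the disks themselves do not appear to be busy serving.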

We had all tech areas on the call and brought in all our top DBAs. Earlier in my analysis I had come across a scan64.exe process that was running. Our virus protection is set up according to best practice for database servers: the “On Access Scan” excludes the normal files. I did not recognize scan64.exe and noticed it was only using 6% CPU, so I looked up what product it belonged to. I found that it was indeed a virus scan product, so I did what any other admin would do: I tried to kill it. I couldn’t kill the process, so I moved on since it wasn’t using much CPU.

In times past, when the on-access scan was killing a server, CPU would be off the chart. Since scan64.exe wasn’t using much CPU, I didn’t give it much focus. Another DBA brought it up, and once we got Microsoft on the phone and mentioned scan64.exe, they stated that it belonged to a filter driver that is part of the virus protection product. After much more research we found that it was scanning at the application layer, against ‘sqlservr.exe’. All operations were being run through that driver, effectively applying a denial-of-service (DoS) attack to the server.

After shutting down that driver and rebooting the server, everything went back to normal.

Lesson learned for me: anything out of the norm should be a suspect!

References for virus protection best practices – http://timradney.com/virusscanbestpractices

Filter drivers on SQL Servers – http://timradney.com/filterdrivers

3 Comments

  • Hi Tim

    I’ve seen that before. Not good. You might want to double-check the rest of the exceptions there to ensure that they’re not also scanning the SQL data files… I’ve seen that before too…

  • I also recommend excluding .ckp files (been burned there by AV with regards to snapshots), FILESTREAM directories, and the log directory.

    Given that I have seen the filter drivers nuke the network stack and cause unexpected drops (and every major AV provider now has one embedded), I’m also of the opinion to remove AV from SQL Server, to disable Internet access from said SQL Server, and to use another server to scan appropriate files and directories on a regular basis.
