Tech Off Thread

Anti-Virus Solution Question

  • Shark_M

    Hi C9rs.

    My question is about antivirus solutions. Why does it take such a huge amount of time to fully scan a computer?

    Why does the antivirus scan the content of every file on my PC?

    We all know about MD5 hashes, and that each MD5 hash in all probability corresponds to a single unique file. So why can't an antivirus, when we install it, take the MD5 hash of every file on the PC and compare it against a list of known MD5 hashes for known viruses and trojans, instead of scanning each file's content?
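    The proposal can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `KNOWN_BAD_MD5` is a made-up blocklist built from a harmless stand-in byte string, and `md5_of_file` and `find_known_bad` are hypothetical helper names, not any antivirus product's API.

```python
import hashlib
import os

# Hypothetical blocklist: a real product would ship this from the vendor.
# The single entry is the MD5 of a harmless stand-in byte string.
FAKE_VIRUS_BODY = b"stand-in for a known virus body"
KNOWN_BAD_MD5 = {hashlib.md5(FAKE_VIRUS_BODY).hexdigest()}


def md5_of_file(path, block_size=64 * 1024):
    """Return the hex MD5 digest of a file, reading it block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_known_bad(root):
    """Walk a directory tree and return paths whose MD5 is on the blocklist."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if md5_of_file(path) in KNOWN_BAD_MD5:
                    hits.append(path)
            except OSError:
                continue  # unreadable file: skipped in this sketch
    return hits
```

    Note that `md5_of_file` still has to read every byte of every file, so the lookup itself is cheap but the disk reads are not.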

  • eckes

    Reading the files takes nearly the same time whether you compute an MD5 checksum or scan them, because the work is I/O bound. And a file that was marked clean using old virus patterns is not necessarily clean.

    But you are right, there are some smarter concepts (besides on-access scanners). For example, if you can trust your file server, you can switch off the clients' ability to modify the archive bit. That way you can use the bit as a trusted flag for whether scanning is required.

    Bernd
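
    A rough sketch of the archive-bit idea, assuming Windows file attributes: the file system sets the archive bit when a file is created or modified, and only the scanner clears it after a clean scan. `FILE_ATTRIBUTE_ARCHIVE` is the standard Windows value 0x20. `needs_scan` and `mark_scanned` are hypothetical helper names, and the sketch works on plain attribute integers so it does not touch real files.

```python
FILE_ATTRIBUTE_ARCHIVE = 0x20  # Windows attribute bit, set when a file is created or modified


def needs_scan(attributes):
    """A set archive bit means the file changed since the last clean scan."""
    return bool(attributes & FILE_ATTRIBUTE_ARCHIVE)


def mark_scanned(attributes):
    """Clear the archive bit after a clean scan, leaving other bits untouched."""
    return attributes & ~FILE_ATTRIBUTE_ARCHIVE
```

    On Windows, Python exposes these bits as `os.stat(path).st_file_attributes`. The scheme is only as trustworthy as the rule that clients cannot clear the bit themselves.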

  • Shark_M

    What about constant rehashing? Recalculate the MD5 hash of all files, store the hashes in something we can query, like a mini SQL Server, and check them against the antivirus list of MD5 hashes. If a hash matches, the file is a virus, and the virus engine would remove it after identifying it.

    The rehashing engine would work like the indexing service: it would watch for changed or altered files and recalculate their MD5 hashes.

    This way I don't have to spend a whole day scanning my computer for viruses. It expedites the search.
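
    A minimal sketch of such a rehashing index, using SQLite as a stand-in for the "mini SQL Server". The table layout and the helper names (`open_index`, `refresh`, `matches`) are assumptions made for illustration, and the sketch decides that a file changed by comparing its size and modification time, which malware could forge.

```python
import hashlib
import os
import sqlite3


def md5_of_file(path, block_size=64 * 1024):
    """Return the hex MD5 digest of a file, reading it block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def open_index(db_path=":memory:"):
    """Create (or open) the hash index; the table layout is an assumption."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, md5 TEXT)"
    )
    return db


def refresh(db, root):
    """Rehash only files whose size or mtime changed; return the rehashed paths."""
    rehashed = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            row = db.execute(
                "SELECT size, mtime_ns FROM files WHERE path = ?", (path,)
            ).fetchone()
            if row == (st.st_size, st.st_mtime_ns):
                continue  # unchanged since the last pass: keep the stored hash
            db.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (path, st.st_size, st.st_mtime_ns, md5_of_file(path)),
            )
            rehashed.append(path)
    db.commit()
    return rehashed


def matches(db, known_bad_md5):
    """Return indexed paths whose stored MD5 is in the given blocklist."""
    rows = db.execute("SELECT path, md5 FROM files").fetchall()
    return [path for path, md5 in rows if md5 in known_bad_md5]
```

    After the first full pass, `refresh` only reads files that look modified, and `matches` can be rerun against a new blocklist without touching the disk at all.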

  • Manip

    An MD5 hash from a file is formed something like this:

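    import hashlib

    def md5_of_file(filename, block_size=50):
        md5 = hashlib.md5()              # running MD5 state
        with open(filename, "rb") as f:
            while True:
                b = f.read(block_size)   # read the next block
                if not b:                # empty read: end of file
                    break
                md5.update(b)            # fold this block into the state
        return md5.hexdigest()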

    Small amounts of data are thrown through the MD5 algorithm directly and the hash is generated. For files, the data is fed in block by block, and each step combines the new block with the state left by the previous blocks. So conceptually the final hash comes from step(step(step(step(start, block1), block2), block3), block4), and it is identical to the hash you would get by pushing the whole file through in one go.

    Implementations vary, but generally a program will not load a whole 1 MB file into memory just to hash it, let alone a 500 MB movie (for example). So the work is broken down into lots of smaller jobs, each depending on the last for its new result.
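
    As a quick check of the block-by-block idea, here is a small sketch with Python's `hashlib`. The sample data and variable names are made up for illustration. It feeds the same bytes through MD5 once in a single call and once in 50-byte pieces, and compares the two digests.

```python
import hashlib

data = bytes(range(256)) * 1000  # 256,000 bytes of sample data

# One-shot: hash the whole buffer in a single call.
one_shot = hashlib.md5(data).hexdigest()

# Incremental: feed the same bytes in 50-byte pieces.
running = hashlib.md5()
for start in range(0, len(data), 50):
    running.update(data[start:start + 50])
incremental = running.hexdigest()

print(one_shot == incremental)  # prints True
```

    The block size only affects memory use and the number of reads, not the resulting hash.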

    Note: Worst explanation EVER.  

    My point is that you have to read the file top to bottom regardless of whether you use hashes or some type of signature identification algorithm. And a signature identification algorithm is far more versatile (e.g. heuristic scanning).

    Viruses are very well written these days, sometimes with professional teams of spammers behind them. If you stuck to MD5 signatures alone, the virus writers could simply change a resource within the binary and bypass the engine, and the Windows APIs would happily help the virus do so. If you MD5'ed everything except the resources, the virus writers could place their code there and call it from the normal binary (or have a non-virus executable unknowingly call executable code from a resource whose original content has been replaced).
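
    To illustrate the bypass, here is a toy sketch with made-up byte strings. `SIGNATURE` is a hypothetical byte pattern, not a real virus signature, and `hash_match` and `pattern_match` are illustrative names. Changing a single byte in the "resource" part defeats the whole-file MD5 lookup, while a search for the byte pattern still finds it.

```python
import hashlib

SIGNATURE = b"\xde\xad\xbe\xef-payload"  # hypothetical byte pattern, not a real signature

original = b"HEADER" + SIGNATURE + b"RESOURCE:icon-v1"
variant = b"HEADER" + SIGNATURE + b"RESOURCE:icon-v2"  # one byte differs

# Blocklist built from the original sample only.
blocklist = {hashlib.md5(original).hexdigest()}


def hash_match(data):
    """Whole-file MD5 lookup: any changed byte produces a different hash."""
    return hashlib.md5(data).hexdigest() in blocklist


def pattern_match(data):
    """Naive signature scan: look for the byte pattern anywhere in the data."""
    return SIGNATURE in data
```

    Here `hash_match(variant)` is false while `pattern_match(variant)` is still true. Real engines use far more elaborate matching than a substring search, but the asymmetry is the same.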



  • koorb

    All files have to be rescanned because a lot of viruses carry polymorphic stealth engines. They essentially change their structure to make themselves very difficult to detect. And a lot of viruses, like Excel document macro viruses, embed themselves inside files, so a file that was once clean could become a carrier.
