Part 2. Performance and Functionality Evaluation of Antivirus

Antivirus needs and limitations

Antivirus is a computer program that finds and removes malicious software; it is often referred to as ‘virus scanning software.’ In Korea, the term ‘computer antivirus’ is familiar thanks to V3 from Ahn Chul-Soo Labs (now AhnLab). Antivirus identifies and detects viruses based on their characteristics or behavioral information (signatures). In the past, many specialized products were released for particular types of malware, such as viruses, spyware, and adware. However, as malware has become more intelligent and sophisticated in recent years, products that specialize in a single type of malicious act have lost their relevance, and antivirus products that integrate detection of many kinds of malware are now in the spotlight.

The importance of antivirus follows from the ever-increasing volume of malware. According to ‘AV-TEST,’ the number of malware outbreaks in 2022 increased by about 14% from the previous year to more than 100 million, and in 2023, based on the current trend, it is expected to grow at a similar or even higher rate. The threat of malware is growing every day.

Fig 1. Increase in malware (Source: ‘AV-TEST’)

Antivirus detection methods are classically divided into signature-based and behavioral analysis-based approaches. Heuristics and machine learning are applied on top of these traditional methods to improve detection accuracy and analysis effectiveness.
Particularly in recent years, as malware has become more sophisticated and advanced, there have been many efforts to improve detection accuracy by combining different analysis and detection methods.

  • Signature-based: Detects malware by comparing files against patterns of known attacks
  • Behavioral analysis-based: Detects malware by identifying attack patterns through system monitoring and checks on system changes
  • Heuristic-based: Detects malware by its similarity to profiling information obtained by analyzing the features of malware
  • Machine learning-based: Detects malware using a model trained on collected and analyzed feature information
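
To make the signature-based approach concrete, here is a minimal sketch in Python. It is not how any of the evaluated products work internally; the hash set, the byte pattern, and the function name `scan_bytes` are all assumptions made for illustration.

```python
import hashlib
from typing import Optional

# Hypothetical signature database. The values are illustrative only:
# a SHA-256 digest of a "known" sample and a raw byte pattern.
KNOWN_BAD_HASHES = {hashlib.sha256(b"fake-malware-sample").hexdigest()}
KNOWN_BAD_PATTERNS = [b"EVIL_PAYLOAD_MARKER"]


def scan_bytes(data: bytes) -> Optional[str]:
    """Return the name of the rule that matched, or None if nothing matched."""
    # Exact match: the whole file is a known sample
    if hashlib.sha256(data).hexdigest() in KNOWN_BAD_HASHES:
        return "hash-match"
    # Partial match: a known byte pattern appears somewhere in the file
    for pattern in KNOWN_BAD_PATTERNS:
        if pattern in data:
            return "pattern-match"
    return None
```

In this sketch, changing a single byte of a known sample defeats the hash rule, and the pattern rule only helps if the pattern survives the change. That is one reason signature matching is usually combined with the other methods in the list.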

Fig 2. Process of malware detection and analysis

However, even with all of these detection methods, antiviruses have limitations. Antivirus manufacturers typically update their engines as new malware emerges, with security experts performing manual analysis and creating signatures or detection models that can detect the malware. This dependence on manual analysis makes it difficult for human experts to keep up with the pace at which new malware emerges. Intelligent and sophisticated malware also uses evasion techniques, as well as obfuscation and encryption (anti-reversing) to disrupt reverse engineering, so response times grow increasingly slow. In addition, one of the most damaging limitations of antiviruses is the problem of false positives (misdiagnosis): the patterns (signatures) of some malware and those of legitimate programs are so similar that the two are very difficult to tell apart.

  • In 2008, Avast misdiagnosed NateOn and some other programs as malware
  • In 2011, a V3 Lite misdiagnosis deleted critical Windows files, leaving systems unbootable
  • In 2017, an Avast mass misidentification incident flagged a large number of legitimate files, including system files, as malware
  • In 2022, Sophos’ antivirus product conflicted with Windows 11, causing a blue screen of death
  • On August 30, 2022, Alyac misdiagnosed legitimate programs as ransomware

As these cases show, legitimate programs are often misdiagnosed as malware.
Therefore, our antivirus trial analysis research team created test scenarios based on these limitations to evaluate the functionality and performance of antivirus products.

Why do we evaluate antivirus features and performance?

Private companies in the U.S., Germany, and the U.K. operate antivirus certification systems. Because they are all foreign certification authorities, they do not consider the PC environment of domestic (Korean) users, and their detection evaluation of HWP malware, which is mainly distributed in Korea, and of various types of document malware is insufficient. To address these deficiencies, we conducted tests that included a large number of malware samples suited to the domestic user environment, installed applications frequently used by domestic users such as ‘KakaoTalk’ and ‘PotPlayer,’ and used malware actually collected through government agencies and our own crawling systems, such as Korean HWP malware, for the function and performance evaluation. We also introduced test methods not used by existing certification authorities, such as tests using a USB flash drive, large amounts of malware, and recently collected malware.

Fig 3. Different antivirus evaluation environments

The selection criteria for the antiviruses to be evaluated for functionality and performance are as follows.
(1) Antiviruses with high domestic and international market share
(2) Among the products meeting criterion (1), products with a unique antivirus engine
(3) Products that can be installed on Windows operating systems
(4) The free antiviruses ClamWin and Windows Defender, included for comparison diversity

Using these four criteria, we selected ten antiviruses, including V3, Alyac, Nod32, and McAfee. The malware used in the evaluation was randomly selected with consideration for the type, appearance, and characteristics of the malware. Its sources were a large amount of malware provided by VirusTotal for research purposes, malware from government agencies and our own crawling system, malware collected from malware DB sites such as MalShare and VirusShare, and fully analyzed malware provided by a domestic security company, Enki. Depending on the evaluation scenario, we composed the following six malware sample sets and used the corresponding set for each scenario.

(Sample 1) 100 malware samples collected by extension, such as exe, pdf, and hwp
(Sample 2) 20,330 malware samples, a large set selected by random sampling
(Sample 3) 151 malware samples collected in the last three months from government organizations and our own crawling system
(Sample 4) 6,363 malware samples from the security firm Enki and our own analysis
(Sample 5) 50 malware samples packed with a commercial packing tool
(Sample 6) 25 executable malware samples that perform malicious behavior
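
Random sampling of a set such as Sample 2 can be made reproducible by fixing the seed. The sketch below shows one way to do that in Python; the pool size, the sample IDs, the seed, and the function name `draw_sample` are assumptions for illustration and do not describe the tooling we actually used.

```python
import random
from typing import List, Sequence


def draw_sample(pool: Sequence[str], size: int, seed: int) -> List[str]:
    """Pick `size` distinct items from `pool`; the same seed gives the same pick."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    return rng.sample(pool, size)


# Hypothetical pool of collected sample IDs (the real pool size is not stated here)
pool = [f"sample_{i:06d}" for i in range(100_000)]
sample2 = draw_sample(pool, 20_330, seed=2023)
```

Recording the seed alongside the test logs would let the same sample set be rebuilt for a later evaluation.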

Fig 4. Top, Architecture of the test bed environment
Bottom, Developed scenario-specific test tool and system

To prevent secondary damage, we used two firewalls to completely separate the inside of the test bed from the outside, as shown at the top of Figure 4. Inside the firewalled network, we installed a server to store test logs, a controller, and 10 PCs, each with one antivirus installed. The test PCs used for functional and performance evaluation all had the same hardware and OS and were configured to resemble real user PCs as closely as possible by installing applications commonly used in Korea, such as ‘KakaoTalk’, ‘Alzip’, and office programs. In addition, to minimize external intervention in the test, we developed command execution software (Controller and Agent) that performs scenario-specific actions, as shown at the bottom of Figure 4. Using the Controller and Agent, we can test all 10 PCs simultaneously, including downloading, executing, and decompressing malware and rebooting the PCs. We can also save to the DB the test logs of the commands executed and files transferred on each PC.
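
The Controller and Agent arrangement can be sketched as follows. This is an in-process simulation under our own assumptions, not the actual tool: the class names, the `test_log` table, and the file name `samples.zip` are hypothetical, and the agents here only acknowledge a command instead of performing it.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Action names taken from the actions listed in the text
ALLOWED_ACTIONS = {"download", "execute", "decompress", "reboot"}


@dataclass(frozen=True)
class Command:
    action: str
    target: str  # e.g. a file name; empty for "reboot"


class Agent:
    """Stand-in for the agent running on one test PC (hypothetical interface)."""

    def __init__(self, pc_name: str):
        self.pc_name = pc_name

    def run(self, command: Command) -> tuple:
        if command.action not in ALLOWED_ACTIONS:
            return (self.pc_name, command.action, command.target, "rejected")
        # A real agent would perform the action here; this sketch only acknowledges it
        return (self.pc_name, command.action, command.target, "ok")


class Controller:
    def __init__(self, agents, db):
        self.agents = agents
        self.db = db
        db.execute(
            "CREATE TABLE IF NOT EXISTS test_log "
            "(pc TEXT, action TEXT, target TEXT, status TEXT)"
        )

    def broadcast(self, command: Command):
        # Run the same command on every PC at once
        with ThreadPoolExecutor(max_workers=len(self.agents)) as pool:
            results = list(pool.map(lambda agent: agent.run(command), self.agents))
        # Log from the calling thread, since a sqlite3 connection is bound to one thread
        self.db.executemany("INSERT INTO test_log VALUES (?, ?, ?, ?)", results)
        self.db.commit()
        return results


db = sqlite3.connect(":memory:")
controller = Controller([Agent(f"pc-{i:02d}") for i in range(1, 11)], db)
controller.broadcast(Command("decompress", "samples.zip"))
```

Keeping all logging in the Controller means every PC's result lands in one place in a fixed order, which makes runs across the 10 PCs easy to compare afterwards.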

Establishing evaluation criteria and key tests for assessing antivirus detection performance and functionality

A total of 58 evaluation criteria were established to test the functionality and performance of the antivirus, which were based on the ‘Software Technical Performance Evaluation Guidelines’ in Korea and evaluation items from foreign antivirus certification organizations. We conducted 21 performance tests based on seven scenarios and used analyzed malware provided by Enki, a domestic security company, for detection accuracy and verification.

Table 1. Antivirus performance and feature evaluation items

| Category | Items | Criteria |
| --- | --- | --- |
| Functionality | Accuracy | (1) Real-time detection accuracy for a total of 100 malware samples sent individually (one at a time)* (* ClamWin is excluded because it has no real-time detection feature) |
| | | (1)-1 Detection accuracy when decompressing 100 malware samples |
| | | (2) Detection accuracy when executing 25 malware samples |
| | | (3) Detection accuracy for a USB drive containing a large number (20,330) of malware samples |
| | | (4) Real-time detection accuracy when decompressing a large number (20,330) of malware samples |
| | | (5) Detection accuracy for the 151 latest malware samples (within three months) |
| | | (6) Detection accuracy for analyzed malware (6,363) |
| | | (7) Detection accuracy for 50 packed malware samples |
| | | (7)-1 Detection accuracy for 50 unpacked malware samples |
| | Time Efficiency | (8) Average real-time detection speed for a total of 100 malware samples sent individually (one at a time) |
| | | (9) Average detection speed for a USB drive containing a large number (20,330) of malware samples |
| | | (10) Average detection speed when decompressing a large number (20,330) of malware samples using Alyac |
| | | (11) Average detection speed for the 151 latest malware samples (within three months) |
| | | (12) Average detection speed for analyzed malware (6,363) |
| | | (13) Average detection speed for packed malware |
| Resources Efficiency | #1 CPU Usage | (14) Average CPU usage before evaluation 1 on a PC with the same conditions |
| | #1 Memory Usage | (15) Average memory usage before evaluation 1 on a PC with the same conditions |
| | #2 CPU Usage | (16) Average CPU usage before evaluation 3 on a PC with the same conditions |
| | #2 Memory Usage | (17) Average memory usage before evaluation 3 on a PC with the same conditions |
| | Real-time CPU Usage | (18) Real-time CPU usage when evaluating item (1) |
| | Real-time Memory Usage | (19) Real-time memory usage when evaluating item (1) |
| | On Deep Scan, CPU Usage | (20) CPU usage before evaluation 3 on a PC with the same conditions |
| | On Deep Scan, Memory Usage | (21) Memory usage before evaluation 3 on a PC with the same conditions |
| Reliability | Operational Reliability | (22) Did the antivirus cause any failures after installation? |
| Usability | Usability of User Learning | (23) Is it possible to switch to another language while using the product? |
| | | (24) How many languages does the antivirus offer? |
| | | (25) Does the antivirus provide help inside the software, other than by linking to a website? |
| | Interface Adjustability | (26) Is it possible to modify the menu and structure as the user wishes? |
| | Input Data Support | (27) How many ways are there to specify the scan target in a quick and deep scan? (e.g., email files, other folders, compressed files, etc.) |
| | Easily Track Progress | (28) Does the antivirus provide a UI/UX for understanding the current progress of the tasks being performed? |
| | Installation Environment Suitability | (29) Which operating system environments are supported (Windows/Linux/Unix/Mac/Android/iOS)? |
| | | (30) Does it encourage the installation of external programs? |
| | Usability of Uninstallation | (31) Does the product uninstall correctly? |
| | Generate reports | (32) Is it possible to generate reports on detection and quarantine results? |
| | | (33) In how many formats can reports be generated? |
| | User-Defined Detection/Inspection | (34) Is it possible to exclude certain conditions (folder, filename, extension, detection name, etc.) from detection and scanning? |
| | Free or not | (35) Is there a free version of the antivirus product? |
| Add-Ons | Real-time Detection | (36) Is it possible to specify a location (entire system, specific folder, etc.) for real-time detection? |
| | | (37) Does it have Antimalware Scan Interface (AMSI)* capabilities? (* Detects obfuscated scripts such as JavaScript, VBScript, PowerShell, etc.) |
| | | (38) Is it possible to set actions (quarantine and delete) for detected malware after real-time detection? |
| | Manual Scanning | (39) How many manual scanning methods does the antivirus offer? |
| | | (40) Is it possible to specify a location (file, drive, specific folder, etc.) to scan during a manual scan? |
| | | (41) Is it possible to scan for specific malicious behavior (Anti Rootkit, process, memory)? |
| | | (42) Does it have a scheduled inspection feature? |
| | Network Security | (43) Is it possible to set up a firewall within the antivirus? |
| | | (44) Is it possible to block harmful sites or manage custom sites? |
| | | (45) Is it possible to prevent or detect specific network-based intrusions (spoofing, remote, etc.)? |
| | | (46) Does it have a VPN or proxy? |
| | System Security | (47) Is it possible to look up a history of recently created files? |
| | | (48) Does it have ransomware-specific detection or blocking capabilities? |
| | | (49) Is it possible to access storage media (USB, external hard drive, CD/DVD, etc.)? |
| | | (50) Does it have a registry cleanup feature? |
| | Privacy Protection | (51) Does it have the ability to delete temporary files? |
| | | (52) Is it possible to delete the user's browsing history? |
| | | (53) Is it possible to delete traces of the user (recently opened files, list of running documents, etc.)? |
| | | (54) Is it possible to erase files so that they cannot be recovered (as with BCWipe, CCleaner, etc.)? |
| Support from the Vendor | Maintenance | (55) Are there regular product updates and feature additions? |
| | Set Update Frequency | (56) Is it possible to update the engine automatically, manually, or on demand? |
| | Troubleshooting and Support | (57) Does it have a Q&A or FAQ on its website or within the antivirus product? |
| | | (58) Does it offer a quick-contact service (such as a chatbot)? |

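
The accuracy and time-efficiency items in Table 1 come down to a ratio and an average over per-sample results. The sketch below shows one possible way to compute them in Python. The `Result` record and both function names are hypothetical, and measuring speed as mean seconds per detected sample is our assumption for the example, since the table does not fix a unit.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Result:
    sample_id: str
    detected: bool
    seconds_to_detect: Optional[float] = None  # None when the sample was not detected


def detection_accuracy(results: List[Result]) -> float:
    """Percentage of samples that were detected."""
    if not results:
        raise ValueError("no results to evaluate")
    return 100.0 * sum(r.detected for r in results) / len(results)


def mean_detection_time(results: List[Result]) -> Optional[float]:
    """Mean seconds to detection over detected samples; None if nothing was detected."""
    times = [
        r.seconds_to_detect
        for r in results
        if r.detected and r.seconds_to_detect is not None
    ]
    return mean(times) if times else None
```

For example, with four results of which three are detected after 1.0, 3.0, and 2.0 seconds, `detection_accuracy` gives 75.0 and `mean_detection_time` gives 2.0.
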
Antivirus performance and evaluation results

Below are the primary topics of the antivirus evaluation, with seven scenarios used in the quantitative evaluation.

Experiment 1: Real-time detection accuracy of malware determined to be malicious by VirusTotal (extension classification)
Experiment 2: Real-time detection by executing malware that performs malicious behavior
Experiment 3: Real-time detection of connecting a USB flash drive containing a large amount of malware to a PC
Experiment 4: Real-time detection accuracy of unzipping a compressed file containing a large amount of malware
Experiment 5: Real-time detection accuracy of the latest malware within three months
Experiment 6: Real-time detection accuracy for self-analyzed malware
Experiment 7: Real-time detection accuracy for packed malware utilizing a commercial packing tool

Experiment 1: Real-time detection accuracy of malware determined to be malicious by VirusTotal
This is a basic test that evaluates real-time malware detection performance by file extension. It uses a total of 100 malware samples grouped by type (exe: 17 / xlsx: 17 / pdf: 17 / pptx: 16 / hwp: 17 / docx: 16). The samples are sent to each PC through the Agent and Controller without being executed, and we check how well each antivirus detects the transmitted files. Among the file types, EXE and PDF files were detected most accurately.

Experiment 2: Real-time detection by executing malware that performs malicious behavior
This experiment tests real-time detection performance by directly executing malware that performs malicious behavior (downloading, creating, etc.), rather than only sending it. We placed 25 malware samples in the exception folder (a folder excluded from scanning), executed them one at a time, and checked the detection results.

Experiment 3: Real-time detection of connecting a USB flash drive containing a large amount of malware to a PC
In this test, a USB flash drive containing a large amount of malware (20,330 samples) is prepared in advance and physically connected to a PC to see how much of it is detected. A few antiviruses asked the user to click a button to start scanning.

Experiment 4: Real-time detection accuracy of unzipping a compressed file containing a large amount of malware
We used the same 20,330 malware samples as in [Experiment 3]. While [Experiment 3] measured detection accuracy using a USB flash drive, [Experiment 4] stored a compressed file containing the 20,330 samples in an exception folder in advance and decompressed it into a specific folder with the Controller. Because decompression takes time, we waited until decompression had finished and each antivirus had stopped detecting before checking the detection results. Notably, even though the malware samples were the same, there was a clear difference in accuracy between Experiments 3 and 4, presumably due to the difference between deep scanning (in Experiment 3, a deep scan is performed when the media is connected) and real-time scanning.
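
Because both experiments use the same sample set, the difference between them can be examined per sample with plain set operations. The sketch below is only an illustration: the function name `compare_runs` and the sample IDs are made up, and the toy inputs are not measured results.

```python
from typing import Dict, Iterable, Set


def compare_runs(
    all_samples: Iterable[str],
    detected_a: Iterable[str],
    detected_b: Iterable[str],
) -> Dict[str, Set[str]]:
    """Split a shared sample set by which of two runs detected each sample."""
    universe = set(all_samples)
    a = set(detected_a) & universe  # ignore IDs that are not part of the sample set
    b = set(detected_b) & universe
    return {
        "only_a": a - b,
        "only_b": b - a,
        "both": a & b,
        "neither": universe - a - b,
    }


# Made-up IDs for illustration only (not measured results)
samples = ["s1", "s2", "s3", "s4", "s5"]
usb_scan = ["s1", "s2", "s3"]       # e.g. detections when the USB drive is connected
unzip_realtime = ["s2", "s4"]       # e.g. detections while decompressing
diff = compare_runs(samples, usb_scan, unzip_realtime)
```

Looking at the `only_a` and `only_b` groups, rather than just the two accuracy figures, would show which files one scanning mode catches and the other misses.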

Experiment 5: Real-time detection accuracy of the latest malware within three months
This experiment is a detection test on 151 malware samples collected by government agencies and our own crawling system within the last three months. It checks the detection accuracy for unknown or recently emerged malware in Korea and abroad. The method is the same as in [Experiment 1]: the 151 samples are sent to the PCs using the Controller and Agent.

Experiment 6: Real-time detection accuracy for self-analyzed malware
We obtained 6,363 fully analyzed malware samples from a Korean security company (Enki) to test real-time detection accuracy. The samples used in this experiment are characterized by a high percentage of document-based malware (over 80%).

Experiment 7: Real-time detection accuracy for packed malware utilizing a commercial packing tool
This experiment checks the detection accuracy for malware packed with Themida, a commercial packing tool. Since Themida can only pack executable files, we packed 50 executable malware samples and used them as test samples. The test was conducted by sending the samples to the PCs using the Agent and Controller.

Conclusion

After conducting the tests and summarizing the results, our research team came to the following conclusion: although the goal of every antivirus is to detect malware well, there is no ‘best antivirus,’ because the features and performance of products vary depending on how they detect and what they prioritize (network, file type, event, behavior, medium, storage, etc.). Therefore, selecting and installing an antivirus that fits your PC and cyber environment, keeping its detection DB updated, and running virus scans regularly are the key steps in protecting your PC and valuable digital assets from cyber threats. As we have done so far, our team will continue to evaluate antivirus products annually and will publish detailed findings, so stay tuned.

강식 신

A member of the malware analysis team at the KAIST Cyber Security Research Center, working on malware analysis programs and research.

영락 유

A researcher on the malware analysis team at the KAIST Cyber Security Research Center, conducting research on blockchain and software testing.
