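A Soft-Computing-Based Method for Large-Scale Multi-Class Malware Family Classification
First published: 2018-09-19
Abstract: Many mature methods already exist in the field of malware classification, yet the most widely used remains the detection approach adopted by antivirus vendors, which relies on virus signatures and file hash values. That approach, however, can only sort software into benign or malicious. With the emergence of sophisticated packing, obfuscation, anti-sandbox, and anti-antivirus techniques across a vast number of malware families, new malware families have grown explosively over the past decade, and many new malware types have appeared. A simple binary split of samples can no longer provide effective defense against the various kinds of network attacks. This paper proposes a soft-computing-based method for large-scale multi-class malware classification. First, we use the open-source tools Peframe and VirusTotal to extract PE structural features from a large set of malware samples. Next, we perform feature selection and fusion to build a sample feature database, and feed the feature vectors into the computational model to train and construct the classification model. Finally, based on the Microsoft antivirus database's verdicts for the test samples, we evaluate and compare the soft computing methods used in this paper against traditional hard computing methods in terms of classification accuracy and generalization.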
Study of Soft Computing methods for large-scale multinomial malware families classification
Abstract: Many methods exist for identifying malware, but the most common remains the signature-based approach used by antivirus vendors, which relies on one-way cryptographic hashes to characterize each malware sample. In most cases such detection yields only a binary classification into malware and goodware. In modern information security, that separation is no longer sufficient: malware families use increasingly complex functionality, several thousand new families have been created during the last decade, and a number of new malware types have emerged. This paper proposes a large-scale multinomial malware classification method based on soft computing. First, we use the open-source tools Peframe and VirusTotal to extract PE structural features from a large set of malware samples, then perform feature selection and fusion to build a sample feature database. The feature vectors are fed into the model to train and construct the classification model. Finally, based on the Microsoft antivirus database's verdicts for the test samples, we compare the soft computing methods adopted in this paper with traditional hard computing methods in terms of classification accuracy and generalization.
Keywords: malware family classification; soft computing; PE structural features
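The pipeline outlined in the abstract (feature extraction, feature selection and fusion, model training, evaluation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for Peframe and VirusTotal output, fusion is assumed to be plain concatenation, selection uses a Fisher-style score, and a softmax regression (a single-layer neural network) stands in for the soft computing model, which the abstract does not specify. All function names are hypothetical.

```python
import math
import random

def fuse(static_feats, report_feats):
    """Fuse two feature groups by concatenation (an assumption, not the paper's method)."""
    return list(static_feats) + list(report_feats)

def fisher_scores(X, y):
    """Per-feature ratio of between-class to within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = sum(col) / len(col)
        between = within = 0.0
        for c in classes:
            vals = [row[j] for row, label in zip(X, y) if label == c]
            mean = sum(vals) / len(vals)
            between += len(vals) * (mean - overall) ** 2
            within += sum((v - mean) ** 2 for v in vals)
        scores.append(between / (within + 1e-12))
    return scores

def select_top_k(X, y, k):
    """Indices of the k highest-scoring features, in ascending index order."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def logits(W, b, row):
    return [sum(w * v for w, v in zip(W[c], row)) + b[c] for c in range(len(W))]

def train_softmax(X, y, n_classes, lr=0.5, epochs=200):
    """Full-batch gradient descent for multinomial logistic regression."""
    d = len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * d for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for row, label in zip(X, y):
            p = softmax(logits(W, b, row))
            for c in range(n_classes):
                err = p[c] - (1.0 if c == label else 0.0)
                gb[c] += err
                for j in range(d):
                    gW[c][j] += err * row[j]
        for c in range(n_classes):
            b[c] -= lr * gb[c] / len(X)
            for j in range(d):
                W[c][j] -= lr * gW[c][j] / len(X)
    return W, b

def predict(W, b, row):
    z = logits(W, b, row)
    return max(range(len(z)), key=lambda c: z[c])

def make_toy_samples(n_per_family=40, seed=0):
    """Synthetic feature database: three families, three informative features, one noise feature."""
    rng = random.Random(seed)
    X, y = [], []
    for family in range(3):
        for _ in range(n_per_family):
            informative = [rng.gauss(1.0 if j == family else 0.0, 0.05) for j in range(3)]
            static = informative[:2]                 # stand-in for static PE features
            report = [informative[2], rng.random()]  # stand-in for report features; last value is pure noise
            X.append(fuse(static, report))
            y.append(family)
    return X, y

X, y = make_toy_samples()
selected = select_top_k(X, y, k=3)
X_sel = [[row[j] for j in selected] for row in X]

# Hold out every fourth sample for testing; train on the rest.
pairs = list(zip(X_sel, y))
train = [p for i, p in enumerate(pairs) if i % 4 != 0]
test = [p for i, p in enumerate(pairs) if i % 4 == 0]

W, b = train_softmax([r for r, _ in train], [l for _, l in train], n_classes=3)
accuracy = sum(predict(W, b, r) == l for r, l in test) / len(test)
print("selected features:", selected, "held-out accuracy:", accuracy)
```

On real data the labels would come from a reference source such as the antivirus verdicts mentioned above, and the held-out accuracy would serve as a first, coarse proxy for the generalization comparison the abstract describes.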