Empirical assessment of machine learning-based malware detectors for Android: Measuring the gap between in-the-lab and in-the-wild validation scenarios


Abstract

To address the issue of malware detection in large sets of applications, researchers have recently started to investigate the capabilities of machine-learning techniques for proposing effective approaches. So far, several promising results have been recorded in the literature, with many approaches assessed in what we call in-the-lab validation scenarios. This paper revisits the purpose of malware detection to discuss whether such in-the-lab validation scenarios provide reliable indications of the performance of malware detectors in real-world settings, aka in the wild. To this end, we have devised several machine learning classifiers that rely on a set of features built from applications' control-flow graphs (CFGs). We use a sizeable dataset of over 50,000 Android applications collected from sources where state-of-the-art approaches have selected their data. We show that, in the lab, our approach outperforms existing machine learning-based approaches. However, this high performance does not translate into high performance in the wild. The performance gap we observed (F-measures dropping from over 0.9 in the lab to below 0.1 in the wild) raises one important question: How do state-of-the-art approaches perform in the wild?

Bibliographic details

  • Source
    Empirical Software Engineering | 2016, Issue 1 | pp. 183-211 | 29 pages
  • Author affiliations

    Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 4 Rue Alphonse Weicker, L-2721 Luxembourg, Luxembourg;

    Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 4 Rue Alphonse Weicker, L-2721 Luxembourg, Luxembourg;

    Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 4 Rue Alphonse Weicker, L-2721 Luxembourg, Luxembourg;

    Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 4 Rue Alphonse Weicker, L-2721 Luxembourg, Luxembourg;

    Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 4 Rue Alphonse Weicker, L-2721 Luxembourg, Luxembourg;

    Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust, 4 Rue Alphonse Weicker, L-2721 Luxembourg, Luxembourg;

  • Original format: PDF
  • Language: English
  • Keywords

    Machine learning; Ten-Fold; Malware; Android;
