In:
International Journal of Distributed Sensor Networks, SAGE Publications, Vol. 13, No. 4 (2017-04), p. 155014771770311-
Abstract:
In previous work on Android malware detection, imbalanced datasets containing more benign samples (the majority class) than malicious samples (the minority class) have been widely adopted. These imbalanced datasets bias learning toward the majority class, so minority class examples are more likely to be misclassified. To solve this problem, we propose a new oversampling method called fuzzy–synthetic minority oversampling technique, which is based on fuzzy set theory and the synthetic minority oversampling technique. As the sample size of the majority class increases relative to that of the minority class, fuzzy–synthetic minority oversampling technique generates more synthetic examples for each minority class example in the fuzzy region, where minority examples have a low degree of membership in the minority class and are more likely to be misclassified. Using the new synthetic examples, the classifiers build larger decision regions that contain more minority examples, and they are no longer biased toward the majority class. Compared with the synthetic minority oversampling technique and Borderline–synthetic minority oversampling technique methods, fuzzy–synthetic minority oversampling technique achieves higher accuracy on both the minority class and the entire dataset.
Type of Medium:
Online Resource
ISSN:
1550-1477
DOI:
10.1177/1550147717703116
Language:
English
Publisher:
SAGE Publications
Publication Date:
2017
ZDB-ID:
2192922-1