Link to GitHub repo

These are the columns of the dataset, each representing a different aspect of a piece of software. Together, these features determine whether a program is classified as legitimate or as malware.

blog-img

Correlation between the attributes

blog-img
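
A correlation matrix like the one pictured can be computed with pandas. The snippet below is only a minimal sketch: the column names (`feature_a`, `feature_b`, `feature_c`) and the synthetic values are placeholders, not the actual columns of the malware dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the real dataset's columns are not reproduced here.
rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=n)

df = pd.DataFrame({
    "feature_a": base,
    # feature_b is built from feature_a, so the two are strongly correlated.
    "feature_b": 2.0 * base + rng.normal(scale=0.1, size=n),
    # feature_c is independent noise.
    "feature_c": rng.normal(size=n),
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(2))
```

Highly correlated pairs in such a matrix are candidates for dropping one of the two features before training.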

  • Little data preparation was needed because the dataset was already well processed.
  • Performed hyperparameter tuning using GridSearchCV.
  • Among GaussianNB, AdaBoostClassifier, DecisionTreeClassifier, KNeighborsClassifier, SVM, RandomForest, and LogisticRegression, RandomForest performed best, with a score of 99.1%.
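
The tuning and comparison steps above could look roughly like the following. This is a hedged sketch, not the project's actual code: it uses synthetic data from `make_classification` in place of the malware dataset, only three of the listed classifiers, and small illustrative parameter grids that are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the real dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each candidate pairs an estimator with an (assumed) parameter grid.
candidates = {
    "RandomForest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [25, 50], "max_depth": [None, 5]},
    ),
    "DecisionTree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5, None]},
    ),
    "LogisticRegression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Cross-validated grid search; refits the best model on the training set.
    search = GridSearchCV(model, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_score": search.best_score_,
        "test_score": search.score(X_test, y_test),
    }

# Pick the model with the highest cross-validated accuracy.
best_name = max(results, key=lambda k: results[k]["cv_score"])
for name, info in results.items():
    print(name, info)
print("best:", best_name)
```

Selecting on the cross-validated score and reporting the held-out test score separately keeps the final number from being inflated by the tuning itself.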
