The Theoretical Study of Rare Event Weighted Logistic Regression for Classification of Imbalanced Data

Authors

  • Dian Eka Apriana Sulasih, Institut Teknologi Sepuluh Nopember
  • Santi Wulan Purnami, Institut Teknologi Sepuluh Nopember
  • Santi Puteri Rahayu, Institut Teknologi Sepuluh Nopember

Abstract

Imbalanced data is a common problem in classification. In two-class classification, imbalance occurs when one class has more samples than the other. In this situation, most classifiers are biased towards the majority class and neglect the minority class, which leads to inaccurate classification. A method suited to classifying imbalanced data is therefore required. Rare Event Weighted Logistic Regression (RE-WLR), developed by Maalouf and Siddiqi, is a classification method for large imbalanced data sets and rare events. This study reviews RE-WLR for the classification of imbalanced data and details the steps for obtaining the estimator, particularly for IRLS. RE-WLR combines rare event corrections for Logistic Regression (LR) with Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS). The rare event correction in LR is applied through Weighted Logistic Regression (WLR), and regularization is added to reduce over-fitting. The parameter vector β is estimated by maximum likelihood (ML); the WLR maximum likelihood estimates (MLE) are obtained with the IRLS method, which uses the Newton-Raphson algorithm. To solve large optimization problems, the truncated Newton method is applied.
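
The weighted, regularized IRLS step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a ridge penalty `lam` on all coefficients (including the intercept), a direct linear solve in place of the truncated Newton (conjugate gradient) solve, and class weights of the form tau / ybar for events and (1 - tau) / (1 - ybar) for non-events, where `tau` is an assumed population event rate and ybar is the sample event rate. It omits any further rare-event bias correction. The function names `fit_weighted_logistic_irls` and `rare_event_weights` are hypothetical.

```python
import numpy as np


def rare_event_weights(y, tau):
    """Class weights from an assumed population event rate tau (hypothetical helper)."""
    ybar = y.mean()  # event rate observed in the sample
    return np.where(y == 1, tau / ybar, (1.0 - tau) / (1.0 - ybar))


def fit_weighted_logistic_irls(X, y, sample_weight, lam=1.0, max_iter=50, tol=1e-8):
    """Ridge-penalized weighted logistic regression fitted by Newton-Raphson (IRLS)."""
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))  # fitted probabilities
        # Gradient of the penalized weighted log-likelihood.
        grad = X.T @ (sample_weight * (y - p)) - lam * beta
        # Negative Hessian: X^T D X + lam * I, with D = diag(w_i * p_i * (1 - p_i)).
        d_diag = sample_weight * p * (1.0 - p)
        neg_hess = X.T @ (X * d_diag[:, None]) + lam * np.eye(n_features)
        # Direct solve; a truncated Newton variant would solve this system
        # approximately with a few conjugate gradient iterations instead.
        step = np.linalg.solve(neg_hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

For example, with a design matrix `X` whose first column is all ones and a 0/1 response `y`, one could call `fit_weighted_logistic_irls(X, y, rare_event_weights(y, tau=0.05))`. The value 0.05 is purely illustrative.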

Published

2015-12-07