The trim package: Tree methods for classification of imbalanced data
In many real-world situations, class distributions are imbalanced. This occurs especially when studying groups of vulnerable people, for instance people experiencing long-term unemployment or a critical health problem (cancer, etc.). When we want to extract profiles of the people who fall into these critical situations, we often perform a classification task, but the imbalance degrades classification quality: even when the overall classification rate is good, most of the errors are made on the rare class. Yet our interest lies more in the rare class (vulnerable people) than in the majority class (people who are doing well). Over the past decade, several methods have been designed to reduce or overcome the class imbalance problem, especially for decision trees. This package aims to provide decision tree methods designed specifically for the classification of imbalanced categorical data.
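The gap between a good overall classification rate and poor performance on the rare class can be illustrated with a small arithmetic sketch. All counts below are made up for illustration and come from neither the package nor any real dataset; the code is plain Python and does not use the trim package's API.

```python
# Hypothetical counts, chosen only for illustration:
# 1000 people, of whom 50 belong to the rare (vulnerable) class.
n_majority = 950
n_rare = 50

# Assumed outcome of some classifier (made-up numbers).
majority_correct = 940  # majority-class cases predicted correctly
rare_correct = 10       # rare-class cases predicted correctly

total = n_majority + n_rare
errors_majority = n_majority - majority_correct
errors_rare = n_rare - rare_correct

# Overall classification rate: looks good.
accuracy = (majority_correct + rare_correct) / total

# Proportion of rare-class cases that were actually found: poor.
rare_recall = rare_correct / n_rare

# Proportion of all errors that fall on the rare class.
share_of_errors_on_rare = errors_rare / (errors_majority + errors_rare)

print(f"overall accuracy:              {accuracy:.2f}")
print(f"rare-class recall:             {rare_recall:.2f}")
print(f"share of errors on rare class: {share_of_errors_on_rare:.2f}")
```

With these assumed counts, the classifier is right 95% of the time overall, yet it finds only 20% of the rare-class cases, and 80% of its errors concern the rare class.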
Authors: Emmanuel Rousseaux and Gilbert Ritschard.
Availability: Beta version in development.
Development: The development version is hosted on the R-Forge platform.