Towards Mitigating the Class-Imbalance Problem for Partial Label Learning
Jing Wang (Southeast University); Min-Ling Zhang (Southeast University)
Partial label (PL) learning aims to induce a multi-class classifier from training examples where each of them is associated with a set of candidate labels, among which only one is valid. It is well-known that the problem of class-imbalance stands as a major factor affecting the generalization performance of multi-class classifier, and this problem becomes more pronounced as the ground-truth label of each PL training example is not directly accessible to the learning approach. To mitigate the negative influence of class-imbalance to partial label learning, a novel class-imbalance aware approach named CIMAP is proposed by adapting over-sampling techniques for handling PL training examples. Firstly, for each PL training example, CIMAP disambiguates its candidate label set by estimating the confidence of each class label being ground-truth one via weighted k-nearest neighbor aggregation. After that, the original PL training set is replenished for model induction by over-sampling existing PL training examples via manipulation of the disambiguation results. Extensive experiments on artificial as well as real-world PL data sets show that CIMAP serves as an effective data-level approach to mitigate the class-imbalance problem for partial label learning.