roam/implpaper-beyond-rebalancing.org at 16c7ccac8a8b6cafc974a1d6b04eeaea5c81641a

aner 16c7ccac8a backup: 2026-06-19 18:09

impl/paper-beyond-rebalancing

impl/research

Beyond Rebalancing: Benchmarking Binary Classifiers Under Class Imbalance Without Rebalancing Techniques (2025), arXiv:2509.07605

Problem

The imbalanced learning literature almost exclusively evaluates methods with rebalancing (SMOTE, oversampling, etc.). This paper asks: which classifiers are intrinsically robust to class imbalance, with no rebalancing at all?

Setup

19 real-world UCI/Kaggle datasets (imbalance ratio, IR, from 0.0015 to 0.54) + 5 synthetic datasets
Synthetic decision boundaries of increasing complexity: linear → moderate non-linear → non-linear+redundancy → Gaussian quantiles → XOR (hardest)
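Minority class progressively subsampled to 100% (unchanged), 50%, 25%, 10%, 5%, and 1% of its original size, plus one-shot/few-shot settings (k = 1, 3, 5 minority samples)
12 classifiers; 5-fold stratified cross-validation, repeated twice (2×5-fold)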
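
The minority-reduction protocol above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the helper names `subsample_minority` and `imbalance_ratio` are hypothetical, and IR is assumed to mean minority count divided by majority count (the reported upper bound of 0.54 would not be possible for a minority share of the total).

```python
import random
from collections import Counter


def subsample_minority(labels, fraction=None, k=None, minority=1, seed=0):
    """Return sorted indices keeping every majority sample and a minority subset.

    Pass `fraction` (e.g. 0.05) for the percentage settings, or `k`
    (e.g. 1, 3, 5) for the one-shot/few-shot settings. Exactly one is required.
    """
    if (fraction is None) == (k is None):
        raise ValueError("give exactly one of `fraction` or `k`")
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    if k is not None:
        n_keep = k
    else:
        # Keep at least one minority sample so the class never vanishes.
        n_keep = max(1, round(fraction * len(minority_idx)))
    n_keep = min(n_keep, len(minority_idx))
    kept = rng.sample(minority_idx, n_keep)
    return sorted(majority_idx + kept)


def imbalance_ratio(labels, minority=1):
    """IR as minority count / majority count (an assumed definition)."""
    n_min = Counter(labels)[minority]
    return n_min / (len(labels) - n_min)


# Toy label vector: 1000 majority (0) and 200 minority (1) samples.
labels = [0] * 1000 + [1] * 200

for frac in (1.0, 0.5, 0.25, 0.10, 0.05, 0.01):
    sub = [labels[i] for i in subsample_minority(labels, fraction=frac)]
    print(f"{frac:>5.0%} of minority kept -> IR = {imbalance_ratio(sub):.4f}")

for k in (1, 3, 5):
    sub = [labels[i] for i in subsample_minority(labels, k=k)]
    print(f"k = {k} minority samples -> IR = {imbalance_ratio(sub):.4f}")
```

Subsampling should happen on the training split only, after the stratified folds are drawn, so that test folds keep enough minority samples to score against. Whether the paper does exactly this is not stated in these notes.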

Classifiers Tested

Traditional: Decision Tree, k-NN, SVM
Ensemble: Random Forest, XGBoost, LightGBM, CatBoost, BaggingRF, RUSBoost
Advanced: TabPFN (impl/paper-tabpfn)
One-class: OCSVM, Isolation Forest, LOF

Metrics

AUC-ROC, AUC-PR, F1, G-mean, Accuracy, Precision, Recall
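
A small sketch of why the threshold-based metrics in this list disagree under imbalance. It uses the standard textbook definitions (G-mean as the geometric mean of recall and specificity); the helper names are hypothetical, and the ranking metrics AUC-ROC and AUC-PR are not covered here.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, tn, fn


def threshold_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1 and G-mean from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)

    def safe_div(a, b):
        # Treat undefined ratios (empty denominator) as 0.0.
        return a / b if b else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)        # true positive rate
    specificity = safe_div(tn, tn + fp)   # true negative rate
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "g_mean": math.sqrt(recall * specificity),
    }


# 990 majority (0) and 10 minority (1) samples; a classifier that always
# predicts the majority class looks excellent on accuracy alone.
y_true = [0] * 990 + [1] * 10
always_majority = [0] * 1000
metrics = threshold_metrics(y_true, always_majority)
print(metrics)
```

In this toy case accuracy is 0.99 while recall, F1, and G-mean are all 0, which is why accuracy on its own says little in a benchmark like this one.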

Key Findings

TabPFN wins overall: best performer at all imbalance levels, including the extreme ones; the only method that holds up in the one-shot/few-shot settings
Ensembles second — CatBoost, XGBoost, LightGBM degrade moderately; RF degrades faster
Traditional classifiers collapse — DT and k-NN fail sharply below 25% minority
Decision boundary complexity is a major factor — on linear data, most classifiers survive extreme imbalance; on XOR, nearly all collapse
Practical advice: use impl/paper-tabpfn or CatBoost/SVM when rebalancing is not feasible

Datasets Used (real-world)

Breast Cancer, Pen Local/Global, Letter, Annthyroid, Satellite, Glass, Segment, Pima, Yeast4/5/6, Abalone/Abalone9-18, Ecoli4, PC1/CM1/KC1/KC2 — all UCI

Relevance to ROLL

Directly in ROLL's territory: binary tabular classification under imbalance, AUC/G-mean metrics
Strong candidate as a baseline paper to cite
Does not use any custom loss or ROC-optimization — ROLL's TPR-at-FPR objective is orthogonal and potentially more practically useful
Dataset list is a good target for ROLL coverage: Annthyroid, Abalone9-18, Satellite, Yeast4/5/6 are missing from ROLL (see impl/research gap table)
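
For reference, a minimal sketch of TPR at a fixed FPR budget as an evaluation metric. This is only the generic metric computed from scores; it is not ROLL's training objective, and the function name `tpr_at_fpr` is hypothetical.

```python
def tpr_at_fpr(y_true, scores, max_fpr, positive=1):
    """Highest TPR reachable by any score threshold whose FPR <= max_fpr."""
    n_pos = sum(1 for y in y_true if y == positive)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Sweep the threshold from the highest score downwards.
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    best_tpr = 0.0
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Samples with identical scores cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == positive:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / n_neg > max_fpr:
            break  # FPR only grows from here, so stop sweeping
        best_tpr = tp / n_pos
        i = j
    return best_tpr


# Toy scores: 3 positives and 7 negatives, already sorted for readability.
y_true = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print(tpr_at_fpr(y_true, scores, max_fpr=0.0))  # no false positives allowed
print(tpr_at_fpr(y_true, scores, max_fpr=0.2))  # at most 20% false positives
```

Unlike AUC-ROC, which averages over all operating points, this number describes one operating point, so two classifiers with the same AUC can differ sharply on it.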
