roam/implpaper-beyond-rebalancing.org
2026-06-19 18:09:39 +03:00

* impl/paper-beyond-rebalancing
[[id:151d5686-6f40-4158-a59a-b0be94cdc969][impl/research]]
*Beyond Rebalancing: Benchmarking Binary Classifiers Under Class Imbalance Without Rebalancing Techniques*
2025, arXiv:2509.07605
** Problem
The imbalanced learning literature almost exclusively evaluates methods *with* rebalancing (SMOTE, oversampling, etc.). This paper asks: which classifiers are intrinsically robust to class imbalance, with no rebalancing at all?
** Setup
- 19 real-world UCI/Kaggle datasets (IR 0.0015–0.54) + 5 synthetic datasets
- Synthetic decision boundaries of increasing complexity: linear → moderate non-linear → non-linear+redundancy → Gaussian quantiles → XOR (hardest)
- Minority class progressively reduced to 100%, 50%, 25%, 10%, 5%, 1%, plus one-shot/few-shot (k=1,3,5)
- 12 classifiers; 2×5-fold stratified CV
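
A minimal sketch of the protocol above, in plain Python so it runs without dependencies. It rests on assumptions not stated in this note: that "2×5-fold" means 2 repeats of 5-fold stratified CV, that the minority percentages are fractions of the original minority rows, and that only the training part of each split is thinned. The function names (=stratified_folds=, =subsample_minority=, =imbalance_splits=) are hypothetical, not taken from the paper.

```python
import random


def stratified_folds(y, n_splits=5, seed=0):
    """Split row indices into n_splits folds with near-equal class counts."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    for label in sorted(set(y)):
        rows = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(rows)
        # Deal the shuffled rows of this class round-robin across the folds.
        for pos, i in enumerate(rows):
            folds[pos % n_splits].append(i)
    return folds


def subsample_minority(rows, y, keep, minority_label=1, seed=0):
    """Return `rows` with the minority class thinned out.

    keep as float in (0, 1]: fraction of minority rows to keep.
    keep as int >= 1: absolute number to keep (one-shot / few-shot).
    """
    rng = random.Random(seed)
    minority = [i for i in rows if y[i] == minority_label]
    majority = [i for i in rows if y[i] != minority_label]
    if isinstance(keep, float):
        n_keep = max(1, round(keep * len(minority)))
    else:
        n_keep = keep
    n_keep = min(n_keep, len(minority))
    return sorted(majority + rng.sample(minority, n_keep))


def imbalance_splits(y, keep, n_repeats=2, n_splits=5, seed=0):
    """Yield (train_rows, test_rows) pairs for repeated stratified CV.

    Assumption: only the training rows are thinned; test folds stay intact.
    """
    for rep in range(n_repeats):
        folds = stratified_folds(y, n_splits, seed + rep)
        for k, test_rows in enumerate(folds):
            train_rows = [i for j, f in enumerate(folds) if j != k for i in f]
            thinned = subsample_minority(train_rows, y, keep, seed=seed + rep)
            yield thinned, sorted(test_rows)
```

Calling =imbalance_splits(y, 0.05)= gives the 5% setting and =imbalance_splits(y, 3)= the k=3 few-shot setting; passing a float versus an int is what switches between the two. With real data, scikit-learn's =RepeatedStratifiedKFold= would likely replace the hand-rolled fold logic.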
** Classifiers Tested
- Traditional: Decision Tree, k-NN, SVM
- Ensemble: Random Forest, XGBoost, LightGBM, CatBoost, BaggingRF, RUSBoost
- Advanced: [[id:bf0fc08a-e806-48df-b188-7a2c4c41c693][impl/paper-tabpfn]]
- One-class: OCSVM, Isolation Forest, LOF
** Metrics
AUC-ROC, AUC-PR, F1, G-mean, Accuracy, Precision, Recall
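
For reference, a small sketch of why G-mean is reported alongside accuracy. It assumes the usual binary definition, G-mean = sqrt(TPR × TNR); the note does not record which exact variant the paper uses, and the helper names are hypothetical.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred, positive=1):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fp + tn + fn)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(tpr * tnr)


# A classifier that always predicts the majority class on a 1% minority set:
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000
print(accuracy(y_true, y_pred))  # 0.99
print(g_mean(y_true, y_pred))    # 0.0
```

Accuracy rewards the trivial majority predictor, while G-mean drops to zero as soon as one class is missed entirely.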
** Key Findings
1. *TabPFN wins overall* — best performer at all imbalance levels including extreme; only method that holds up one-shot/few-shot
2. *Ensembles second* — CatBoost, XGBoost, LightGBM degrade moderately; RF degrades faster
3. *Traditional classifiers collapse* — DT and k-NN fail sharply below 25% minority
4. *Decision boundary complexity is a major factor* — on linear data, most classifiers survive extreme imbalance; on XOR, nearly all collapse
5. Practical advice: use [[id:bf0fc08a-e806-48df-b188-7a2c4c41c693][impl/paper-tabpfn]] or CatBoost/SVM when rebalancing is not feasible
** Datasets Used (real-world)
Breast Cancer, Pen Local/Global, Letter, Annthyroid, Satellite, Glass, Segment, Pima, Yeast4/5/6, Abalone/Abalone9-18, Ecoli4, PC1/CM1/KC1/KC2 — all UCI
** Relevance to ROLL
- Directly in ROLL's territory: binary tabular classification under imbalance, AUC/G-mean metrics
- Strong candidate as a baseline paper to cite
- Does *not* use any custom loss or ROC-optimization — ROLL's TPR-at-FPR objective is orthogonal and potentially more practically useful
- Dataset list is a good target for ROLL coverage: Annthyroid, Abalone9-18, Satellite, Yeast4/5/6 are missing from ROLL (see [[id:151d5686-6f40-4158-a59a-b0be94cdc969][impl/research]] gap table)
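
To make the TPR-at-FPR comparison concrete, here is a sketch of the evaluation-side quantity only: the best TPR reachable by a score threshold while keeping FPR within a budget. This is not ROLL's training objective, which this note does not describe; =tpr_at_fpr= and its parameters are hypothetical names.

```python
def tpr_at_fpr(y_true, scores, max_fpr, positive=1):
    """Highest TPR over score thresholds whose FPR stays <= max_fpr."""
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Sweep thresholds from the highest score downwards.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Rows with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == positive:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # FPR only grows from here on
        best = max(best, tp / n_pos)
    return best
```

Unlike AUC-ROC, which averages over every operating point, this number describes a single FPR budget, which is why a model can rank well on AUC and still differ from another at the budget of interest.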