impl/research
Survey of academic literature on class imbalance in deep learning, relevant to ROLL's thesis positioning.
Key Papers
| Paper | Venue / Year | Node |
|---|---|---|
| CLIMB (arXiv:2505.17451) | NeurIPS 2025 | — |
| impl/paper-beyond-rebalancing | 2024 | detailed node |
| Simplifying NN Training Under Class Imbalance (arXiv:2312.02517) | 2023 | — |
| Investigating Group DRO (arXiv:2303.02505) | 2023 | — |
| impl/paper-tabpfn | ICLR 2023 | detailed node |
| Survey on Imbalanced Learning (Springer 2024) | Springer AI Review | — |
| Rethinking Class Imbalance (arXiv:2305.03900) | 2023 | — |
Competing Strategies
Methods the literature benchmarks against (relevant as ROLL baselines):
- Resampling: SMOTE, ADASYN, CSMOUTE, BorderlineSMOTE, ROSE
- Cost-sensitive: class weighting, focal loss, asymmetric loss
- Ensemble: BalancedBagging, EasyEnsemble, RUSBoost, BalancedRandomForest
- Threshold moving: post-hoc tuning of the decision threshold on validation scores
- DL-specific: LDAM-DRW, M2m, MiSLAS, BBN (mostly image long-tail)
- Tabular DL baselines: XGBoost, LightGBM, CatBoost, MLP, ResNet, FT-Transformer, impl/paper-tabpfn
- CLIMB finding: ensembles dominate; naive rebalancing (SMOTE alone) often underperforms
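The cost-sensitive entries in the list above can be illustrated with a minimal sketch. The function names are hypothetical and not part of ROLL. The weighting formula is the common inverse-frequency heuristic (the same one scikit-learn uses for `class_weight="balanced"`), and the `gamma=2.0`, `alpha=0.25` defaults are the values usually quoted for focal loss; treat both as assumptions to tune rather than settings drawn from the surveyed papers.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def focal_loss(p_pos, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one example; p_pos is the predicted P(y = 1)."""
    p_t = p_pos if y == 1 else 1.0 - p_pos          # probability of the true class
    p_t = min(max(p_t, eps), 1.0)                   # clamp to avoid log(0)
    alpha_t = alpha if y == 1 else 1.0 - alpha      # per-class balancing factor
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy 9:1 problem: the minority class receives the larger weight.
weights = balanced_class_weights([0] * 90 + [1] * 10)
print(weights)  # minority weight 5.0, majority weight about 0.56

# A confidently correct positive contributes far less than a badly missed one.
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

Class weighting rescales every example of a class by the same factor, whereas focal loss down-weights individual easy examples; the two are often combined, which is what the `alpha` term does here.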
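Metrics used: AUC-ROC, G-Mean, F1, Precision/Recall. AUC-ROC and G-Mean are the most common choices for imbalanced evaluation. ROLL's TPR-at-FPR framing (true-positive rate at a fixed false-positive-rate budget) is non-standard but maps directly onto how an operating point is chosen in deployment; position this as an advantage.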
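To make the comparison between G-Mean and TPR-at-FPR concrete, here is a small pure-Python sketch. The helper names `tpr_at_fpr` and `g_mean` are hypothetical and the scores are toy data; this is not ROLL's evaluation code, only an illustration of the two definitions.

```python
import math


def tpr_at_fpr(scores, labels, max_fpr):
    """Highest TPR reachable by any threshold whose FPR stays <= max_fpr."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    best = 0.0
    # Sweep thresholds from strictest to loosest; predict positive if score >= t.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        if fp / n_neg > max_fpr:
            break  # FPR only grows as the threshold drops, so stop here
        best = max(best, tp / n_pos)
    return best


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
print(tpr_at_fpr(scores, labels, max_fpr=0.0))  # 2 of 3 positives, no false alarms
print(tpr_at_fpr(scores, labels, max_fpr=0.2))  # all 3 positives with 1 of 5 negatives flagged
print(g_mean(tp=8, fn=2, tn=90, fp=10))         # sqrt(0.8 * 0.9)
```

G-Mean summarizes a single confusion matrix, so it depends on whichever threshold produced that matrix. TPR-at-FPR fixes the false-alarm budget first and reports what recall is achievable under it, which is why the two can rank methods differently.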
Dataset Coverage vs Literature
Well Covered by ROLL
- All glass variants (glass0–6) — standard KEEL
- Yeast3, ecoli-0-1_vs_5, wisconsin, cleveland, pima, haberman, iris0, vowel0, vehicle2, page-blocks, new-thyroid1, led7digit
- Adult, Forest Cover, Bank Marketing (medium tabular)
- Credit Card Fraud (~285K, IR 577:1) — common in fraud literature
Gaps vs Literature (datasets used in the papers that ROLL lacks)
| Dataset | IR | Samples | Appears In |
|---|---|---|---|
| Abalone9-18 | ~130 | 731 | Beyond Rebalancing, CLIMB |
| Annthyroid | 7.2 | 6916 | Beyond Rebalancing, many UCI surveys |
| Satellite | 22 | 6435 | Beyond Rebalancing |
| Segment | 6 | 2310 | Beyond Rebalancing |
| Yeast4/5/6 | 8–33 | ~1484 | Beyond Rebalancing, CLIMB |
| Ecoli4 | 15.8 | 336 | Beyond Rebalancing |
| KC1/KC2/PC1/CM1 (software) | 5–13 | 415–1783 | Beyond Rebalancing |
| Pen-local/Pen-global | 9–671 | 7291 | Beyond Rebalancing |
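The IR column is the ratio of majority to minority examples, so together with the sample count it implies how many minority examples each dataset actually offers. The sketch below recovers those counts from the table's figures; the helper names are hypothetical, and rounding is an assumption because several IR values in the table are approximate.

```python
def imbalance_ratio(n_majority, n_minority):
    """IR = majority count / minority count."""
    if n_minority <= 0:
        raise ValueError("minority class must be non-empty")
    return n_majority / n_minority


def class_counts_from_ir(n_samples, ir):
    """Approximate (majority, minority) counts implied by a sample total and IR."""
    n_minority = round(n_samples / (ir + 1.0))  # n = minority * (IR + 1)
    return n_samples - n_minority, n_minority


# Figures taken from the table above.
print(class_counts_from_ir(336, 15.8))   # Ecoli4: (316, 20)
print(imbalance_ratio(316, 20))          # recovers 15.8
print(class_counts_from_ir(731, 130))    # Abalone9-18 as listed: (725, 6)
```

The minority count matters more than the IR itself when planning cross-validation: with the listed figures, Abalone9-18 would leave only about 6 minority examples, which is a cue to confirm the IR and sample count against the dataset repository before adding it.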
Non-Standard or Unusual in ROLL
- Higgs: ROLL samples 500K balanced (50/50) — not a standard imbalanced benchmark; physics ML context
- Home Credit: Kaggle competition dataset; rare in academic imbalance papers
- CIFAR-10 binary (class 1 vs rest, IR ~9): DL imbalance papers use a multi-class long-tail formulation instead, so results are not directly comparable to LDAM/MiSLAS tables
Recommendations for Baseline Strengthening
Priority additions (available in KEEL, low effort):
- Yeast4, Yeast5, Yeast6 — stress-test high IR range
- Annthyroid — one of the most cited UCI imbalanced datasets
- Abalone9-18 — extreme IR (130:1), covers the hard regime
- Ecoli4 — rounds out ecoli coverage at IR 15.8
Lower priority (useful if sweeping many baselines):
- Satellite, Segment, Pen-local — common in full KEEL sweeps
- KC1/PC1 — software metrics datasets; different domain from biology/finance
Paper Subnodes
- impl/paper-tabpfn — TabPFN: in-context learning for small tabular classification (ICLR 2023)
- impl/paper-beyond-rebalancing — benchmark of 12 classifiers under imbalance, no rebalancing (2024)