roam/implresearch.org
2026-06-19 18:09:39 +03:00


impl/research

Survey of academic literature on class imbalance in deep learning, relevant to ROLL's thesis positioning.

Key Papers

| Paper                                                             | Venue              | Node          |
|-------------------------------------------------------------------+--------------------+---------------|
| CLIMB (arXiv:2505.17451)                                          | NeurIPS 2025       |               |
| impl/paper-beyond-rebalancing                                     | 2024               | detailed node |
| Simplifying NN Training Under Class Imbalance (arXiv:2312.02517)  | 2023               |               |
| Investigating Group DRO (arXiv:2303.02505)                        | 2023               |               |
| impl/paper-tabpfn                                                 | ICLR 2023          | detailed node |
| Survey on Imbalanced Learning (Springer 2024)                     | Springer AI Review |               |
| Rethinking Class Imbalance (arXiv:2305.03900)                     | 2023               |               |

Competing Strategies

Methods the literature benchmarks against (relevant as ROLL baselines):

  • Resampling: SMOTE, ADASYN, CSMOUTE, BorderlineSMOTE, ROSE
  • Cost-sensitive: class weighting, focal loss, asymmetric loss
  • Ensemble: BalancedBagging, EasyEnsemble, RUSBoost, BalancedRandomForest
  • Threshold moving: post-hoc adjustment of the decision threshold
  • DL-specific: LDAM-DRW, M2m, MiSLAS, BBN (mostly image long-tail)
  • Tabular DL baselines: XGBoost, LightGBM, CatBoost, MLP, ResNet, FT-Transformer, impl/paper-tabpfn
  • CLIMB finding: ensembles dominate; naive rebalancing (SMOTE alone) often underperforms
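
As a concrete reference for the cost-sensitive baselines listed above, here is a minimal sketch of inverse-frequency class weighting and the binary focal loss. The function names `balanced_class_weights` and `focal_loss` are illustrative, not taken from ROLL or from any of the surveyed papers. The weighting rule is the common `n_samples / (n_classes * count)` heuristic, and `gamma=2.0`, `alpha=0.25` are the usual focal-loss defaults, assumed here rather than tuned.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single sample.

    p is the predicted probability of the positive class, y is 0 or 1.
    The (1 - p_t) ** gamma factor down-weights easy, well-classified samples.
    """
    eps = 1e-12  # guards against log(0)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


# Hypothetical toy labels with a 9:1 imbalance.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)  # minority class gets the larger weight
easy = focal_loss(0.9, 1)   # confident, correct prediction
hard = focal_loss(0.1, 1)   # confident, wrong prediction
```

With `gamma=0` and `alpha=1` the focal loss reduces to plain cross-entropy on the positive class, which makes it easy to sanity-check against an existing loss implementation.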

Metrics used: AUC-ROC, G-Mean, F1, Precision/Recall. AUC-ROC and G-Mean are the most commonly reported metrics for imbalanced evaluation. ROLL's TPR-at-FPR framing is non-standard but more practically useful; position this as an advantage.
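
To make the metric comparison concrete, the sketch below computes G-Mean (the geometric mean of TPR and TNR) and a TPR-at-FPR value from raw scores. The TPR-at-FPR definition used here, the highest TPR reachable by any score threshold whose FPR stays within a budget, is an assumption; ROLL's exact definition may differ. The function names and the toy data are hypothetical.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of TPR and TNR; assumes both classes are present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))


def tpr_at_fpr(y_true, scores, max_fpr):
    """Highest TPR over all thresholds whose FPR does not exceed max_fpr."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = 0.0
    # Each distinct score is a candidate threshold (predict 1 if score >= thr).
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        if fp / n_neg <= max_fpr:
            best = max(best, tp / n_pos)
    return best


# Hypothetical toy data: 3 positives, 7 negatives, sorted by score.
y_true = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

y_pred = [1 if s >= 0.65 else 0 for s in scores]
gm = g_mean(y_true, y_pred)
strict = tpr_at_fpr(y_true, scores, max_fpr=0.0)  # no false positives allowed
loose = tpr_at_fpr(y_true, scores, max_fpr=0.2)   # at most 20% FPR
```

G-Mean summarizes one operating point, while TPR-at-FPR fixes the false-alarm budget first and reports what recall is achievable within it, which is closer to how a deployed detector is constrained.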

Dataset Coverage vs Literature

Well Covered by ROLL

  • All glass variants (glass06) — standard KEEL
  • Yeast3, ecoli-0-1_vs_5, wisconsin, cleveland, pima, haberman, iris0, vowel0, vehicle2, page-blocks, new-thyroid1, led7digit
  • Adult, Forest Cover, Bank Marketing (medium tabular)
  • Credit Card Fraud (~285K, IR 577:1) — common in fraud literature

Gaps vs Literature (datasets used in the papers that ROLL lacks)

| Dataset                    | IR   | Samples | Appears In                           |
|----------------------------+------+---------+--------------------------------------|
| Abalone9-18                | ~130 | 731     | Beyond Rebalancing, CLIMB            |
| Annthyroid                 | 7.2  | 6916    | Beyond Rebalancing, many UCI surveys |
| Satellite                  | 22   | 6435    | Beyond Rebalancing                   |
| Segment                    | 6    | 2310    | Beyond Rebalancing                   |
| Yeast4/5/6                 | 833  | ~1484   | Beyond Rebalancing, CLIMB            |
| Ecoli4                     | 15.8 | 336     | Beyond Rebalancing                   |
| KC1/KC2/PC1/CM1 (software) | 513  | 4151783 | Beyond Rebalancing                   |
| Pen-local/Pen-global       | 9671 | 7291    | Beyond Rebalancing                   |

Non-Standard or Unusual in ROLL

  • Higgs: ROLL samples 500K balanced (50/50) — not a standard imbalanced benchmark; physics ML context
  • Home Credit: Kaggle competition dataset; rare in academic imbalance papers
  • CIFAR-10 binary (class 1 vs rest, IR ~9): DL imbalance papers use long-tail formulation instead — results not directly comparable to LDAM/MiSLAS tables
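
The IR of about 9 for the CIFAR-10 binary task follows directly from the one-vs-rest construction: with ten equally sized classes, one class against the other nine gives a 9:1 ratio. A minimal sketch of that arithmetic, with the hypothetical helper names `one_vs_rest` and `imbalance_ratio`, and assuming the standard CIFAR-10 training split of 5,000 images per class:

```python
from collections import Counter


def one_vs_rest(labels, positive_class):
    """Binarize multiclass labels: 1 for the chosen class, 0 for all others."""
    return [1 if y == positive_class else 0 for y in labels]


def imbalance_ratio(labels):
    """IR = size of the largest class divided by size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Stand-in for CIFAR-10 training labels: 10 classes x 5,000 images each.
labels = [c for c in range(10) for _ in range(5000)]

binary = one_vs_rest(labels, positive_class=1)
ir = imbalance_ratio(binary)  # 45000 / 5000 = 9.0
```

The same `imbalance_ratio` helper can be applied to any of the tabular datasets above to check the IR values quoted in these notes against the actual label counts.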

Recommendations for Baseline Strengthening

Priority additions (available in KEEL, low effort):

  1. Yeast4, Yeast5, Yeast6 — stress-test high IR range
  2. Annthyroid — one of the most cited UCI imbalanced datasets
  3. Abalone9-18 — extreme IR (130:1), covers the hard regime
  4. Ecoli4 — rounds out ecoli coverage at IR 15.8

Lower priority (useful if sweeping many baselines):

  1. Satellite, Segment, Pen-local — common in full KEEL sweeps
  2. KC1/PC1 — software metrics datasets; different domain from biology/finance

Paper Subnodes