* impl
ROLL (Ranking via Optimized Label Learning) is a PyTorch research project that implements custom loss functions for binary classification. The losses use kernel density estimation (KDE) to optimize the true positive rate (TPR) at a target false positive rate (FPR) threshold, and are aimed at imbalanced classification problems.
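
To make the idea of "TPR at a target FPR via KDE" concrete, here is a minimal pure-Python sketch. It is not the code in =src/roll.py=: the function names, the Gaussian kernel, the fixed bandwidth, and the bisection search are all assumptions chosen for illustration. The negative-class scores are smoothed with a KDE, the threshold whose smoothed FPR equals the target is located, and the smoothed TPR of the positive scores is evaluated at that threshold.

#+begin_src python
import math


def gaussian_cdf(z):
    """Standard normal CDF, i.e. the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_rate(scores, threshold, bandwidth):
    """KDE estimate of P(score > threshold) for one class of scores."""
    return sum(gaussian_cdf((s - threshold) / bandwidth) for s in scores) / len(scores)


def threshold_at_fpr(neg_scores, target_fpr, bandwidth, iters=60):
    """Bisection for the threshold whose smoothed FPR equals target_fpr."""
    lo = min(neg_scores) - 10 * bandwidth  # smoothed FPR is close to 1 here
    hi = max(neg_scores) + 10 * bandwidth  # smoothed FPR is close to 0 here
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if smoothed_rate(neg_scores, mid, bandwidth) > target_fpr:
            lo = mid  # too many false positives: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)


def smoothed_tpr_at_fpr(pos_scores, neg_scores, target_fpr, bandwidth=0.1):
    """Smoothed TPR at the threshold that yields the target smoothed FPR."""
    threshold = threshold_at_fpr(neg_scores, target_fpr, bandwidth)
    return smoothed_rate(pos_scores, threshold, bandwidth)


# Hypothetical classifier scores, well separated between the classes.
pos = [2.0, 2.5, 3.0]
neg = [-1.0, -0.5, 0.0, 0.5]
tpr = smoothed_tpr_at_fpr(pos, neg, target_fpr=0.1)
loss = 1.0 - tpr  # a quantity to minimize would be 1 - TPR
#+end_src

Because every step is a smooth function of the scores, the same quantity written with tensor operations can in principle be differentiated, which is presumably what makes a KDE-based formulation usable as a training loss.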
** Architecture
- =src/roll.py= — Loss function implementations (Normal/Beta/Kernelized ROLL)
- =src/experiment.py= — Training loop, evaluation infra, =run_configurations()= entry point
- =src/datasets.py= — 10+ dataset loaders (KEEL, UCI, Kaggle, synthetic)
- =src/networks.py= — =KeelNet= MLP architecture
- =src/summary.py= — Plotly HTML visualization (ROC, score distributions, ECDF)
- =src/utils.py= — Logging, output dir creation (=init_experiment=)
- =experiments/keel/=, =experiments/other/=, =experiments/large/= — experiment scripts
** Conventions
- All hyperparameters in Python dataclasses (=ExperimentConfiguration=), no CLI parsing
- Experiments follow =experiment-*.py= naming; all call =run_configurations()=
- KEEL experiments share =experiments/keel/_base.py= runner; individual files just call it
- All datasets expose: =__getitem__=, =__len__=, =.x=, =.y= attributes
- Episode-based eval: N independent train runs per config, results aggregated
- GPU enabled via =cudaSupport = true= in =flake.nix=; =get_device()= in =utils.py= auto-selects GPU/CPU
- Beta distribution variant (=roll_beta_loss_from_fpr=) is kept for thesis writing but is not actively developed or used
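
The conventions above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: =SketchConfiguration= is a hypothetical stand-in for =ExperimentConfiguration= (its fields are invented), =ToyDataset= is a made-up class that merely satisfies the dataset interface, and =run_episodes= is a hypothetical helper, not the project's =run_configurations()=.

#+begin_src python
import statistics
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchConfiguration:
    """Hypothetical stand-in for ExperimentConfiguration; fields are invented."""
    learning_rate: float = 1e-3
    epochs: int = 10
    episodes: int = 5
    target_fpr: float = 0.05


class ToyDataset:
    """Made-up dataset exposing __getitem__, __len__, .x and .y."""

    def __init__(self, x, y):
        if len(x) != len(y):
            raise ValueError("x and y must have the same length")
        self.x = x
        self.y = y

    def __len__(self):
        return len(self.x)

    def __getitem__(self, index):
        return self.x[index], self.y[index]


def run_episodes(config, dataset, train_fn):
    """Run config.episodes independent training runs and aggregate the metric.

    train_fn(config, dataset, seed) is supplied by the caller and must
    return a single float, for example TPR at the target FPR.
    """
    metrics = [train_fn(config, dataset, seed) for seed in range(config.episodes)]
    return {
        "runs": metrics,
        "mean": statistics.mean(metrics),
        "std": statistics.pstdev(metrics),
    }


# Variants are plain Python values: no CLI parsing involved.
base = SketchConfiguration(episodes=3)
longer = replace(base, epochs=20)

dataset = ToyDataset(x=[[0.0], [1.0], [2.0]], y=[0, 1, 1])

# Dummy training function that just echoes the seed, to show the data flow.
result = run_episodes(base, dataset, lambda config, data, seed: float(seed))
#+end_src

Keeping the configuration in a frozen dataclass means that an experiment script can build a list of variants with =dataclasses.replace= and hand them to a runner, which fits the "no CLI parsing" convention.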
** Gotchas
- Dataset paths injected as env vars by Nix shell hook (=$keel_wisconsin_dir=, etc.) — must use =nix develop=
- =_calc_moments()= in roll.py is unused and has a variable typo (=array= vs =arr=)
- CIFAR-10 binary: class 1 vs rest (not class 0)
- Multiprocessing uses =spawn= method via =torch.multiprocessing=
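
Three of these gotchas are easy to guard against in code. The helpers below are hypothetical (none of the function names come from the repository) and use only the standard library. =torch.multiprocessing= is a wrapper around the standard =multiprocessing= module, so the =spawn= context is shown with the stdlib API here.

#+begin_src python
import multiprocessing
import os
from pathlib import Path


def dataset_dir(env_var):
    """Resolve a dataset directory from an environment variable.

    Fails with a hint instead of a bare KeyError when the Nix shell hook
    has not run.
    """
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(
            f"${env_var} is not set; enter the environment with `nix develop` first"
        )
    return Path(value)


def binarize(labels, positive_class=1):
    """One-vs-rest labels: positive_class maps to 1, everything else to 0."""
    return [1 if label == positive_class else 0 for label in labels]


def spawn_context():
    """Multiprocessing context that starts workers with the spawn method."""
    return multiprocessing.get_context("spawn")


# CIFAR-10 style labels: class 1 is the positive class, not class 0.
binary_labels = binarize([0, 1, 2, 1, 9])
context = spawn_context()
#+end_src

With =spawn=, worker processes start from a fresh interpreter and re-import the main module, so experiment scripts generally need their entry point under an =if __name__ == "__main__":= guard.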
** Key Files
- [[file:src/roll.py][src/roll.py]] — Core loss: =KernelizedROLLoss= custom autograd Function with KDE backward pass
- [[file:src/experiment.py][src/experiment.py]] — =ExperimentConfiguration= dataclass, =Criteriorator= ABC, =run_configurations()=
- [[file:experiments/keel/_base.py][experiments/keel/_base.py]] — shared KEEL runner =run_keel_experiment()=
- [[file:flake.nix][flake.nix]] — Nix env with dataset downloads, hash-pinned, exports path env vars
- [[file:AGENTS.md][AGENTS.md]] — Project guidelines (naming conventions, env, dataset list)
** Subnodes
- [[id:001430d5-e1e7-4e72-baf6-17399bfd6447][impl/loss-functions]] — Loss variants, KDE internals, gradient computation
- [[id:a53cbe84-cd8d-45c2-a8cf-34ab520a3ea5][impl/experiments]] — Experiment structure, training flow, metrics, output layout
- [[id:b8a9886a-d349-43e5-a745-817a148c1fd8][impl/datasets]] — Dataset catalog, KEEL list, eval metrics