Logo by Zhuoning Yuan

LibAUC: A Machine Learning Library for AUC Optimization


Website | Updates | Installation | Tutorial | Research | Github

LibAUC aims to provide efficient solutions for optimizing AUC scores (AUROC and AUPRC).

Why LibAUC?

Deep AUC Maximization (DAM) is a paradigm for learning a deep neural network by maximizing the AUC score of the model on a dataset. Many real-world datasets are imbalanced, and for them AUC is a better metric than accuracy for evaluating and comparing models. Directly maximizing AUC can yield larger improvements than minimizing a standard classification loss, because it explicitly aims to rank the prediction score of every positive example higher than that of every negative example. Our library can be used in many applications, such as medical image classification and drug discovery.
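This ranking view of AUC can be made concrete in a few lines of plain Python (an illustrative sketch, not part of the LibAUC API): the empirical AUC of a scoring model is the fraction of (positive, negative) pairs it orders correctly, counting ties as half.

```python
import itertools

def pairwise_auc(scores, labels):
    """Empirical AUC: fraction of (positive, negative) pairs ranked
    correctly, with tied scores counted as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    total = len(pos) * len(neg)
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p, n in itertools.product(pos, neg))
    return correct / total

# A perfect ranker scores every positive above every negative:
pairwise_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])  # -> 1.0
```

DAM replaces this non-differentiable pairwise count with a differentiable surrogate loss so the network can be trained by gradient methods.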

Key Features

  • Easy Installation - Integrate AUROC/AUPRC training code into your existing pipeline in just a few steps
  • Large-scale Learning - Handle large-scale optimization and make training smoother
  • Distributed Training - Extend to distributed settings to accelerate training and enhance data privacy
  • ML Benchmarks - Provide an easy-to-use input pipeline and benchmarks on various datasets

Installation

$ pip install libauc

Usage

Official Tutorials:

  • Creating Imbalanced Benchmark Datasets [Notebook][Script]
  • Optimizing AUROC loss with ResNet20 on Imbalanced CIFAR10 [Notebook][Script]
  • Optimizing AUPRC loss with ResNet18 on Imbalanced CIFAR10 [Notebook][Script]
  • Training with Pytorch Learning Rate Scheduling [Notebook][Script]
  • Optimizing AUROC loss with DenseNet121 on CheXpert [Notebook][Script]
  • Optimizing AUROC loss with DenseNet121 on CIFAR100 in Federated Setting (CODASCA) [Preliminary Release]
  • Optimizing Multi-Label AUROC loss with DenseNet121 on CheXpert [Notebook][Script]

Quickstart for Beginners:

Optimizing AUROC (Area Under the Receiver Operating Characteristic)

>>> # import library
>>> import torch
>>> from libauc.losses import AUCMLoss
>>> from libauc.optimizers import PESG
...
>>> # define loss and optimizer
>>> Loss = AUCMLoss(imratio=[YOUR NUMBER])
>>> optimizer = PESG()
...
>>> # training
>>> model.train()
>>> for data, targets in trainloader:
...     data, targets = data.cuda(), targets.cuda()
...     logits = model(data)
...     preds = torch.sigmoid(logits)
...     loss = Loss(preds, targets)
...     optimizer.zero_grad()
...     loss.backward()
...     optimizer.step()
...
>>> # start a new stage: decay the learning rate and update the regularizer
>>> optimizer.update_regularizer()

Optimizing AUPRC (Area Under the Precision-Recall Curve)

>>> # import library
>>> import torch
>>> from libauc.losses import APLoss_SH
>>> from libauc.optimizers import SOAP_SGD, SOAP_ADAM
...
>>> # define loss and optimizer
>>> Loss = APLoss_SH()
>>> optimizer = SOAP_ADAM()
...
>>> # training (the dataloader also yields each sample's index)
>>> model.train()
>>> for index, data, targets in trainloader:
...     data, targets = data.cuda(), targets.cuda()
...     logits = model(data)
...     preds = torch.sigmoid(logits)
...     loss = Loss(preds, targets, index)
...     optimizer.zero_grad()
...     loss.backward()
...     optimizer.step()
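Note that the AUPRC loop unpacks index, data, targets: AP-style losses such as APLoss_SH maintain per-sample running statistics, so the loader must also yield each sample's index. A minimal, framework-agnostic sketch of such a wrapper (the name IndexedDataset is ours, not part of LibAUC; with PyTorch you would subclass torch.utils.data.Dataset) could look like:

```python
class IndexedDataset:
    """Wrap a dataset so each item comes back as (index, data, target).

    AP-style losses keep a running estimate per training sample and
    therefore need to know which sample each prediction belongs to.
    """
    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        data, target = self.base[i]
        return i, data, target

# usage with any sequence of (data, target) pairs
samples = [("img0", 0), ("img1", 1), ("img2", 0)]
wrapped = IndexedDataset(samples)
wrapped[1]  # -> (1, "img1", 1)
```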

Useful Tips

Checklist before Running Experiments:

  • Your data should have binary labels {0, 1}, with 1 as the minority class and 0 as the majority class
  • Compute the imbalance ratio on your training set and pass it to AUCMLoss(imratio=xxx)
  • Adopt a proper initial learning rate; e.g., lr in [0.05, 0.1] usually works well
  • Choose libauc.optimizers.PESG to optimize AUCMLoss(imratio=xxx)
  • Call optimizer.update_regularizer(decay_factor=10) to decay the learning rate and update the regularizer stagewise
  • Add an activation layer, e.g., torch.sigmoid(logits), before passing model outputs to the loss function
  • Reshape both preds and targets to (N, 1) before calling the loss function
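The imbalance-ratio and reshaping items above can be sketched in a few lines of NumPy (the helper name compute_imratio is ours, not part of LibAUC):

```python
import numpy as np

def compute_imratio(labels):
    """imratio = (# positive samples) / (# total samples) in the train set."""
    labels = np.asarray(labels)
    return float((labels == 1).sum()) / len(labels)

train_labels = [0] * 90 + [1] * 10        # 10% minority class
imratio = compute_imratio(train_labels)   # 0.1 -> pass as AUCMLoss(imratio=0.1)

# reshape both preds and targets to (N, 1) before calling the loss
preds = np.random.rand(len(train_labels)).reshape(-1, 1)
targets = np.asarray(train_labels, dtype=np.float32).reshape(-1, 1)
```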

Citation

If you find LibAUC useful in your work, please cite the following paper:

@inproceedings{yuan2021robust,
  title={Large-scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification},
  author={Yuan, Zhuoning and Yan, Yan and Sonka, Milan and Yang, Tianbao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Contact

If you have any questions, please contact Zhuoning Yuan [yzhuoning@gmail.com] or Tianbao Yang [tianbao-yang@uiowa.edu], or open a new issue on GitHub.

About

An end-to-end machine learning library to directly optimize AUC (AUROC, AUPRC)
